[ViewVC] Diff of: cvs/cvsroot/AnyEvent-Fork/README

Comparing cvsroot/AnyEvent-Fork/README (file contents):
Revision 1.1 by root, Sun Mar 31 03:21:27 2013 UTC vs.
Revision 1.2 by root, Thu Apr 4 07:27:09 2013 UTC

		1	NAME
		2	AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
		3
		4	ATTENTION, this is a very early release, and very untested. Consider it
		5	a technology preview.
		6
		7	SYNOPSIS
		8	use AnyEvent::Fork;
		9
		10	##################################################################
		11	# create a single new process, tell it to run your worker function
		12
		13	AnyEvent::Fork
		14	   ->new
		15	   ->require ("MyModule")
		16	   ->run ("MyModule::worker", sub {
		17	      my ($master_filehandle) = @_;
		18	
		19	      # now $master_filehandle is connected to the
		20	      # $slave_filehandle in the new process.
		21	   });
		22
		23	# MyModule::worker might look like this
		24	sub MyModule::worker {
		25	   my ($slave_filehandle) = @_;
		26	
		27	   # now $slave_filehandle is connected to the $master_filehandle
		28	   # in the original process. have fun!
		29	}
		30
		31	##################################################################
		32	# create a pool of server processes all accepting on the same socket
		33
		34	# create listener socket
		35	my $listener = ...;
		36
		37	# create a pool template, initialise it and give it the socket
		38	my $pool = AnyEvent::Fork
		39	   ->new
		40	   ->require ("Some::Stuff", "My::Server")
		41	   ->send_fh ($listener);
		42
		43	# now create 10 identical workers
		44	for my $id (1..10) {
		45	   $pool
		46	      ->fork
		47	      ->send_arg ($id)
		48	      ->run ("My::Server::run");
		49	}
		50
		51	# now do other things - maybe use the filehandle provided by run
		52	# to wait for the processes to die. or whatever.
		53
		54	# My::Server::run might look like this
		55	sub My::Server::run {
		56	   my ($slave, $listener, $id) = @_;
		57	
		58	   close $slave; # we do not use the socket, so close it to save resources
		59	
		60	   # we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
		61	   # or anything we usually couldn't do in a process forked normally.
		62	   while (my $socket = $listener->accept) {
		63	      # do sth. with new socket
		64	   }
		65	}
		66
		67	DESCRIPTION
		68	This module allows you to create new processes, without actually forking
		69	them from your current process (avoiding the problems of forking), but
		70	preserving most of the advantages of fork.
		71
		72	It can be used to create new worker processes or new independent
		73	subprocesses for short- and long-running jobs, process pools (e.g. for
		74	use in pre-forked servers) but also to spawn new external processes
		75	(such as CGI scripts from a webserver), which can be faster (and more
		76	well behaved) than using fork+exec in big processes.
		77
		78	Special care has been taken to make this module useful from other
		79	modules, while still supporting specialised environments such as
		80	App::Staticperl or PAR::Packer.
		81
		82	PROBLEM STATEMENT
		83	There are two ways to implement parallel processing on UNIX-like
		84	operating systems - fork and process, and fork+exec and process. They
		85	have different advantages and disadvantages that I describe below,
		86	together with how this module tries to mitigate the disadvantages.
		87
		88	Forking from a big process can be very slow (a 5GB process needs 0.05s
		89	to fork on my 3.6GHz amd64 GNU/Linux box for example). This overhead is
		90	often shared with exec (because you have to fork first), but in some
		91	circumstances (e.g. when vfork is used), fork+exec can be much faster.
		92	This module can help here by telling a small(er) helper process to
		93	fork, or fork+exec instead.
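
The cost of forking from a large process can be measured directly. The following is only a rough sketch using core modules: the helper name "time_fork" and the 50MB of ballast are made up for illustration, and the numbers it prints depend entirely on the machine and operating system.

```perl
use strict;
use warnings;
use POSIX ();
use Time::HiRes qw(time);

# Time a single fork() call as seen by the parent. The child leaves at
# once, so the result mostly reflects the cost of duplicating the process.
sub time_fork {
   my $start = time;
   my $pid   = fork;
   die "fork failed: $!" unless defined $pid;

   if ($pid == 0) {
      # child: exit immediately, skipping END blocks and destructors
      POSIX::_exit (0);
   }

   my $elapsed = time - $start;
   waitpid $pid, 0;
   return $elapsed;
}

my $small = time_fork ();

# hypothetical ballast to enlarge the process; the size is kept in a
# variable so the string is built at run time, after the first measurement
my $size    = 50_000_000;
my $ballast = "x" x $size;
my $big     = time_fork ();

printf "fork took %.6fs before and %.6fs after allocating ballast\n",
   $small, $big;
```

The difference tends to grow with the size of the parent, which is the overhead a small helper process avoids.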
		94
		95	Forking usually creates a copy-on-write copy of the parent process, so
		96	already loaded modules or data files will not take additional
		97	memory. When exec'ing a new process, modules and data files
		98	might need to be loaded again, at extra cpu and memory cost. Likewise
		99	when forking, all data structures are copied as well - if the program
		100	frees them and replaces them by new data, the child processes will
		101	retain the memory even if it isn't used.
		102	This module allows the main program to do a controlled fork, and
		103	allows modules to exec processes safely at any time. When creating a
		104	custom process pool you can take advantage of data sharing via fork
		105	without the risk of sharing large dynamic data structures that will
		106	blow up child memory usage.
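
The sharing described above can be observed with plain fork. This is a minimal sketch that does not use this module at all: the table stands in for "modules or data files" loaded before the fork, and the variable names are invented for the example.

```perl
use strict;
use warnings;

# data loaded once in the parent, before forking
my %table = map { $_ => $_ * $_ } 1 .. 1000;

pipe my $reader, my $writer or die "pipe failed: $!";

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
   # child: the table is already there, nothing had to be loaded again
   close $reader;
   print {$writer} scalar (keys %table), " $table{12}\n";
   close $writer;
   exit 0;
}

# parent: dropping the data here does not affect the child's copy,
# the child keeps what it inherited at fork time
close $writer;
%table = ();

my $report = <$reader>;
close $reader;
waitpid $pid, 0;

print "child saw: $report";
```

The same mechanism is what makes a child hold on to large data structures that the parent has long since replaced, which is why forking from a small, controlled template is attractive.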
		107
		108	Exec'ing a new perl process might be difficult and slow. For example, it
		109	is not easy to find the correct path to the perl interpreter, and all
		110	modules have to be loaded from disk again. Long running processes might
		111	run into problems when perl is upgraded, for example.
		112	This module supports creating pre-initialised perl processes to be
		113	used as template, and also tries hard to identify the correct path
		114	to the perl interpreter. With a cooperative main program, exec'ing
		115	the interpreter might not even be necessary.
		116
		117	Forking might be impossible when a program is running. For example,
		118	POSIX makes it almost impossible to fork from a multithreaded program
		119	and do anything useful in the child - strictly speaking, if your perl
		120	program uses posix threads (even indirectly via e.g. IO::AIO or
		121	threads), you cannot call fork on the perl level anymore, at all.
		122	This module can safely fork helper processes at any time, by calling
		123	fork+exec in C, in a POSIX-compatible way.
		124
		125	Parallel processing with fork might be inconvenient or difficult to
		126	implement. For example, when a program uses an event loop and creates
		127	watchers it becomes very hard to use the event loop from a child
		128	program, as the watchers already exist but are only meaningful in the
		129	parent. Worse, a module might want to use such a system, not knowing
		130	whether another module or the main program also does, leading to
		131	problems.
		132	This module only lets the main program create pools by forking
		133	(because only the main program can know when it is still safe to do
		134	so) - all other pools are created by fork+exec, after which such
		135	modules can again be loaded.
		136
		137	CONCEPTS
		138	This module can create new processes either by executing a new perl
		139	process, or by forking from an existing "template" process.
		140
		141	Each such process comes with its own file handle that can be used to
		142	communicate with it (it's actually a socket - one end in the new
		143	process, one end in the main process), and among the things you can do
		144	in it are load modules, fork new processes, send file handles to it, and
		145	execute functions.
		146
		147	There are multiple ways to create additional processes to execute some
		148	jobs:
		149
		150	fork a new process from the "default" template process, load code, run
		151	it
		152	This module has a "default" template process which it executes when
		153	it is needed the first time. Forking from this process shares the
		154	memory used for the perl interpreter with the new process, but
		155	loading modules takes time, and the memory is not shared with
		156	anything else.
		157
		158	This is ideal for when you only need one extra process of a kind,
		159	with the option of starting and stopping it on demand.
		160
		161	Example:
		162
		163	AnyEvent::Fork
		164	   ->new
		165	   ->require ("Some::Module")
		166	   ->run ("Some::Module::run", sub {
		167	      my ($fork_fh) = @_;
		168	   });
		169
		170	fork a new template process, load code, then fork processes off of it
		171	and run the code
		172	When you need to have a bunch of processes that all execute the same
		173	(or very similar) tasks, then a good way is to create a new template
		174	process for them, loading all the modules you need, and then create
		175	your worker processes from this new template process.
		176
		177	This way, all code (and data structures) that can be shared (e.g.
		178	the modules you loaded) is shared between the processes, and each
		179	new process consumes relatively little memory of its own.
		180
		181	The disadvantage of this approach is that you need to create a
		182	template process for the sole purpose of forking new processes from
		183	it, but if you only need a fixed number of processes you can create
		184	them, and then destroy the template process.
		185
		186	Example:
		187
		188	my $template = AnyEvent::Fork->new->require ("Some::Module");
		189
		190	for (1..10) {
		191	   $template->fork->run ("Some::Module::run", sub {
		192	      my ($fork_fh) = @_;
		193	   });
		194	}
		195
		196	# at this point, you can keep $template around to fork new processes
		197	# later, or you can destroy it, which causes it to vanish.
		198
		199	execute a new perl interpreter, load some code, run it
		200	This is relatively slow, and doesn't allow you to share memory
		201	between multiple processes.
		202
		203	The only advantage is that you don't have to have a template process
		204	hanging around all the time to fork off some new processes, which
		205	might be an advantage when there are long time spans where no extra
		206	processes are needed.
		207
		208	Example:
		209
		210	AnyEvent::Fork
		211	   ->new_exec
		212	   ->require ("Some::Module")
		213	   ->run ("Some::Module::run", sub {
		214	      my ($fork_fh) = @_;
		215	   });
		216
		217	FUNCTIONS
		218	my $pool = new AnyEvent::Fork key => value...
		219	Create a new process pool. The following named parameters are
		220	supported:
		221
		222	my $proc = new AnyEvent::Fork
		223	Creates a new "empty" perl interpreter process and returns its
		224	process object for further manipulation.
		225
		226	The new process is forked from a template process that is kept
		227	around for this purpose. When it doesn't exist yet, it is created by
		228	a call to "new_exec" and kept around for future calls.
		229
		230	When the process object is destroyed, it will release the file
		231	handle that connects it with the new process. When the new process
		232	has not yet called "run", then the process will exit. Otherwise,
		233	what happens depends entirely on the code that is executed.
		234
		235	$new_proc = $proc->fork
		236	Forks $proc, creating a new process, and returns the process object
		237	of the new process.
		238
		239	If any of the "send_" functions have been called before fork, then
		240	what they sent is cloned in the child. For example, in a pre-forked
		241	server, you might "send_fh" the listening socket into the template
		242	process, and then keep calling "fork" and "run".
		243
		244	my $proc = new_exec AnyEvent::Fork
		245	Creates a new "empty" perl interpreter process and returns its
		246	process object for further manipulation.
		247
		248	Unlike the "new" method, this method always spawns a new perl
		249	process (except in some cases, see AnyEvent::Fork::Early for
		250	details). This reduces the amount of memory sharing that is
		251	possible, and is also slower.
		252
		253	You should use "new" whenever possible, except when having a
		254	template process around is unacceptable.
		255
		256	The path to the perl interpreter is divined using various methods -
		257	first $^X is investigated to see if the path ends with something
		258	that sounds as if it were the perl interpreter. Failing this, the
		259	module falls back to using $Config::Config{perlpath}.
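
The lookup order just described could be sketched roughly as follows. This is not the module's actual code: "guess_perl_path" is a hypothetical helper, and the pattern used to decide whether a path "sounds like" perl is an assumption made for this example.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: prefer the candidate (normally $^X) when its last
# path component looks like a perl binary, otherwise fall back to the
# path that was recorded when perl was built.
sub guess_perl_path {
   my ($candidate) = @_;

   return $candidate
      if defined $candidate
         && $candidate =~ m{(?:^|[/\\])perl[\d.]*(?:\.exe)?$}i;

   return $Config::Config{perlpath};
}

my $perl = guess_perl_path ($^X);
print "would exec: $perl\n";
```

A fallback is needed because $^X is not always the interpreter, for example when perl is embedded in another program.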
		260
		261	$proc = $proc->eval ($perlcode, @args)
		262	Evaluates the given $perlcode as ... perl code, while setting @_ to
		263	the strings specified by @args.
		264
		265	This call is meant to do any custom initialisation that might be
		266	required (for example, the "require" method uses it). It's not
		267	supposed to be used to completely take over the process, use "run"
		268	for that.
		269
		270	The code will usually be executed after this call returns, and there
		271	is no way to pass anything back to the calling process. Any
		272	evaluation errors will be reported to stderr and cause the process
		273	to exit.
		274
		275	Returns the process object for easy chaining of method calls.
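
To illustrate what "setting @_ to the strings specified by @args" means, here is a purely local stand-in that runs in the current process. "eval_with_args" is a hypothetical name, and unlike the real method (which cannot pass anything back) it returns the result, only so that the effect is visible.

```perl
use strict;
use warnings;

# Hypothetical local stand-in: run a string of perl code with @_ set
# to the given argument strings.
sub eval_with_args {
   my ($perlcode, @args) = @_;

   @_ = @args;
   my $result = eval $perlcode;
   die "evaluation failed: $@" if $@;

   return $result;
}

# the code string sees its arguments in @_, like a normal function
my $sum = eval_with_args ('my ($x, $y) = @_; $x + $y', "3", "4");
print "result: $sum\n";

# a "require"-like use: load the files named by the arguments
eval_with_args ('require "$_.pm" for @_; 1', "List/Util");
print "List::Util loaded\n" if List::Util->can ("max");
```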
		276
		277	$proc = $proc->require ($module, ...)
		278	Tries to load the given module(s) into the process.
		279
		280	Returns the process object for easy chaining of method calls.
		281
		282	$proc = $proc->send_fh ($handle, ...)
		283	Send one or more file handles (not file descriptors) to the
		284	process, to prepare a call to "run".
		285
		286	The process object keeps a reference to the handles until this is
		287	done, so you must not explicitly close the handles. This is most
		288	easily accomplished by simply not storing the file handles anywhere
		289	after passing them to this method.
		290
		291	Returns the process object for easy chaining of method calls.
		292
		293	Example: pass an fh to a process, and release it without closing. It
		294	will be closed automatically when it is no longer used.
		295
		296	$proc->send_fh ($my_fh);
		297	undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
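
The difference between releasing a reference and closing a handle can be seen with a plain pipe. In this sketch the array @held is only a stand-in for whatever reference the process object keeps internally, which is an assumption about the mechanism and not taken from the module's source.

```perl
use strict;
use warnings;

pipe my $reader, my $writer or die "pipe failed: $!";

# stand-in for the reference that the process object keeps
my @held = ($writer);

# dropping our own reference does not close the handle ...
undef $writer;
syswrite $held[0], "still open\n";
sysread $reader, my $line, 64;

# ... it is closed only once the last reference goes away
@held = ();
my $eof = sysread $reader, my $buf, 1;

print "read: $line";
print "writer closed\n" if defined $eof && $eof == 0;
```

An explicit close, by contrast, would close the handle for every holder of the reference at once.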
		298
		299	$proc = $proc->send_arg ($string, ...)
		300	Send one or more argument strings to the process, to prepare a call
		301	to "run". The strings can be any octet string.
		302
		303	Returns the process object for easy chaining of method calls.
		304
		305	$proc->run ($func, $cb->($fh))
		306	Enter the function specified by the fully qualified name in $func in
		307	the process. The function is called with the communication socket as
		308	first argument, followed by all file handles and string arguments
		309	sent earlier via "send_fh" and "send_arg" methods, in the order they
		310	were called.
		311
		312	If the called function returns, the process exits.
		313
		314	Preparing the process can take time - when the process is ready, the
		315	callback is invoked with the local communications socket as
		316	argument.
		317
		318	The process object becomes unusable on return from this function.
		319
		320	If the communication socket isn't used, it should be closed on both
		321	sides, to save on kernel memory.
		322
		323	The socket is non-blocking in the parent, and blocking in the newly
		324	created process. The close-on-exec flag is set on both. Even if not
		325	used otherwise, the socket can be a good indicator for the existence
		326	of the process - if the other process exits, you get a readable
		327	event on it, because exiting the process closes the socket (if it
		328	didn't create any children using fork).
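
How a socket signals that the other side has exited can be shown without this module. The sketch below uses a plain socketpair and fork, with a made-up 0.2 second delay standing in for real work, so it only demonstrates the end-of-file behaviour and nothing else.

```perl
use strict;
use warnings;
use Socket;

socketpair my $parent_end, my $child_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC
   or die "socketpair failed: $!";

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
   # child: keep only our end, pretend to work, then exit without writing
   close $parent_end;
   select undef, undef, undef, 0.2;
   exit 0;
}

# parent: close the child's end, otherwise end-of-file would never arrive
close $child_end;

# wait (at most 10 seconds) until the socket becomes readable
my $rout = "";
vec ($rout, fileno ($parent_end), 1) = 1;
my $ready = select $rout, undef, undef, 10;

# a readable socket that yields 0 octets means the peer is gone
my $got = sysread $parent_end, my $buf, 1;

waitpid $pid, 0;
print "child $pid has exited\n" if $ready && defined $got && $got == 0;
```

The close in the parent matters for the same reason as the remark about children: as long as any process still holds the other end open, no end-of-file is delivered.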
		329
		330	Example: create a template for a process pool, pass a few strings,
		331	some file handles, then fork, pass one more string, and run some
		332	code.
		333
		334	my $pool = AnyEvent::Fork
		335	   ->new
		336	   ->send_arg ("str1", "str2")
		337	   ->send_fh ($fh1, $fh2);
		338
		339	for my $n (1..2) {
		340	   $pool
		341	      ->fork
		342	      ->send_arg ("str3")
		343	      ->run ("Some::function", sub {
		344	         my ($fh) = @_;
		345	
		346	         # fh is nonblocking, but we trust that the OS can accept these
		347	         # few octets anyway.
		348	         syswrite $fh, "hi #$n\n";
		349	
		350	         # $fh is being closed here, as we don't store it anywhere
		351	      });
		352	}
		353
		354	# Some::function might look like this - all parameters passed before fork
		355	# and after will be passed, in order, after the communications socket.
		356	sub Some::function {
		357	   my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
		358	
		359	   print scalar <$fh>; # prints "hi #1\n" in one process, "hi #2\n" in the other
		360	}
		361
		362	PORTABILITY NOTES
		363	Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a
		364	nop, and ::Template is not going to work), and it cost a lot of blood
		365	and sweat to make it so, mostly due to the bloody broken perl that
		366	nobody seems to care about. The fork emulation is a bad joke - I have
		367	yet to see something useful that you can do with it without running into
		368	memory corruption issues or other braindamage. Hrrrr.
		369
		370	Cygwin perl is not supported at the moment, as it should implement fd
		371	passing, but doesn't, and rolling my own is hard, as cygwin doesn't
		372	support enough functionality to do it.
		373
		374	AUTHOR
		375	Marc Lehmann <schmorp@schmorp.de>
		376	http://home.schmorp.de/
		377
