[ViewVC] Diff of: cvs/AnyEvent-Fork/README

Comparing AnyEvent-Fork/README (file contents):
Revision 1.3 by root, Fri Apr 5 19:10:10 2013 UTC vs.
Revision 1.6 by root, Thu Apr 18 20:17:35 2013 UTC

…		…
2	AnyEvent::Fork - everything you wanted to use fork() for, but couldn't	2	AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
3		3
4	SYNOPSIS	4	SYNOPSIS
5	use AnyEvent::Fork;	5	use AnyEvent::Fork;
6		6
7	##################################################################	7	AnyEvent::Fork
		8	->new
		9	->require ("MyModule")
		10	->run ("MyModule::server", my $cv = AE::cv);
		11
		12	my $fh = $cv->recv;
		13
		14	DESCRIPTION
		15	This module allows you to create new processes, without actually forking
		16	them from your current process (avoiding the problems of forking), but
		17	preserving most of the advantages of fork.
		18
		19	It can be used to create new worker processes or new independent
		20	subprocesses for short- and long-running jobs, process pools (e.g. for
		21	use in pre-forked servers) but also to spawn new external processes
		22	(such as CGI scripts from a web server), which can be faster (and more
		23	well behaved) than using fork+exec in big processes.
		24
		25	Special care has been taken to make this module useful from other
		26	modules, while still supporting specialised environments such as
		27	App::Staticperl or PAR::Packer.
		28
		29	WHAT THIS MODULE IS NOT
		30	This module only creates processes and lets you pass file handles and
		31	strings to it, and run perl code. It does not implement any kind of RPC
		32	- there is no back channel from the process back to you, and there is no
		33	RPC or message passing going on.
		34
		35	If you need some form of RPC, you could use the AnyEvent::Fork::RPC
		36	companion module, which adds simple RPC/job queueing to a process
		37	created by this module.
		38
		39	Or you can implement it yourself in whatever way you like, use some
		40	message-passing module such as AnyEvent::MP, some pipe such as
		41	AnyEvent::ZeroMQ, use AnyEvent::Handle on both sides to send e.g. JSON
		42	or Storable messages, and so on.
		43
		44	COMPARISON TO OTHER MODULES
		45	There is an abundance of modules on CPAN that do "something fork", such
		46	as Parallel::ForkManager, AnyEvent::ForkManager, AnyEvent::Worker or
		47	AnyEvent::Subprocess. There are modules that implement their own process
		48	management, such as AnyEvent::DBI.
		49
		50	The problems that all these modules try to solve are real, however, none
		51	of them (from what I have seen) tackle the very real problems of
		52	unwanted memory sharing, efficiency, not being able to use event
		53	processing or similar modules in the processes they create.
		54
		55	This module doesn't try to replace any of them - instead it tries to
		56	solve the problem of creating processes with a minimum of fuss and
		57	overhead (and also luxury). Ideally, most of these would use
		58	AnyEvent::Fork internally, except they were written before AnyEvent::Fork
		59	was available, so obviously had to roll their own.
		60
		61	PROBLEM STATEMENT
		62	There are two traditional ways to implement parallel processing on UNIX
		63	like operating systems - fork and process, and fork+exec and process.
		64	They have different advantages and disadvantages that I describe below,
		65	together with how this module tries to mitigate the disadvantages.
		66
		67	Forking from a big process can be very slow.
		68	A 5GB process needs 0.05s to fork on my 3.6GHz amd64 GNU/Linux box.
		69	This overhead is often shared with exec (because you have to fork
		70	first), but in some circumstances (e.g. when vfork is used),
		71	fork+exec can be much faster.
		72
		73	This module can help here by telling a small(er) helper process to
		74	fork, which is faster than forking the main process, and also uses
		75	vfork where possible. This gives the speed of vfork, with the
		76	flexibility of fork.
		77
		78	Forking usually creates a copy-on-write copy of the parent process.
		79	For example, modules or data files that are loaded will not use
		80	additional memory after a fork. When exec'ing a new process, modules
		81	and data files might need to be loaded again, at extra CPU and
		82	memory cost. But when forking, literally all data structures are
		83	copied - if the program frees them and replaces them by new data,
		84	the child processes will retain the old version even if it isn't
		85	used, which can suddenly and unexpectedly increase memory usage when
		86	freeing memory.
		87
		88	The trade-off is between more sharing with fork (which can be good
		89	or bad), and no sharing with exec.
		90
		91	This module allows the main program to do a controlled fork, and
		92	allows modules to exec processes safely at any time. When creating a
		93	custom process pool you can take advantage of data sharing via fork
		94	without risking to share large dynamic data structures that will
		95	blow up child memory usage.
		96
		97	In other words, this module puts you in control over what is being
		98	shared and what isn't, at all times.
		99
		100	Exec'ing a new perl process might be difficult.
		101	For example, it is not easy to find the correct path to the perl
		102	interpreter - $^X might not be a perl interpreter at all.
		103
		104	This module tries hard to identify the correct path to the perl
		105	interpreter. With a cooperative main program, exec'ing the
		106	interpreter might not even be necessary, but even without help from
		107	the main program, it will still work when used from a module.
		108
		109	Exec'ing a new perl process might be slow, as all necessary modules have
		110	to be loaded from disk again, with no guarantees of success.
		111	Long running processes might run into problems when perl is upgraded
		112	and modules are no longer loadable because they refer to a different
		113	perl version, or parts of a distribution are newer than the ones
		114	already loaded.
		115
		116	This module supports creating pre-initialised perl processes to be
		117	used as a template for new processes.
		118
		119	Forking might be impossible when a program is running.
		120	For example, POSIX makes it almost impossible to fork from a
		121	multi-threaded program while doing anything useful in the child - in
		122	fact, if your perl program uses POSIX threads (even indirectly via
		123	e.g. IO::AIO or threads), you cannot call fork on the perl level
		124	anymore without risking corruption issues on a number of operating
		125	systems.
		126
		127	This module can safely fork helper processes at any time, by calling
		128	fork+exec in C, in a POSIX-compatible way (via Proc::FastSpawn).
		129
		130	Parallel processing with fork might be inconvenient or difficult to
		131	implement. Modules might not work in both parent and child.
		132	For example, when a program uses an event loop and creates watchers
		133	it becomes very hard to use the event loop from a child program, as
		134	the watchers already exist but are only meaningful in the parent.
		135	Worse, a module might want to use such a module, not knowing whether
		136	another module or the main program also does, leading to problems.
		137
		138	Apart from event loops, graphical toolkits also commonly fall into
		139	the "unsafe module" category, or just about anything that
		140	communicates with the external world, such as network libraries and
		141	file I/O modules, which usually don't like being copied and then
		142	allowed to continue in two processes.
		143
		144	With this module only the main program is allowed to create new
		145	processes by forking (because only the main program can know when it
		146	is still safe to do so) - all other processes are created via
		147	fork+exec, which makes it possible to use modules such as event
		148	loops or window interfaces safely.
		149
		150	EXAMPLES
8	# create a single new process, tell it to run your worker function	151	Create a single new process, tell it to run your worker function.
9
10	AnyEvent::Fork	152	AnyEvent::Fork
11	->new	153	->new
12	->require ("MyModule")	154	->require ("MyModule")
13	->run ("MyModule::worker", sub {	155	->run ("MyModule::worker", sub {
14	my ($master_filehandle) = @_;	156	my ($master_filehandle) = @_;
15		157
16	# now $master_filehandle is connected to the	158	# now $master_filehandle is connected to the
17	# $slave_filehandle in the new process.	159	# $slave_filehandle in the new process.
18	});	160	});
19		161
20	# MyModule::worker might look like this	162	"MyModule" might look like this:
		163
		164	package MyModule;
		165
21	sub MyModule::worker {	166	sub worker {
22	my ($slave_filehandle) = @_;	167	my ($slave_filehandle) = @_;
23		168
24	# now $slave_filehandle is connected to the $master_filehandle	169	# now $slave_filehandle is connected to the $master_filehandle
25	# in the original process. have fun!	170	# in the original process. have fun!
26	}	171	}
27		172
28	##################################################################
29	# create a pool of server processes all accepting on the same socket	173	Create a pool of server processes all accepting on the same socket.
30
31	# create listener socket	174	# create listener socket
32	my $listener = ...;	175	my $listener = ...;
33		176
34	# create a pool template, initialise it and give it the socket	177	# create a pool template, initialise it and give it the socket
35	my $pool = AnyEvent::Fork	178	my $pool = AnyEvent::Fork
…		…
46	}	189	}
47		190
48	# now do other things - maybe use the filehandle provided by run	191	# now do other things - maybe use the filehandle provided by run
49	# to wait for the processes to die. or whatever.	192	# to wait for the processes to die. or whatever.
50		193
51	# My::Server::run might look like this	194	"My::Server" might look like this:
52	sub My::Server::run {	195
		196	package My::Server;
		197
		198	sub run {
53	my ($slave, $listener, $id) = @_;	199	my ($slave, $listener, $id) = @_;
54		200
55	close $slave; # we do not use the socket, so close it to save resources	201	close $slave; # we do not use the socket, so close it to save resources
56		202
57	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,	203	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
…		…
59	while (my $socket = $listener->accept) {	205	while (my $socket = $listener->accept) {
60	# do sth. with new socket	206	# do sth. with new socket
61	}	207	}
62	}	208	}
63		209
64	DESCRIPTION	210	use AnyEvent::Fork as a faster fork+exec
65	This module allows you to create new processes, without actually forking	211	This runs "/bin/echo hi", with standard output redirected to /tmp/log
66	them from your current process (avoiding the problems of forking), but	212	and standard error redirected to the communications socket. It is
67	preserving most of the advantages of fork.	213	usually faster than fork+exec, but still lets you prepare the
		214	environment.
68		215
69	It can be used to create new worker processes or new independent	216	open my $output, ">/tmp/log" or die "$!";
70	subprocesses for short- and long-running jobs, process pools (e.g. for
71	use in pre-forked servers) but also to spawn new external processes
72	(such as CGI scripts from a webserver), which can be faster (and more
73	well behaved) than using fork+exec in big processes.
74		217
75	Special care has been taken to make this module useful from other	218	AnyEvent::Fork
76	modules, while still supporting specialised environments such as	219	->new
77	App::Staticperl or PAR::Packer.	220	->eval ('
		221	# compile a helper function for later use
		222	sub run {
		223	my ($fh, $output, @cmd) = @_;
78		224
79	PROBLEM STATEMENT	225	# perl will clear close-on-exec on STDOUT/STDERR
80	There are two ways to implement parallel processing on UNIX like	226	open STDOUT, ">&", $output or die;
81	operating systems - fork and process, and fork+exec and process. They	227	open STDERR, ">&", $fh or die;
82	have different advantages and disadvantages that I describe below,
83	together with how this module tries to mitigate the disadvantages.
84		228
85	Forking from a big process can be very slow (a 5GB process needs 0.05s	229	exec @cmd;
86	to fork on my 3.6GHz amd64 GNU/Linux box for example). This overhead is	230	}
87	often shared with exec (because you have to fork first), but in some	231	')
88	circumstances (e.g. when vfork is used), fork+exec can be much faster.	232	->send_fh ($output)
89	This module can help here by telling a small(er) helper process to	233	->send_arg ("/bin/echo", "hi")
90	fork, or fork+exec instead.	234	->run ("run", my $cv = AE::cv);
91		235
92	Forking usually creates a copy-on-write copy of the parent process.	236	my $stderr = $cv->recv;
93	Memory (for example, modules or data files that have been will not take
94	additional memory). When exec'ing a new process, modules and data files
95	might need to be loaded again, at extra cpu and memory cost. Likewise
96	when forking, all data structures are copied as well - if the program
97	frees them and replaces them by new data, the child processes will
98	retain the memory even if it isn't used.
99	This module allows the main program to do a controlled fork, and
100	allows modules to exec processes safely at any time. When creating a
101	custom process pool you can take advantage of data sharing via fork
102	without risking to share large dynamic data structures that will
103	blow up child memory usage.
104
105	Exec'ing a new perl process might be difficult and slow. For example, it
106	is not easy to find the correct path to the perl interpreter, and all
107	modules have to be loaded from disk again. Long running processes might
108	run into problems when perl is upgraded for example.
109	This module supports creating pre-initialised perl processes to be
110	used as template, and also tries hard to identify the correct path
111	to the perl interpreter. With a cooperative main program, exec'ing
112	the interpreter might not even be necessary.
113
114	Forking might be impossible when a program is running. For example,
115	POSIX makes it almost impossible to fork from a multithreaded program
116	and do anything useful in the child - strictly speaking, if your perl
117	program uses posix threads (even indirectly via e.g. IO::AIO or
118	threads), you cannot call fork on the perl level anymore, at all.
119	This module can safely fork helper processes at any time, by caling
120	fork+exec in C, in a POSIX-compatible way.
121
122	Parallel processing with fork might be inconvenient or difficult to
123	implement. For example, when a program uses an event loop and creates
124	watchers it becomes very hard to use the event loop from a child
125	program, as the watchers already exist but are only meaningful in the
126	parent. Worse, a module might want to use such a system, not knowing
127	whether another module or the main program also does, leading to
128	problems.
129	This module only lets the main program create pools by forking
130	(because only the main program can know when it is still safe to do
131	so) - all other pools are created by fork+exec, after which such
132	modules can again be loaded.
133		237
134	CONCEPTS	238	CONCEPTS
135	This module can create new processes either by executing a new perl	239	This module can create new processes either by executing a new perl
136	process, or by forking from an existing "template" process.	240	process, or by forking from an existing "template" process.
		241
		242	All these processes are called "child processes" (whether they are
		243	direct children or not), while the process that manages them is called
		244	the "parent process".
137		245
138	Each such process comes with its own file handle that can be used to	246	Each such process comes with its own file handle that can be used to
139	communicate with it (it's actually a socket - one end in the new	247	communicate with it (it's actually a socket - one end in the new
140	process, one end in the main process), and among the things you can do	248	process, one end in the main process), and among the things you can do
141	in it are load modules, fork new processes, send file handles to it, and	249	in it are load modules, fork new processes, send file handles to it, and
…		…
151	memory used for the perl interpreter with the new process, but	259	memory used for the perl interpreter with the new process, but
152	loading modules takes time, and the memory is not shared with	260	loading modules takes time, and the memory is not shared with
153	anything else.	261	anything else.
154		262
155	This is ideal for when you only need one extra process of a kind,	263	This is ideal for when you only need one extra process of a kind,
156	with the option of starting and stipping it on demand.	264	with the option of starting and stopping it on demand.
157		265
158	Example:	266	Example:
159		267
160	AnyEvent::Fork	268	AnyEvent::Fork
161	->new	269	->new
…		…
175	the modules you loaded) is shared between the processes, and each	283	the modules you loaded) is shared between the processes, and each
176	new process consumes relatively little memory of its own.	284	new process consumes relatively little memory of its own.
177		285
178	The disadvantage of this approach is that you need to create a	286	The disadvantage of this approach is that you need to create a
179	template process for the sole purpose of forking new processes from	287	template process for the sole purpose of forking new processes from
180	it, but if you only need a fixed number of proceses you can create	288	it, but if you only need a fixed number of processes you can create
181	them, and then destroy the template process.	289	them, and then destroy the template process.
182		290
183	Example:	291	Example:
184		292
185	my $template = AnyEvent::Fork->new->require ("Some::Module");	293	my $template = AnyEvent::Fork->new->require ("Some::Module");
…		…
209	->require ("Some::Module")	317	->require ("Some::Module")
210	->run ("Some::Module::run", sub {	318	->run ("Some::Module::run", sub {
211	my ($fork_fh) = @_;	319	my ($fork_fh) = @_;
212	});	320	});
213		321
214	FUNCTIONS	322	THE "AnyEvent::Fork" CLASS
215	my $pool = new AnyEvent::Fork key => value...	323	This module exports nothing, and only implements a single class -
216	Create a new process pool. The following named parameters are	324	"AnyEvent::Fork".
217	supported:	325
		326	There are two class constructors that both create new processes - "new"
		327	and "new_exec". The "fork" method creates a new process by forking an
		328	existing one and could be considered a third constructor.
		329
		330	Most of the remaining methods deal with preparing the new process, by
		331	loading code, evaluating code and sending data to the new process. They
		332	usually return the process object, so you can chain method calls.
		333
		334	If a process object is destroyed before calling its "run" method, then
		335	the process simply exits. After "run" is called, all responsibility is
		336	passed to the specified function.
		337
		338	As long as there is any outstanding work to be done, process objects
		339	resist being destroyed, so there is no reason to store them unless you
		340	need them later - configure and forget works just fine.
218		341
219	my $proc = new AnyEvent::Fork	342	my $proc = new AnyEvent::Fork
220	Create a new "empty" perl interpreter process and returns its	343	Create a new "empty" perl interpreter process and returns its
221	process object for further manipulation.	344	process object for further manipulation.
222		345
223	The new process is forked from a template process that is kept	346	The new process is forked from a template process that is kept
224	around for this purpose. When it doesn't exist yet, it is created by	347	around for this purpose. When it doesn't exist yet, it is created by
225	a call to "new_exec" and kept around for future calls.	348	a call to "new_exec" first and then stays around for future calls.
226
227	When the process object is destroyed, it will release the file
228	handle that connects it with the new process. When the new process
229	has not yet called "run", then the process will exit. Otherwise,
230	what happens depends entirely on the code that is executed.
231		349
232	$new_proc = $proc->fork	350	$new_proc = $proc->fork
233	Forks $proc, creating a new process, and returns the process object	351	Forks $proc, creating a new process, and returns the process object
234	of the new process.	352	of the new process.
235		353
…		…
248	possible, and is also slower.	366	possible, and is also slower.
249		367
250	You should use "new" whenever possible, except when having a	368	You should use "new" whenever possible, except when having a
251	template process around is unacceptable.	369	template process around is unacceptable.
252		370
253	The path to the perl interpreter is divined usign various methods -	371	The path to the perl interpreter is divined using various methods -
254	first $^X is investigated to see if the path ends with something	372	first $^X is investigated to see if the path ends with something
255	that sounds as if it were the perl interpreter. Failing this, the	373	that sounds as if it were the perl interpreter. Failing this, the
256	module falls back to using $Config::Config{perlpath}.	374	module falls back to using $Config::Config{perlpath}.
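
The lookup order described above can be sketched in a few lines of core Perl. This is only an illustration: the helper name guess_perl_path and the exact pattern are assumptions made for this example, and the module's real heuristics may differ.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper (not part of AnyEvent::Fork): guess the path of
# the perl interpreter from a candidate such as $^X.
sub guess_perl_path {
    my ($candidate) = @_;

    # Accept paths whose last component looks like a perl binary,
    # e.g. "perl", "perl5.16.3" or "perl.exe". The pattern is an
    # assumption chosen for illustration.
    return $candidate
        if defined $candidate
        && $candidate =~ m{(?:^|[/\\])perl[0-9.]*(?:\.exe)?$}i;

    # Otherwise fall back to the path recorded when perl was built.
    return $Config{perlpath};
}

my $path = guess_perl_path ($^X);
print "$path\n";
```

In an embedded environment, $^X may name the host application rather than perl, which is the case the fallback is meant to cover.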
257		375
		376	$pid = $proc->pid
		377	Returns the process id of the process *iff it is a direct child of
		378	the process running AnyEvent::Fork*, and "undef" otherwise.
		379
		380	Normally, only processes created via "AnyEvent::Fork->new_exec" and
		381	AnyEvent::Fork::Template are direct children, and you are
		382	responsible for cleaning up their zombies when they die.
		383
		384	All other processes are not direct children, and will be cleaned up
		385	by AnyEvent::Fork itself.
		386
258	$proc = $proc->eval ($perlcode, @args)	387	$proc = $proc->eval ($perlcode, @args)
259	Evaluates the given $perlcode as ... perl code, while setting @_ to	388	Evaluates the given $perlcode as ... Perl code, while setting @_ to
260	the strings specified by @args.	389	the strings specified by @args, in the "main" package.
261		390
262	This call is meant to do any custom initialisation that might be	391	This call is meant to do any custom initialisation that might be
263	required (for example, the "require" method uses it). It's not	392	required (for example, the "require" method uses it). It's not
264	supposed to be used to completely take over the process, use "run"	393	supposed to be used to completely take over the process, use "run"
265	for that.	394	for that.
…		…
267	The code will usually be executed after this call returns, and there	396	The code will usually be executed after this call returns, and there
268	is no way to pass anything back to the calling process. Any	397	is no way to pass anything back to the calling process. Any
269	evaluation errors will be reported to stderr and cause the process	398	evaluation errors will be reported to stderr and cause the process
270	to exit.	399	to exit.
271		400
		401	If you want to execute some code (that isn't in a module) to take
		402	over the process, you should compile a function via "eval" first,
		403	and then call it via "run". This also gives you access to any
		404	arguments passed via the "send_xxx" methods, such as file handles.
		405	See the "use AnyEvent::Fork as a faster fork+exec" example to see it
		406	in action.
		407
272	Returns the process object for easy chaining of method calls.	408	Returns the process object for easy chaining of method calls.
273		409
274	$proc = $proc->require ($module, ...)	410	$proc = $proc->require ($module, ...)
275	Tries to load the given module(s) into the process	411	Tries to load the given module(s) into the process
276		412
…		…
278		414
279	$proc = $proc->send_fh ($handle, ...)	415	$proc = $proc->send_fh ($handle, ...)
280	Send one or more file handles (not file descriptors) to the	416	Send one or more file handles (not file descriptors) to the
281	process, to prepare a call to "run".	417	process, to prepare a call to "run".
282		418
283	The process object keeps a reference to the handles until this is	419	The process object keeps a reference to the handles until they have
284	done, so you must not explicitly close the handles. This is most	420	been passed over to the process, so you must not explicitly close
285	easily accomplished by simply not storing the file handles anywhere	421	the handles. This is most easily accomplished by simply not storing
286	after passing them to this method.	422	the file handles anywhere after passing them to this method - when
		423	AnyEvent::Fork is finished using them, perl will automatically close
		424	them.
287		425
288	Returns the process object for easy chaining of method calls.	426	Returns the process object for easy chaining of method calls.
289		427
290	Example: pass an fh to a process, and release it without closing. it	428	Example: pass a file handle to a process, and release it without
291	will be closed automatically when it is no longer used.	429	closing. It will be closed automatically when it is no longer used.
292		430
293	$proc->send_fh ($my_fh);	431	$proc->send_fh ($my_fh);
294	undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT	432	undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
295		433
296	$proc = $proc->send_arg ($string, ...)	434	$proc = $proc->send_arg ($string, ...)
297	Send one or more argument strings to the process, to prepare a call	435	Send one or more argument strings to the process, to prepare a call
298	to "run". The strings can be any octet string.	436	to "run". The strings can be any octet strings.
299		437
		438	The protocol is optimised to pass a moderate number of relatively
		439	short strings - while you can pass up to 4GB of data in one go, this
		440	is more meant to pass some ID information or other startup info, not
		441	big chunks of data.
		442
300	Returns the process object for easy chaining of emthod calls.	443	Returns the process object for easy chaining of method calls.
301		444
302	$proc->run ($func, $cb->($fh))	445	$proc->run ($func, $cb->($fh))
303	Enter the function specified by the fully qualified name in $func in	446	Enter the function specified by the function name in $func in the
304	the process. The function is called with the communication socket as	447	process. The function is called with the communication socket as
305	first argument, followed by all file handles and string arguments	448	first argument, followed by all file handles and string arguments
306	sent earlier via "send_fh" and "send_arg" methods, in the order they	449	sent earlier via "send_fh" and "send_arg" methods, in the order they
307	were called.	450	were called.
308		451
309	If the called function returns, the process exits.
310
311	Preparing the process can take time - when the process is ready, the
312	callback is invoked with the local communications socket as
313	argument.
314
315	The process object becomes unusable on return from this function.	452	The process object becomes unusable on return from this function -
		453	any further method calls result in undefined behaviour.
		454
		455	The function name should be fully qualified, but if it isn't, it
		456	will be looked up in the "main" package.
		457
		458	If the called function returns, doesn't exist, or any error occurs,
		459	the process exits.
		460
		461	Preparing the process is done in the background - when all commands
		462	have been sent, the callback is invoked with the local
		463	communications socket as argument. At this point you can start using
		464	the socket in any way you like.
316		465
317	If the communication socket isn't used, it should be closed on both	466	If the communication socket isn't used, it should be closed on both
318	sides, to save on kernel memory.	467	sides, to save on kernel memory.
319		468
320	The socket is non-blocking in the parent, and blocking in the newly	469	The socket is non-blocking in the parent, and blocking in the newly
321	created process. The close-on-exec flag is set on both. Even if not	470	created process. The close-on-exec flag is set in both.
		471
322	used otherwise, the socket can be a good indicator for the existance	472	Even if not used otherwise, the socket can be a good indicator for
323	of the process - if the other process exits, you get a readable	473	the existence of the process - if the other process exits, you get a
324	event on it, because exiting the process closes the socket (if it	474	readable event on it, because exiting the process closes the socket
325	didn't create any children using fork).	475	(if it didn't create any children using fork).
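
The "socket as liveness indicator" idea can be tried without this module. The following is a minimal sketch using only core Perl (socketpair and fork); the function name child_exit_seen is made up for this example, and it uses a blocking select where a real program would use an event loop watcher.

```perl
use strict;
use warnings;
use Socket;
use POSIX ();

# Hypothetical demo: returns true if the parent sees end-of-file on its
# end of the socket once the child has exited.
sub child_exit_seen {
    socketpair (my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";

    my $pid = fork;
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        # child: exit right away, which closes its end of the socket
        POSIX::_exit (0);
    }

    # the parent must close its copy of the child end,
    # otherwise end-of-file never arrives
    close $child_fh;

    # wait until the socket becomes readable
    my $rin = "";
    vec ($rin, fileno $parent_fh, 1) = 1;
    select (my $rout = $rin, undef, undef, undef);

    # a read of zero octets means the other side is gone
    my $len = sysread $parent_fh, my $buf, 1;
    waitpid $pid, 0;

    return defined $len && $len == 0;
}

print child_exit_seen () ? "child exit detected\n" : "no end-of-file seen\n";
```

If the child had forked further processes that still hold the socket, the read would block until all of them exit, which is the caveat mentioned above.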
326		476
327	Example: create a template for a process pool, pass a few strings,	477	Example: create a template for a process pool, pass a few strings,
328	some file handles, then fork, pass one more string, and run some	478	some file handles, then fork, pass one more string, and run some
329	code.	479	code.
330		480
…		…
339	->send_arg ("str3")	489	->send_arg ("str3")
340	->run ("Some::function", sub {	490	->run ("Some::function", sub {
341	my ($fh) = @_;	491	my ($fh) = @_;
342		492
343	# fh is nonblocking, but we trust that the OS can accept these	493	# fh is nonblocking, but we trust that the OS can accept these
344	# extra 3 octets anyway.	494	# few octets anyway.
345	syswrite $fh, "hi #$_\n";	495	syswrite $fh, "hi #$_\n";
346		496
347	# $fh is being closed here, as we don't store it anywhere	497	# $fh is being closed here, as we don't store it anywhere
348	});	498	});
349	}	499	}
…		…
351	# Some::function might look like this - all parameters passed before fork	501	# Some::function might look like this - all parameters passed before fork
352	# and after will be passed, in order, after the communications socket.	502	# and after will be passed, in order, after the communications socket.
353	sub Some::function {	503	sub Some::function {
354	my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;	504	my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
355		505
356	print scalar <$fh>; # prints "hi 1\n" and "hi 2\n"	506	print scalar <$fh>; # prints "hi #1\n" and "hi #2\n" in any order
357	}	507	}
		508
		509	PERFORMANCE
		510	Now for some unscientific benchmark numbers (all done on an amd64
		511	GNU/Linux box). These are intended to give you an idea of the relative
		512	performance you can expect, they are not meant to be absolute
		513	performance numbers.
		514
		515	OK, so, I ran a simple benchmark that creates a socket pair, forks,
		516	calls exit in the child and waits for the socket to close in the parent.
		517	I did load AnyEvent, EV and AnyEvent::Fork, for a total process size of
		518	5100kB.
		519
		520	2079 new processes per second, using manual socketpair + fork
		521
		522	Then I did the same thing, but instead of calling fork, I called
		523	AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
		524	socket from the child to close on exit. This does the same thing as
		525	manual socket pair + fork, except that what is forked is the template
		526	process (2440kB), and the socket needs to be passed to the server at the
		527	other end of the socket first.
		528
		529	2307 new processes per second, using AnyEvent::Fork->new
		530
		531	And finally, using "new_exec" instead of "new", using vforks+execs to exec
		532	a new perl interpreter and compile the small server each time, I get:
		533
		534	479 vfork+execs per second, using AnyEvent::Fork->new_exec
		535
		536	So how can "AnyEvent::Fork->new" be faster than a standard fork, even though
		537	it uses the same operations, but adds a lot of overhead?
		538
		539	The difference is simply the process size: forking the 5MB process takes
		540	so much longer than forking the 2.5MB template process that the extra
		541	overhead is canceled out.
		542
		543	If the benchmark process grows, the normal fork becomes even slower:
		544
		545	1340 new processes per second, manual fork of a 20MB process
		546	731 new processes per second, manual fork of a 200MB process
		547	235 new processes per second, manual fork of a 2000MB process
		548
		549	What that means (to me) is that I can use this module without having a
		550	bad conscience because of the extra overhead required to start new
		551	processes.
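
For readers who want to reproduce the first measurement, the following is a minimal sketch of a "manual socketpair + fork" loop as described above. It is not the benchmark script that produced the numbers in this section: the name "forks_per_second" and the iteration count are made up for illustration, and it loads none of AnyEvent, EV or AnyEvent::Fork, so the process being forked is much smaller than the 5100kB one measured above.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use Time::HiRes qw(time);

# Hypothetical helper: fork $count children one after another and
# return the number of children created per second.
sub forks_per_second {
    my ($count) = @_;
    my $start = time;

    for (1 .. $count) {
        socketpair (my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
            or die "socketpair: $!";

        my $pid = fork;
        die "fork: $!" unless defined $pid;

        # child: exit immediately, which closes its copies of both sockets
        exit 0 if $pid == 0;

        # parent: drop our copy of the child's end, then block until the
        # child's exit closes the last remaining copy (sysread sees EOF)
        close $child_fh;
        sysread $parent_fh, my $buf, 1;
        waitpid $pid, 0;
    }

    return $count / (time - $start);
}

printf "%d new processes per second, using manual socketpair + fork\n",
    forks_per_second (200);
```

The absolute figure depends heavily on the machine and on how much memory the forking process has mapped, which is the effect the numbers above illustrate.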
358		552
359	TYPICAL PROBLEMS	553	TYPICAL PROBLEMS
360	This section lists typical problems that remain. I hope by recognising	554	This section lists typical problems that remain. I hope by recognising
361	them, most can be avoided.	555	them, most can be avoided.
362		556
363	"leaked" file descriptors for exec'ed processes	557	leaked file descriptors for exec'ed processes
364	POSIX systems inherit file descriptors by default when exec'ing a	558	POSIX systems inherit file descriptors by default when exec'ing a
365	new process. While perl itself laudably sets the close-on-exec flags	559	new process. While perl itself laudably sets the close-on-exec flags
366	on new file handles, most C libraries don't care, and even if all	560	on new file handles, most C libraries don't care, and even if all
367	cared, it's often not possible to set the flag in a race-free	561	cared, it's often not possible to set the flag in a race-free
368	manner.	562	manner.
369		563
370	That means some file descriptors can leak through. And since it	564	That means some file descriptors can leak through. And since it
371	isn't possible to know which file descriptors are "good" and	565	isn't possible to know which file descriptors are "good" and
372	"neccessary" (or even to know which file descreiptors are open),	566	"necessary" (or even to know which file descriptors are open), there
373	there is no good way to close the ones that might harm.	567	is no good way to close the ones that might harm.
374		568
375	As an example of what "harm" can be done consider a web server that	569	As an example of what "harm" can be done consider a web server that
376	accepts connections and afterwards some module uses AnyEvent::Fork	570	accepts connections and afterwards some module uses AnyEvent::Fork
377	for the first time, causing it to fork and exec a new process, which	571	for the first time, causing it to fork and exec a new process, which
378	might inherit the network socket. When the server closes the socket,	572	might inherit the network socket. When the server closes the socket,
…		…
385	exec'ed well before many random file descriptors are open.	579	exec'ed well before many random file descriptors are open.
386		580
387	In general, the solution for this kind of problem is to fix the	581	In general, the solution for this kind of problem is to fix the
388	libraries or the code that leaks those file descriptors.	582	libraries or the code that leaks those file descriptors.
389		583
390	Fortunately, most of these lekaed descriptors do no harm, other than	584	Fortunately, most of these leaked descriptors do no harm, other than
391	sitting on some resources.	585	sitting on some resources.
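
Where the leaking code is your own Perl code, the close-on-exec flag can be inspected and changed with "fcntl". The sketch below uses only the core Fcntl module; the helper names "is_cloexec" and "set_cloexec" are made up for illustration and are not part of AnyEvent::Fork. Setting the flag in a separate step after the descriptor was created leaves a window in which another thread could fork and exec, which is presumably the kind of race meant above.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# Hypothetical helper: true if the handle is closed automatically on exec.
sub is_cloexec {
    my ($fh) = @_;
    my $flags = fcntl $fh, F_GETFD, 0;
    defined $flags or die "fcntl F_GETFD: $!";
    return (($flags + 0) & FD_CLOEXEC) ? 1 : 0;
}

# Hypothetical helper: set ($on true) or clear the close-on-exec flag.
sub set_cloexec {
    my ($fh, $on) = @_;
    my $flags = fcntl $fh, F_GETFD, 0;
    defined $flags or die "fcntl F_GETFD: $!";
    $flags += 0;    # fcntl returns "0 but true" when no flags are set
    $flags = $on ? ($flags | FD_CLOEXEC) : ($flags & ~FD_CLOEXEC);
    fcntl $fh, F_SETFD, $flags
        or die "fcntl F_SETFD: $!";
    return;
}

open my $fh, "<", "/dev/null" or die "open: $!";
set_cloexec ($fh, 1);
print "close-on-exec is ", (is_cloexec ($fh) ? "set" : "clear"), "\n";
```

This only helps with descriptors you have a Perl file handle for; descriptors opened inside C libraries still have to be fixed there.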
392		586
393	"leaked" file descriptors for fork'ed processes	587	leaked file descriptors for fork'ed processes
394	Normally, AnyEvent::Fork does start new processes by exec'ing them,	588	Normally, AnyEvent::Fork does start new processes by exec'ing them,
395	which closes file descriptors not marked for being inherited.	589	which closes file descriptors not marked for being inherited.
396		590
397	However, AnyEvent::Fork::Early and AnyEvent::Fork::Template offer a	591	However, AnyEvent::Fork::Early and AnyEvent::Fork::Template offer a
398	way to create these processes by forking, and this leaks more file	592	way to create these processes by forking, and this leaks more file
…		…
405	trouble with a fork.	599	trouble with a fork.
406		600
407	The solution is to either not load these modules before use'ing	601	The solution is to either not load these modules before use'ing
408	AnyEvent::Fork::Early or AnyEvent::Fork::Template, or to delay	602	AnyEvent::Fork::Early or AnyEvent::Fork::Template, or to delay
409	initialising them, for example, by calling "init Gtk2" manually.	603	initialising them, for example, by calling "init Gtk2" manually.
		604
		605	exiting calls object destructors
		606	This only applies to users of AnyEvent::Fork::Early and
		607	AnyEvent::Fork::Template, or when initialising code creates objects
		608	that reference external resources.
		609
		610	When a process created by AnyEvent::Fork exits, it might do so by
		611	calling exit, or simply letting perl reach the end of the program,
		612	at which point Perl runs all destructors.
		613
		614	Not all destructors are fork-safe - for example, an object that
		615	represents the connection to an X display might tell the X server to
		616	free resources, which is inconvenient when the "real" object in the
		617	parent still needs to use them.
		618
		619	This is obviously not a problem for AnyEvent::Fork::Early, as you
		620	used it as the very first thing, right?
		621
		622	It is a problem for AnyEvent::Fork::Template though - and the
		623	solution is to not create objects with nontrivial destructors that
		624	might have an effect outside of Perl.
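
If such an object cannot be avoided and its class is under your control, one common idiom (an assumption on my part, not something this module does or recommends) is to remember the creating process in the object and skip the external cleanup in any other process. The sketch below uses a hypothetical class "Local::Guard" whose destructor appends a line to a log file as a stand-in for "telling the X server to free resources".

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

package Local::Guard;    # hypothetical class, for illustration only

sub new {
    my ($class, $log_path) = @_;
    # remember which process created the object
    return bless { owner_pid => $$, log_path => $log_path }, $class;
}

sub DESTROY {
    my ($self) = @_;
    # forked children inherit the object, but must not release the resource
    return unless $$ == $self->{owner_pid};
    open my $fh, ">>", $self->{log_path} or return;
    print $fh "released by $$\n";
    close $fh;
}

package main;

# Hypothetical helper: returns how often the "resource" was released.
sub count_releases {
    my ($tmp_fh, $path) = tempfile ();
    close $tmp_fh;

    my $guard = Local::Guard->new ($path);

    my $pid = fork;
    die "fork: $!" unless defined $pid;
    exit 0 if $pid == 0;    # child: perl runs the destructor on exit
    waitpid $pid, 0;

    undef $guard;           # parent: the one intended release

    open my $in, "<", $path or die "open: $!";
    my @lines = <$in>;
    close $in;
    unlink $path;
    return scalar @lines;
}

print "releases: ", count_releases (), "\n";
```

Without the pid check in DESTROY, the child's exit would release the resource as well, which is the situation described above.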
410		625
411	PORTABILITY NOTES	626	PORTABILITY NOTES
412	Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a	627	Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a
413	nop, and ::Template is not going to work), and it cost a lot of blood	628	nop, and ::Template is not going to work), and it cost a lot of blood
414	and sweat to make it so, mostly due to the bloody broken perl that	629	and sweat to make it so, mostly due to the bloody broken perl that
415	nobody seems to care about. The fork emulation is a bad joke - I have	630	nobody seems to care about. The fork emulation is a bad joke - I have
416	yet to see something useful that you cna do with it without running into	631	yet to see something useful that you can do with it without running into
417	memory corruption issues or other braindamage. Hrrrr.	632	memory corruption issues or other braindamage. Hrrrr.
418		633
419	Cygwin perl is not supported at the moment, as it should implement fd	634	Cygwin perl is not supported at the moment due to some hilarious
420	passing, but doesn't, and rolling my own is hard, as cygwin doesn't	635	shortcomings of its API - see IO::FDPoll for more details.
421	support enough functionality to do it.
422		636
423	SEE ALSO	637	SEE ALSO
424	AnyEvent::Fork::Early (to avoid executing a perl interpreter),	638	AnyEvent::Fork::Early, to avoid executing a perl interpreter at all
		639	(part of this distribution).
		640
425	AnyEvent::Fork::Template (to create a process by forking the main	641	AnyEvent::Fork::Template, to create a process by forking the main
426	program at a convenient time).	642	program at a convenient time (part of this distribution).
427		643
428	AUTHOR	644	AnyEvent::Fork::RPC, for simple RPC to child processes (on CPAN).
		645
		646	AUTHOR AND CONTACT INFORMATION
429	Marc Lehmann <schmorp@schmorp.de>	647	Marc Lehmann <schmorp@schmorp.de>
430	http://home.schmorp.de/	648	http://software.schmorp.de/pkg/AnyEvent-Fork
431		649
