[ViewVC] Diff of: cvs/AnyEvent-Fork/Fork.pm

Comparing AnyEvent-Fork/Fork.pm (file contents):
Revision 1.25 by root, Sat Apr 6 08:55:16 2013 UTC vs.
Revision 1.41 by root, Mon Apr 8 03:20:53 2013 UTC

…		…
27		27
28	Special care has been taken to make this module useful from other modules,	28	Special care has been taken to make this module useful from other modules,
29	while still supporting specialised environments such as L<App::Staticperl>	29	while still supporting specialised environments such as L<App::Staticperl>
30	or L<PAR::Packer>.	30	or L<PAR::Packer>.
31		31
32	=head1 WHAT THIS MODULE IS NOT	32	=head2 WHAT THIS MODULE IS NOT
33		33
34	This module only creates processes and lets you pass file handles and	34	This module only creates processes and lets you pass file handles and
35	strings to it, and run perl code. It does not implement any kind of RPC -	35	strings to it, and run perl code. It does not implement any kind of RPC -
36	there is no back channel from the process back to you, and there is no RPC	36	there is no back channel from the process back to you, and there is no RPC
37	or message passing going on.	37	or message passing going on.
…		…
40	in whatever way you like, use some message-passing module such	40	in whatever way you like, use some message-passing module such
41	as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use	41	as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use
42	L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,	42	L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
43	and so on.	43	and so on.
44		44
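Since the module deliberately leaves the protocol on the communications socket to you, here is a minimal sketch of one possible choice: length-prefixed JSON frames with plain blocking I/O instead of L<AnyEvent::Handle>. The names C<send_msg>, C<recv_msg> and C<roundtrip_via_child> are hypothetical and not part of AnyEvent::Fork, and the child is created with a plain C<fork> only to keep the sketch self-contained.

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use JSON::PP ();

my $json = JSON::PP->new->utf8->canonical;

# write one frame: 4-byte big-endian length, then the JSON payload
sub send_msg {
   my ($fh, $data) = @_;

   my $payload = $json->encode ($data);
   my $frame   = pack ("N", length $payload) . $payload;

   my $off = 0;
   while ($off < length $frame) {
      my $n = syswrite $fh, $frame, length ($frame) - $off, $off;
      defined $n or die "write: $!";
      $off += $n;
   }
}

# read exactly $len octets, or return undef on end of file
sub read_exact {
   my ($fh, $len) = @_;

   my $buf = "";
   while (length ($buf) < $len) {
      my $n = sysread $fh, $buf, $len - length ($buf), length ($buf);
      defined $n or die "read: $!";
      return undef if $n == 0;
   }

   return $buf;
}

# read one frame and decode it, or return undef on a clean end of file
sub recv_msg {
   my ($fh) = @_;

   my $header = read_exact ($fh, 4);
   return undef unless defined $header;

   my $payload = read_exact ($fh, unpack "N", $header);
   defined $payload or die "truncated message";

   return $json->decode ($payload);
}

# assumption: a plain fork stands in for the process AnyEvent::Fork would create
sub roundtrip_via_child {
   my ($request) = @_;

   socketpair my $master, my $slave, AF_UNIX, SOCK_STREAM, PF_UNSPEC
      or die "socketpair: $!";

   my $pid = fork;
   defined $pid or die "fork: $!";

   if ($pid == 0) {
      close $master;
      my $got = recv_msg ($slave);
      send_msg ($slave, { echo => $got, pid => $$ });
      POSIX::_exit (0);
   }

   close $slave;
   send_msg ($master, $request);
   my $reply = recv_msg ($master);
   waitpid $pid, 0;

   return $reply;
}
```

With AnyEvent::Fork, the same two helpers could presumably be used unchanged on the blocking socket in the new process, while the parent side would normally use an event-driven reader instead of C<recv_msg>.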
		45	=head2 COMPARISON TO OTHER MODULES
		46
		47	There is an abundance of modules on CPAN that do "something fork", such as
		48	L<Parallel::ForkManager>, L<AnyEvent::ForkManager>, L<AnyEvent::Worker>
		49	or L<AnyEvent::Subprocess>. There are modules that implement their own
		50	process management, such as L<AnyEvent::DBI>.
		51
		52	The problems that all these modules try to solve are real, however, none
		53	of them (from what I have seen) tackle the very real problems of unwanted
		54	memory sharing, efficiency, not being able to use event processing or
		55	similar modules in the processes they create.
		56
		57	This module doesn't try to replace any of them - instead it tries to solve
		58	the problem of creating processes with a minimum of fuss and overhead (and
		59	also luxury). Ideally, most of these would use AnyEvent::Fork internally,
		60	except they were written before AnyEvent::Fork was available, so obviously
		61	had to roll their own.
		62
45	=head1 PROBLEM STATEMENT	63	=head2 PROBLEM STATEMENT
46		64
47	There are two traditional ways to implement parallel processing on UNIX	65	There are two traditional ways to implement parallel processing on UNIX
48	like operating systems - fork and process, and fork+exec and process. They	66	like operating systems - fork and process, and fork+exec and process. They
49	have different advantages and disadvantages that I describe below,	67	have different advantages and disadvantages that I describe below,
50	together with how this module tries to mitigate the disadvantages.	68	together with how this module tries to mitigate the disadvantages.
…		…
125	becomes very hard to use the event loop from a child program, as the	143	becomes very hard to use the event loop from a child program, as the
126	watchers already exist but are only meaningful in the parent. Worse, a	144	watchers already exist but are only meaningful in the parent. Worse, a
127	module might want to use such a module, not knowing whether another module	145	module might want to use such a module, not knowing whether another module
128	or the main program also does, leading to problems.	146	or the main program also does, leading to problems.
129		147
		148	Apart from event loops, graphical toolkits also commonly fall into the
		149	"unsafe module" category, or just about anything that communicates with
		150	the external world, such as network libraries and file I/O modules, which
		151	usually don't like being copied and then allowed to continue in two
		152	processes.
		153
130	With this module only the main program is allowed to create new processes	154	With this module only the main program is allowed to create new processes
131	by forking (because only the main program can know when it is still safe	155	by forking (because only the main program can know when it is still safe
132	to do so) - all other processes are created via fork+exec, which makes it	156	to do so) - all other processes are created via fork+exec, which makes it
133	possible to use modules such as event loops or window interfaces safely.	157	possible to use modules such as event loops or window interfaces safely.
134		158
…		…
146		170
147	# now $master_filehandle is connected to the	171	# now $master_filehandle is connected to the
148	# $slave_filehandle in the new process.	172	# $slave_filehandle in the new process.
149	});	173	});
150		174
151	# MyModule::worker might look like this	175	C<MyModule> might look like this:
		176
		177	package MyModule;
		178
152	sub MyModule::worker {	179	sub worker {
153	my ($slave_filehandle) = @_;	180	my ($slave_filehandle) = @_;
154		181
155	# now $slave_filehandle is connected to the $master_filehandle	182	# now $slave_filehandle is connected to the $master_filehandle
156	# in the original process. have fun!	183	# in the original process. have fun!
157	}	184	}
…		…
176	}	203	}
177		204
178	# now do other things - maybe use the filehandle provided by run	205	# now do other things - maybe use the filehandle provided by run
179	# to wait for the processes to die. or whatever.	206	# to wait for the processes to die. or whatever.
180		207
181	# My::Server::run might look like this	208	C<My::Server> might look like this:
182	sub My::Server::run {	209
		210	package My::Server;
		211
		212	sub run {
183	my ($slave, $listener, $id) = @_;	213	my ($slave, $listener, $id) = @_;
184		214
185	close $slave; # we do not use the socket, so close it to save resources	215	close $slave; # we do not use the socket, so close it to save resources
186		216
187	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,	217	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
…		…
191	}	221	}
192	}	222	}
193		223
194	=head2 use AnyEvent::Fork as a faster fork+exec	224	=head2 use AnyEvent::Fork as a faster fork+exec
195		225
196	This runs /bin/echo hi, with stdout redirected to /tmp/log and stderr to	226	This runs C</bin/echo hi>, with standard output redirected to /tmp/log
197	the communications socket. It is usually faster than fork+exec, but still	227	and standard error redirected to the communications socket. It is usually
198	let's you prepare the environment.	228	faster than fork+exec, but still lets you prepare the environment.
199		229
200	open my $output, ">/tmp/log" or die "$!";	230	open my $output, ">/tmp/log" or die "$!";
201		231
202	AnyEvent::Fork	232	AnyEvent::Fork
203	->new	233	->new
204	->eval ('	234	->eval ('
		235	# compile a helper function for later use
205	sub run {	236	sub run {
206	my ($fh, $output, @cmd) = @_;	237	my ($fh, $output, @cmd) = @_;
207		238
208	# perl will clear close-on-exec on STDOUT/STDERR	239	# perl will clear close-on-exec on STDOUT/STDERR
209	open STDOUT, ">&", $output or die;	240	open STDOUT, ">&", $output or die;
…		…
303	my ($fork_fh) = @_;	334	my ($fork_fh) = @_;
304	});	335	});
305		336
306	=back	337	=back
307		338
308	=head1 FUNCTIONS	339	=head1 THE C<AnyEvent::Fork> CLASS
		340
		341	This module exports nothing, and only implements a single class -
		342	C<AnyEvent::Fork>.
		343
		344	There are two class constructors that both create new processes - C<new>
		345	and C<new_exec>. The C<fork> method creates a new process by forking an
		346	existing one and could be considered a third constructor.
		347
		348	Most of the remaining methods deal with preparing the new process, by
		349	loading code, evaluating code and sending data to the new process. They
		350	usually return the process object, so you can chain method calls.
		351
		352	If a process object is destroyed before calling its C<run> method, then
		353	the process simply exits. After C<run> is called, all responsibility is
		354	passed to the specified function.
		355
		356	As long as there is any outstanding work to be done, process objects
		357	resist being destroyed, so there is no reason to store them unless you
		358	need them later - configure and forget works just fine.
309		359
310	=over 4	360	=over 4
311		361
312	=cut	362	=cut
313		363
…		…
320	use AnyEvent;	370	use AnyEvent;
321	use AnyEvent::Util ();	371	use AnyEvent::Util ();
322		372
323	use IO::FDPass;	373	use IO::FDPass;
324		374
325	our $VERSION = 0.5;	375	our $VERSION = 0.6;
326
327	our $PERL; # the path to the perl interpreter, deduced with various forms of magic
328
329	=item my $pool = new AnyEvent::Fork key => value...
330
331	Create a new process pool. The following named parameters are supported:
332		376
333	=over 4	377	=over 4
334		378
335	=back	379	=back
336		380
…		…
415	if ($pid eq 0) {	459	if ($pid eq 0) {
416	require AnyEvent::Fork::Serve;	460	require AnyEvent::Fork::Serve;
417	$AnyEvent::Fork::Serve::OWNER = $parent;	461	$AnyEvent::Fork::Serve::OWNER = $parent;
418	close $fh;	462	close $fh;
419	$0 = "$_[1] of $parent";	463	$0 = "$_[1] of $parent";
420	$SIG{CHLD} = 'IGNORE';
421	AnyEvent::Fork::Serve::serve ($slave);	464	AnyEvent::Fork::Serve::serve ($slave);
422	exit 0;	465	exit 0;
423	} elsif (!$pid) {	466	} elsif (!$pid) {
424	die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";	467	die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
425	}	468	}
…		…
432	Create a new "empty" perl interpreter process and returns its process	475	Create a new "empty" perl interpreter process and returns its process
433	object for further manipulation.	476	object for further manipulation.
434		477
435	The new process is forked from a template process that is kept around	478	The new process is forked from a template process that is kept around
436	for this purpose. When it doesn't exist yet, it is created by a call to	479	for this purpose. When it doesn't exist yet, it is created by a call to
437	C<new_exec> and kept around for future calls.	480	C<new_exec> first and then stays around for future calls.
438
439	When the process object is destroyed, it will release the file handle
440	that connects it with the new process. When the new process has not yet
441	called C<run>, then the process will exit. Otherwise, what happens depends
442	entirely on the code that is executed.
443		481
444	=cut	482	=cut
445		483
446	sub new {	484	sub new {
447	my $class = shift;	485	my $class = shift;
…		…
537	}	575	}
538		576
539	=item $pid = $proc->pid	577	=item $pid = $proc->pid
540		578
541	Returns the process id of the process I<iff it is a direct child of the	579	Returns the process id of the process I<iff it is a direct child of the
542	process> running AnyEvent::Fork, and C<undef> otherwise.	580	process running AnyEvent::Fork>, and C<undef> otherwise.
543		581
544	Normally, only processes created via C<< AnyEvent::Fork->new_exec >> and	582	Normally, only processes created via C<< AnyEvent::Fork->new_exec >> and
545	L<AnyEvent::Fork::Template> are direct children, and you are responsible	583	L<AnyEvent::Fork::Template> are direct children, and you are responsible
546	to clean up their zombies when they die.	584	to clean up their zombies when they die.
547		585
548	All other processes are not direct children, and will be cleaned up by	586	All other processes are not direct children, and will be cleaned up by
549	AnyEvent::Fork.	587	AnyEvent::Fork itself.
550		588
551	=cut	589	=cut
552		590
553	sub pid {	591	sub pid {
554	$_[0][0]	592	$_[0][0]
…		…
565		603
566	The code will usually be executed after this call returns, and there is no	604	The code will usually be executed after this call returns, and there is no
567	way to pass anything back to the calling process. Any evaluation errors	605	way to pass anything back to the calling process. Any evaluation errors
568	will be reported to stderr and cause the process to exit.	606	will be reported to stderr and cause the process to exit.
569		607
570	If you want to execute some code to take over the process (see the	608	If you want to execute some code (that isn't in a module) to take over the
571	"fork+exec" example in the SYNOPSIS), you should compile a function via	609	process, you should compile a function via C<eval> first, and then call
572	C<eval> first, and then call it via C<run>. This also gives you access to	610	it via C<run>. This also gives you access to any arguments passed via the
573	any arguments passed via the C<send_xxx> methods, such as file handles.	611	C<send_xxx> methods, such as file handles. See the L<use AnyEvent::Fork as
		612	a faster fork+exec> example to see it in action.
574		613
575	Returns the process object for easy chaining of method calls.	614	Returns the process object for easy chaining of method calls.
576		615
577	=cut	616	=cut
578		617
…		…
604	=item $proc = $proc->send_fh ($handle, ...)	643	=item $proc = $proc->send_fh ($handle, ...)
605		644
606	Send one or more file handles (I<not> file descriptors) to the process,	645	Send one or more file handles (I<not> file descriptors) to the process,
607	to prepare a call to C<run>.	646	to prepare a call to C<run>.
608		647
609	The process object keeps a reference to the handles until this is done,	648	The process object keeps a reference to the handles until they have
610	so you must not explicitly close the handles. This is most easily	649	been passed over to the process, so you must not explicitly close the
611	accomplished by simply not storing the file handles anywhere after passing	650	handles. This is most easily accomplished by simply not storing the file
612	them to this method.	651	handles anywhere after passing them to this method - when AnyEvent::Fork
		652	is finished using them, perl will automatically close them.
613		653
614	Returns the process object for easy chaining of method calls.	654	Returns the process object for easy chaining of method calls.
615		655
616	Example: pass a file handle to a process, and release it without	656	Example: pass a file handle to a process, and release it without
617	closing. It will be closed automatically when it is no longer used.	657	closing. It will be closed automatically when it is no longer used.
…		…
633	}	673	}
634		674
635	=item $proc = $proc->send_arg ($string, ...)	675	=item $proc = $proc->send_arg ($string, ...)
636		676
637	Send one or more argument strings to the process, to prepare a call to	677	Send one or more argument strings to the process, to prepare a call to
638	C<run>. The strings can be any octet string.	678	C<run>. The strings can be any octet strings.
639		679
640	The protocol is optimised to pass a moderate number of relatively short	680	The protocol is optimised to pass a moderate number of relatively short
641	strings - while you can pass up to 4GB of data in one go, this is more	681	strings - while you can pass up to 4GB of data in one go, this is more
642	meant to pass some ID information or other startup info, not big chunks of	682	meant to pass some ID information or other startup info, not big chunks of
643	data.	683	data.
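Because C<send_arg> takes octet strings, text containing characters above 255 has to be encoded first. A minimal sketch, assuming UTF-8 as the agreed encoding on both sides; C<text_to_arg> and C<arg_to_text> are hypothetical helper names, not part of AnyEvent::Fork.

```perl
use strict;
use warnings;
use Encode ();

# turn a perl text string into an octet string suitable as an argument
sub text_to_arg {
   my ($text) = @_;
   return Encode::encode ("UTF-8", $text);
}

# the inverse, to be used inside the function entered via run
sub arg_to_text {
   my ($octets) = @_;
   return Encode::decode ("UTF-8", $octets);
}
```

The parent would then pass C<text_to_arg ($title)> to C<send_arg>, and the function in the new process would call C<arg_to_text> on the corresponding argument.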
…		…
659	Enter the function specified by the function name in C<$func> in the	699	Enter the function specified by the function name in C<$func> in the
660	process. The function is called with the communication socket as first	700	process. The function is called with the communication socket as first
661	argument, followed by all file handles and string arguments sent earlier	701	argument, followed by all file handles and string arguments sent earlier
662	via C<send_fh> and C<send_arg> methods, in the order they were called.	702	via C<send_fh> and C<send_arg> methods, in the order they were called.
663		703
		704	The process object becomes unusable on return from this function - any
		705	further method calls result in undefined behaviour.
		706
664	The function name should be fully qualified, but if it isn't, it will be	707	The function name should be fully qualified, but if it isn't, it will be
665	looked up in the main package.	708	looked up in the C<main> package.
666		709
667	If the called function returns, doesn't exist, or any error occurs, the	710	If the called function returns, doesn't exist, or any error occurs, the
668	process exits.	711	process exits.
669		712
670	Preparing the process is done in the background - when all commands have	713	Preparing the process is done in the background - when all commands have
671	been sent, the callback is invoked with the local communications socket	714	been sent, the callback is invoked with the local communications socket
672	as argument. At this point you can start using the socket in any way you	715	as argument. At this point you can start using the socket in any way you
673	like.	716	like.
674
675	The process object becomes unusable on return from this function - any
676	further method calls result in undefined behaviour.
677		717
678	If the communication socket isn't used, it should be closed on both sides,	718	If the communication socket isn't used, it should be closed on both sides,
679	to save on kernel memory.	719	to save on kernel memory.
680		720
681	The socket is non-blocking in the parent, and blocking in the newly	721	The socket is non-blocking in the parent, and blocking in the newly
…		…
756	479 vfork+execs per second, using AnyEvent::Fork->new_exec	796	479 vfork+execs per second, using AnyEvent::Fork->new_exec
757		797
758	So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even	798	So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even
759	though it uses the same operations, but adds a lot of overhead?	799	though it uses the same operations, but adds a lot of overhead?
760		800
761	The difference is simply the process size: forking the 6MB process takes	801	The difference is simply the process size: forking the 5MB process takes
762	so much longer than forking the 2.5MB template process that the overhead	802	so much longer than forking the 2.5MB template process that the extra
763	introduced is canceled out.	803	overhead introduced is canceled out.
764		804
765	If the benchmark process grows, the normal fork becomes even slower:	805	If the benchmark process grows, the normal fork becomes even slower:
766		806
767	1340 new processes, manual fork in a 20MB process	807	1340 new processes, manual fork of a 20MB process
768	731 new processes, manual fork in a 200MB process	808	731 new processes, manual fork of a 200MB process
769	235 new processes, manual fork in a 2000MB process	809	235 new processes, manual fork of a 2000MB process
770		810
771	What that means (to me) is that I can use this module without having a	811	What that means (to me) is that I can use this module without having a bad
772	very bad conscience because of the extra overhead required to start new	812	conscience because of the extra overhead required to start new processes.
773	processes.
774		813
775	=head1 TYPICAL PROBLEMS	814	=head1 TYPICAL PROBLEMS
776		815
777	This section lists typical problems that remain. I hope by recognising	816	This section lists typical problems that remain. I hope by recognising
778	them, most can be avoided.	817	them, most can be avoided.
779		818
780	=over 4	819	=over 4
781		820
782	=item "leaked" file descriptors for exec'ed processes	821	=item leaked file descriptors for exec'ed processes
783		822
784	POSIX systems inherit file descriptors by default when exec'ing a new	823	POSIX systems inherit file descriptors by default when exec'ing a new
785	process. While perl itself laudably sets the close-on-exec flags on new	824	process. While perl itself laudably sets the close-on-exec flags on new
786	file handles, most C libraries don't care, and even if all cared, it's	825	file handles, most C libraries don't care, and even if all cared, it's
787	often not possible to set the flag in a race-free manner.	826	often not possible to set the flag in a race-free manner.
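To see which of your own handles would survive an exec, the close-on-exec flag can be inspected and changed with C<fcntl>. A minimal sketch using only the core L<Fcntl> module; the helper names C<is_cloexec> and C<set_cloexec> are made up for this illustration.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# return 1 if the handle is closed on exec, 0 if it would be inherited
sub is_cloexec {
   my ($fh) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "fcntl F_GETFD: $!";

   return (($flags + 0) & FD_CLOEXEC) ? 1 : 0;
}

# set ($on true) or clear ($on false) the close-on-exec flag
sub set_cloexec {
   my ($fh, $on) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "fcntl F_GETFD: $!";

   $flags += 0; # fcntl returns "0 but true" for zero
   $flags = $on ? ($flags | FD_CLOEXEC) : ($flags & ~FD_CLOEXEC);

   fcntl $fh, F_SETFD, $flags or die "fcntl F_SETFD: $!";

   return $fh;
}
```

This only helps for descriptors you know about. Descriptors opened inside C libraries are not visible as perl file handles, which is exactly the case described above.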
…		…
807	libraries or the code that leaks those file descriptors.	846	libraries or the code that leaks those file descriptors.
808		847
809	Fortunately, most of these leaked descriptors do no harm, other than	848	Fortunately, most of these leaked descriptors do no harm, other than
810	sitting on some resources.	849	sitting on some resources.
811		850
812	=item "leaked" file descriptors for fork'ed processes	851	=item leaked file descriptors for fork'ed processes
813		852
814	Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,	853	Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
815	which closes file descriptors not marked for being inherited.	854	which closes file descriptors not marked for being inherited.
816		855
817	However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer	856	However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
…		…
826		865
827	The solution is to either not load these modules before use'ing	866	The solution is to either not load these modules before use'ing
828	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay	867	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
829	initialising them, for example, by calling C<init Gtk2> manually.	868	initialising them, for example, by calling C<init Gtk2> manually.
830		869
831	=item exit runs destructors	870	=item exiting calls object destructors
832		871
833	This only applies to users of Lc<AnyEvent::Fork::Early> and	872	This only applies to users of L<AnyEvent::Fork::Early> and
834	L<AnyEvent::Fork::Template>.	873	L<AnyEvent::Fork::Template>, or when initialising code creates objects
		874	that reference external resources.
835		875
836	When a process created by AnyEvent::Fork exits, it might do so by calling	876	When a process created by AnyEvent::Fork exits, it might do so by calling
837	exit, or simply letting perl reach the end of the program. At which point	877	exit, or simply letting perl reach the end of the program. At which point
838	Perl runs all destructors.	878	Perl runs all destructors.
839		879
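The effect can be observed with plain C<fork>. The sketch below is an illustration, not code from this module: a hypothetical C<ExitDemo::Guard> object reports through a pipe when its destructor runs, and the child exits either via C<exit> or via C<POSIX::_exit>. Using C<POSIX::_exit> as a way to skip destructors is an assumption of this example.

```perl
use strict;
use warnings;
use POSIX ();

{
   package ExitDemo::Guard; # hypothetical class, for illustration only

   sub new {
      my ($class, $fh) = @_;
      return bless { fh => $fh }, $class;
   }

   # the destructor reports that it ran by writing to the pipe
   sub DESTROY {
      my ($self) = @_;
      syswrite $self->{fh}, "destroyed\n";
   }
}

# fork a child that holds a guard object and exits in the requested way,
# then return whatever the destructor wrote (or the empty string)
sub child_output {
   my ($how) = @_;

   pipe my $reader, my $writer or die "pipe: $!";

   my $pid = fork;
   defined $pid or die "fork: $!";

   if ($pid == 0) {
      close $reader;
      my $guard = ExitDemo::Guard->new ($writer);

      exit 0 if $how eq "exit"; # normal exit: perl runs destructors
      POSIX::_exit (0);         # immediate exit: no destructors, no END blocks
   }

   close $writer;
   my $output = do { local $/; <$reader> };
   waitpid $pid, 0;

   return defined $output ? $output : "";
}
```

In a real program the destructor would do something less harmless than writing to a pipe, for example sending a shutdown message over a connection that the parent still uses.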
…		…
858	to make it so, mostly due to the bloody broken perl that nobody seems to	898	to make it so, mostly due to the bloody broken perl that nobody seems to
859	care about. The fork emulation is a bad joke - I have yet to see something	899	care about. The fork emulation is a bad joke - I have yet to see something
860	useful that you can do with it without running into memory corruption	900	useful that you can do with it without running into memory corruption
861	issues or other braindamage. Hrrrr.	901	issues or other braindamage. Hrrrr.
862		902
863	Cygwin perl is not supported at the moment, as it should implement fd	903	Cygwin perl is not supported at the moment due to some hilarious
864	passing, but doesn't, and rolling my own is hard, as cygwin doesn't	904	shortcomings of its API - see L<IO::FDPass> for more details.
865	support enough functionality to do it.
866		905
867	=head1 SEE ALSO	906	=head1 SEE ALSO
868		907
869	L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),	908	L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),
870	L<AnyEvent::Fork::Template> (to create a process by forking the main	909	L<AnyEvent::Fork::Template> (to create a process by forking the main
