[ViewVC] Diff of: cvs/AnyEvent-Fork/Fork.pm

Comparing AnyEvent-Fork/Fork.pm (file contents):
Revision 1.29 by root, Sat Apr 6 09:15:49 2013 UTC vs.
Revision 1.63 by root, Wed Nov 26 13:36:18 2014 UTC

…		…
27		27
28	Special care has been taken to make this module useful from other modules,	28	Special care has been taken to make this module useful from other modules,
29	while still supporting specialised environments such as L<App::Staticperl>	29	while still supporting specialised environments such as L<App::Staticperl>
30	or L<PAR::Packer>.	30	or L<PAR::Packer>.
31		31
32	=head1 WHAT THIS MODULE IS NOT	32	=head2 WHAT THIS MODULE IS NOT
33		33
34	This module only creates processes and lets you pass file handles and	34	This module only creates processes and lets you pass file handles and
35	strings to it, and run perl code. It does not implement any kind of RPC -	35	strings to it, and run perl code. It does not implement any kind of RPC -
36	there is no back channel from the process back to you, and there is no RPC	36	there is no back channel from the process back to you, and there is no RPC
37	or message passing going on.	37	or message passing going on.
38		38
39	If you need some form of RPC, you can either implement it yourself	39	If you need some form of RPC, you could use the L<AnyEvent::Fork::RPC>
40	in whatever way you like, use some message-passing module such	40	companion module, which adds simple RPC/job queueing to a process created
41	as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use	41	by this module.
42	L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
43	and so on.
44		42
		43	And if you need some automatic process pool management on top of
		44	L<AnyEvent::Fork::RPC>, you can look at the L<AnyEvent::Fork::Pool>
		45	companion module.
		46
		47	Or you can implement it yourself in whatever way you like: use some
		48	message-passing module such as L<AnyEvent::MP>, some pipe such as
		49	L<AnyEvent::ZeroMQ>, use L<AnyEvent::Handle> on both sides to send
		50	e.g. JSON or Storable messages, and so on.
		51
		52	=head2 COMPARISON TO OTHER MODULES
		53
		54	There is an abundance of modules on CPAN that do "something fork", such as
		55	L<Parallel::ForkManager>, L<AnyEvent::ForkManager>, L<AnyEvent::Worker>
		56	or L<AnyEvent::Subprocess>. There are modules that implement their own
		57	process management, such as L<AnyEvent::DBI>.
		58
		59	The problems that all these modules try to solve are real, however, none
		60	of them (from what I have seen) tackle the very real problems of unwanted
		61	memory sharing, efficiency or not being able to use event processing, GUI
		62	toolkits or similar modules in the processes they create.
		63
		64	This module doesn't try to replace any of them - instead it tries to solve
		65	the problem of creating processes with a minimum of fuss and overhead (and
		66	also luxury). Ideally, most of these would use AnyEvent::Fork internally,
		67	except they were written before AnyEvent::Fork was available, so obviously
		68	had to roll their own.
		69
45	=head1 PROBLEM STATEMENT	70	=head2 PROBLEM STATEMENT
46		71
47	There are two traditional ways to implement parallel processing on UNIX	72	There are two traditional ways to implement parallel processing on UNIX
48	like operating systems - fork and process, and fork+exec and process. They	73	like operating systems - fork and process, and fork+exec and process. They
49	have different advantages and disadvantages that I describe below,	74	have different advantages and disadvantages that I describe below,
50	together with how this module tries to mitigate the disadvantages.	75	together with how this module tries to mitigate the disadvantages.
…		…
64		89
65	=item Forking usually creates a copy-on-write copy of the parent	90	=item Forking usually creates a copy-on-write copy of the parent
66	process.	91	process.
67		92
68	For example, modules or data files that are loaded will not use additional	93	For example, modules or data files that are loaded will not use additional
69	memory after a fork. When exec'ing a new process, modules and data files	94	memory after a fork. Exec'ing a new process, in contrast, means modules
70	might need to be loaded again, at extra CPU and memory cost. But when	95	and data files might need to be loaded again, at extra CPU and memory
71	forking, literally all data structures are copied - if the program frees	96	cost.
		97
		98	But when forking, you still create a copy of your data structures - if
72	them and replaces them by new data, the child processes will retain the	99	the program frees them and replaces them by new data, the child processes
73	old version even if it isn't used, which can suddenly and unexpectedly	100	will retain the old version even if it isn't used, which can suddenly and
74	increase memory usage when freeing memory.	101	unexpectedly increase memory usage when freeing memory.
75		102
		103	For example, L<Gtk2::CV> is an image viewer optimised for large
		104	directories (millions of pictures). It also forks subprocesses for
		105	thumbnail generation, which inherit the data structure that stores all
		106	file information. If the user changes the directory, it gets freed in
		107	the main process, leaving a copy in the thumbnailer processes. This can
		108	lead to many times the memory usage that would actually be required. The
		109	solution is to fork early (and be unable to dynamically generate more
		110	subprocesses or do this from a module)... or to use L<AnyEvent::Fork>.
		111
76	The trade-off is between more sharing with fork (which can be good or	112	There is a trade-off between more sharing with fork (which can be good or
77	bad), and no sharing with exec.	113	bad), and no sharing with exec.
78		114
79	This module allows the main program to do a controlled fork, and allows	115	This module allows the main program to do a controlled fork, and allows
80	modules to exec processes safely at any time. When creating a custom	116	modules to exec processes safely at any time. When creating a custom
81	process pool you can take advantage of data sharing via fork without	117	process pool you can take advantage of data sharing via fork without
…		…
86	shared and what isn't, at all times.	122	shared and what isn't, at all times.
87		123
88	=item Exec'ing a new perl process might be difficult.	124	=item Exec'ing a new perl process might be difficult.
89		125
90	For example, it is not easy to find the correct path to the perl	126	For example, it is not easy to find the correct path to the perl
91	interpreter - C<$^X> might not be a perl interpreter at all.	127	interpreter - C<$^X> might not be a perl interpreter at all. Worse, there
		128	might not even be a perl binary installed on the system.
92		129
93	This module tries hard to identify the correct path to the perl	130	This module tries hard to identify the correct path to the perl
94	interpreter. With a cooperative main program, exec'ing the interpreter	131	interpreter. With a cooperative main program, exec'ing the interpreter
95	might not even be necessary, but even without help from the main program,	132	might not even be necessary, but even without help from the main program,
96	it will still work when used from a module.	133	it will still work when used from a module.
…		…
102	and modules are no longer loadable because they refer to a different	139	and modules are no longer loadable because they refer to a different
103	perl version, or parts of a distribution are newer than the ones already	140	perl version, or parts of a distribution are newer than the ones already
104	loaded.	141	loaded.
105		142
106	This module supports creating pre-initialised perl processes to be used as	143	This module supports creating pre-initialised perl processes to be used as
107	a template for new processes.	144	a template for new processes at a later time, e.g. for use in a process
		145	pool.
108		146
109	=item Forking might be impossible when a program is running.	147	=item Forking might be impossible when a program is running.
110		148
111	For example, POSIX makes it almost impossible to fork from a	149	For example, POSIX makes it almost impossible to fork from a
112	multi-threaded program while doing anything useful in the child - in	150	multi-threaded program while doing anything useful in the child - in
113	fact, if your perl program uses POSIX threads (even indirectly via	151	fact, if your perl program uses POSIX threads (even indirectly via
114	e.g. L<IO::AIO> or L<threads>), you cannot call fork on the perl level	152	e.g. L<IO::AIO> or L<threads>), you cannot call fork on the perl level
115	anymore without risking corruption issues on a number of operating	153	anymore without risking memory corruption or worse on a number of
116	systems.	154	operating systems.
117		155
118	This module can safely fork helper processes at any time, by calling	156	This module can safely fork helper processes at any time, by calling
119	fork+exec in C, in a POSIX-compatible way (via L<Proc::FastSpawn>).	157	fork+exec in C, in a POSIX-compatible way (via L<Proc::FastSpawn>).
120		158
121	=item Parallel processing with fork might be inconvenient or difficult	159	=item Parallel processing with fork might be inconvenient or difficult
…		…
152		190
153	# now $master_filehandle is connected to the	191	# now $master_filehandle is connected to the
154	# $slave_filehandle in the new process.	192	# $slave_filehandle in the new process.
155	});	193	});
156		194
157	# MyModule::worker might look like this	195	C<MyModule> might look like this:
		196
		197	package MyModule;
		198
158	sub MyModule::worker {	199	sub worker {
159	my ($slave_filehandle) = @_;	200	my ($slave_filehandle) = @_;
160		201
161	# now $slave_filehandle is connected to the $master_filehandle	202	# now $slave_filehandle is connected to the $master_filehandle
162	# in the original process. have fun!	203	# in the original process. have fun!
163	}	204	}
…		…
182	}	223	}
183		224
184	# now do other things - maybe use the filehandle provided by run	225	# now do other things - maybe use the filehandle provided by run
185	# to wait for the processes to die. or whatever.	226	# to wait for the processes to die. or whatever.
186		227
187	# My::Server::run might look like this	228	C<My::Server> might look like this:
188	sub My::Server::run {	229
		230	package My::Server;
		231
		232	sub run {
189	my ($slave, $listener, $id) = @_;	233	my ($slave, $listener, $id) = @_;
190		234
191	close $slave; # we do not use the socket, so close it to save resources	235	close $slave; # we do not use the socket, so close it to save resources
192		236
193	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,	237	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
…		…
197	}	241	}
198	}	242	}
199		243
200	=head2 use AnyEvent::Fork as a faster fork+exec	244	=head2 use AnyEvent::Fork as a faster fork+exec
201		245
202	This runs /bin/echo hi, with stdout redirected to /tmp/log and stderr to	246	This runs C</bin/echo hi>, with standard output redirected to F</tmp/log>
203	the communications socket. It is usually faster than fork+exec, but still	247	and standard error redirected to the communications socket. It is usually
204	let's you prepare the environment.	248	faster than fork+exec, but still lets you prepare the environment.
205		249
206	open my $output, ">/tmp/log" or die "$!";	250	open my $output, ">/tmp/log" or die "$!";
207		251
208	AnyEvent::Fork	252	AnyEvent::Fork
209	->new	253	->new
210	->eval ('	254	->eval ('
		255	# compile a helper function for later use
211	sub run {	256	sub run {
212	my ($fh, $output, @cmd) = @_;	257	my ($fh, $output, @cmd) = @_;
213		258
214	# perl will clear close-on-exec on STDOUT/STDERR	259	# perl will clear close-on-exec on STDOUT/STDERR
215	open STDOUT, ">&", $output or die;	260	open STDOUT, ">&", $output or die;
…		…
222	->send_arg ("/bin/echo", "hi")	267	->send_arg ("/bin/echo", "hi")
223	->run ("run", my $cv = AE::cv);	268	->run ("run", my $cv = AE::cv);
224		269
225	my $stderr = $cv->recv;	270	my $stderr = $cv->recv;
226		271
		272	=head2 For stingy users: put the worker code into a C<DATA> section.
		273
		274	When you want to be stingy with files, you can put your code into the
		275	C<DATA> section of your module (or program):
		276
		277	use AnyEvent::Fork;
		278
		279	AnyEvent::Fork
		280	->new
		281	->eval (do { local $/; <DATA> })
		282	->run ("doit", sub { ... });
		283
		284	__DATA__
		285
		286	sub doit {
		287	... do something!
		288	}
		289
		290	=head2 For stingy standalone programs: do not rely on external files at
		291	all.
		292
		293	For single-file scripts it can be inconvenient to rely on external
		294	files - even when using a C<DATA> section, you still need to C<exec> an
		295	external perl interpreter, which might not be available when using
		296	L<App::Staticperl>, L<Urlader> or L<PAR::Packer> for example.
		297
		298	Two modules help here - L<AnyEvent::Fork::Early> forks a template process
		299	for all further calls to C<new_exec>, and L<AnyEvent::Fork::Template>
		300	forks the main program as a template process.
		301
		302	Here is what your main program should look like:
		303
		304	#! perl
		305
		306	# optional, as the very first thing.
		307	# in case modules want to create their own processes.
		308	use AnyEvent::Fork::Early;
		309
		310	# next, load all modules you need in your template process
		311	use Example::My::Module;
		312	use Example::Whatever;
		313
		314	# next, put your run function definition and anything else you
		315	# need, but do not use code outside of BEGIN blocks.
		316	sub worker_run {
		317	my ($fh, @args) = @_;
		318	...
		319	}
		320
		321	# now preserve everything so far as AnyEvent::Fork object
		322	# in $TEMPLATE.
		323	use AnyEvent::Fork::Template;
		324
		325	# do not put code outside of BEGIN blocks until here
		326
		327	# now use the $TEMPLATE process in any way you like
		328
		329	# for example: create 10 worker processes
		330	my @worker;
		331	my $cv = AE::cv;
		332	for (1..10) {
		333	$cv->begin;
		334	$TEMPLATE->fork->send_arg ($_)->run ("worker_run", sub {
		335	push @worker, shift;
		336	$cv->end;
		337	});
		338	}
		339	$cv->recv;
		340
227	=head1 CONCEPTS	341	=head1 CONCEPTS
228		342
229	This module can create new processes either by executing a new perl	343	This module can create new processes either by executing a new perl
230	process, or by forking from an existing "template" process.	344	process, or by forking from an existing "template" process.
		345
		346	All these processes are called "child processes" (whether they are direct
		347	children or not), while the process that manages them is called the
		348	"parent process".
231		349
232	Each such process comes with its own file handle that can be used to	350	Each such process comes with its own file handle that can be used to
233	communicate with it (it's actually a socket - one end in the new process,	351	communicate with it (it's actually a socket - one end in the new process,
234	one end in the main process), and among the things you can do in it are	352	one end in the main process), and among the things you can do in it are
235	load modules, fork new processes, send file handles to it, and execute	353	load modules, fork new processes, send file handles to it, and execute
…		…
345	use AnyEvent;	463	use AnyEvent;
346	use AnyEvent::Util ();	464	use AnyEvent::Util ();
347		465
348	use IO::FDPass;	466	use IO::FDPass;
349		467
350	our $VERSION = 0.5;	468	our $VERSION = 1.2;
351
352	our $PERL; # the path to the perl interpreter, deduces with various forms of magic
353
354	=over 4
355
356	=back
357
358	=cut
359		469
360	# the early fork template process	470	# the early fork template process
361	our $EARLY;	471	our $EARLY;
362		472
363	# the empty template process	473	# the empty template process
364	our $TEMPLATE;	474	our $TEMPLATE;
		475
		476	sub QUEUE() { 0 }
		477	sub FH() { 1 }
		478	sub WW() { 2 }
		479	sub PID() { 3 }
		480	sub CB() { 4 }
		481
		482	sub _new {
		483	my ($self, $fh, $pid) = @_;
		484
		485	AnyEvent::Util::fh_nonblocking $fh, 1;
		486
		487	$self = bless [
		488	[], # write queue - strings or fd's
		489	$fh,
		490	undef, # AE watcher
		491	$pid,
		492	], $self;
		493
		494	$self
		495	}
365		496
366	sub _cmd {	497	sub _cmd {
367	my $self = shift;	498	my $self = shift;
368		499
369	# ideally, we would want to use "a (w/a)*" as format string, but perl	500	# ideally, we would want to use "a (w/a)*" as format string, but perl
370	# versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack	501	# versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack
371	# it.	502	# it.
372	push @{ $self->[2] }, pack "a L/a*", $_[0], $_[1];	503	push @{ $self->[QUEUE] }, pack "a L/a*", $_[0], $_[1];
373		504
374	$self->[3] ||= AE::io $self->[1], 1, sub {	505	$self->[WW] ||= AE::io $self->[FH], 1, sub {
375	do {	506	do {
376	# send the next "thing" in the queue - either a reference to an fh,	507	# send the next "thing" in the queue - either a reference to an fh,
377	# or a plain string.	508	# or a plain string.
378		509
379	if (ref $self->[2][0]) {	510	if (ref $self->[QUEUE][0]) {
380	# send fh	511	# send fh
381	unless (IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }) {	512	unless (IO::FDPass::send fileno $self->[FH], fileno ${ $self->[QUEUE][0] }) {
382	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;	513	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
383	undef $self->[3];	514	undef $self->[WW];
384	die "AnyEvent::Fork: file descriptor send failure: $!";	515	die "AnyEvent::Fork: file descriptor send failure: $!";
385	}	516	}
386		517
387	shift @{ $self->[2] };	518	shift @{ $self->[QUEUE] };
388		519
389	} else {	520	} else {
390	# send string	521	# send string
391	my $len = syswrite $self->[1], $self->[2][0];	522	my $len = syswrite $self->[FH], $self->[QUEUE][0];
392		523
393	unless ($len) {	524	unless ($len) {
394	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;	525	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
395	undef $self->[3];	526	undef $self->[WW];
396	die "AnyEvent::Fork: command write failure: $!";	527	die "AnyEvent::Fork: command write failure: $!";
397	}	528	}
398		529
399	substr $self->[2][0], 0, $len, "";	530	substr $self->[QUEUE][0], 0, $len, "";
400	shift @{ $self->[2] } unless length $self->[2][0];	531	shift @{ $self->[QUEUE] } unless length $self->[QUEUE][0];
401	}	532	}
402	} while @{ $self->[2] };	533	} while @{ $self->[QUEUE] };
403		534
404	# everything written	535	# everything written
405	undef $self->[3];	536	undef $self->[WW];
406		537
407	# invoke run callback, if any	538	# invoke run callback, if any
408	$self->[4]->($self->[1]) if $self->[4];	539	if ($self->[CB]) {
		540	$self->[CB]->($self->[FH]);
		541	@$self = ();
		542	}
409	};	543	};
410		544
411	() # make sure we don't leak the watcher	545	() # make sure we don't leak the watcher
412	}
413
414	sub _new {
415	my ($self, $fh, $pid) = @_;
416
417	AnyEvent::Util::fh_nonblocking $fh, 1;
418
419	$self = bless [
420	$pid,
421	$fh,
422	[], # write queue - strings or fd's
423	undef, # AE watcher
424	], $self;
425
426	$self
427	}	546	}
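
The framing that C<_cmd> applies to each queued command can be tried in isolation. The following is a minimal sketch, not the module's own code: C<frame_cmd> and C<unframe_cmd> are hypothetical helper names, and the decoder only illustrates that the format round-trips; the real reader lives in L<AnyEvent::Fork::Serve> and may be structured differently.

```perl
use strict;
use warnings;

# Frame a command the way _cmd does: one type byte ("a"), then a
# native 32-bit length and the payload ("L/a*").
sub frame_cmd {
    my ($type, $payload) = @_;
    pack "a L/a*", $type, $payload
}

# Hypothetical decoder, for illustration only.
sub unframe_cmd {
    my ($frame) = @_;
    unpack "a L/a*", $frame
}

my $frame = frame_cmd (r => "MyModule::worker");
my ($type, $payload) = unframe_cmd ($frame);

printf "%d bytes on the wire, type=%s payload=%s\n",
    length $frame, $type, $payload;
```

Because the length prefix uses the native C<L> format, this framing is only meant for two processes on the same machine, which is the case for the socket pair used here.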
428		547
429	# fork template from current process, used by AnyEvent::Fork::Early/Template	548	# fork template from current process, used by AnyEvent::Fork::Early/Template
430	sub _new_fork {	549	sub _new_fork {
431	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;	550	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
…		…
436	if ($pid eq 0) {	555	if ($pid eq 0) {
437	require AnyEvent::Fork::Serve;	556	require AnyEvent::Fork::Serve;
438	$AnyEvent::Fork::Serve::OWNER = $parent;	557	$AnyEvent::Fork::Serve::OWNER = $parent;
439	close $fh;	558	close $fh;
440	$0 = "$_[1] of $parent";	559	$0 = "$_[1] of $parent";
441	$SIG{CHLD} = 'IGNORE';
442	AnyEvent::Fork::Serve::serve ($slave);	560	AnyEvent::Fork::Serve::serve ($slave);
443	exit 0;	561	exit 0;
444	} elsif (!$pid) {	562	} elsif (!$pid) {
445	die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";	563	die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
446	}	564	}
…		…
500		618
501	You should use C<new> whenever possible, except when having a template	619	You should use C<new> whenever possible, except when having a template
502	process around is unacceptable.	620	process around is unacceptable.
503		621
504	The path to the perl interpreter is divined using various methods - first	622	The path to the perl interpreter is divined using various methods - first
505	C<$^X> is investigated to see if the path ends with something that sounds	623	C<$^X> is investigated to see if the path ends with something that looks
506	as if it were the perl interpreter. Failing this, the module falls back to	624	as if it were the perl interpreter. Failing this, the module falls back to
507	using C<$Config::Config{perlpath}>.	625	using C<$Config::Config{perlpath}>.
508		626
		627	The path to perl can also be overridden by setting the global variable
		628	C<$AnyEvent::Fork::PERL> - its value will be used for all subsequent
		629	invocations.
		630
509	=cut	631	=cut
		632
		633	our $PERL;
510		634
511	sub new_exec {	635	sub new_exec {
512	my ($self) = @_;	636	my ($self) = @_;
513		637
514	return $EARLY->fork	638	return $EARLY->fork
515	if $EARLY;	639	if $EARLY;
516		640
		641	unless (defined $PERL) {
517	# first find path of perl	642	# first find path of perl
518	my $perl = $;	643	my $perl = $^X;
519		644
520	# first we try $^X, but the path must be absolute (always on win32), and end in sth.	645	# first we try $^X, but the path must be absolute (always on win32), and end in sth.
521	# that looks like perl. this obviously only works for posix and win32	646	# that looks like perl. this obviously only works for posix and win32
522	unless (	647	unless (
523	($^O eq "MSWin32" || $perl =~ m%^/%)	648	($^O eq "MSWin32" || $perl =~ m%^/%)
524	&& $perl =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i	649	&& $perl =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
525	) {	650	) {
526	# if it doesn't look perlish enough, try Config	651	# if it doesn't look perlish enough, try Config
527	require Config;	652	require Config;
528	$perl = $Config::Config{perlpath};	653	$perl = $Config::Config{perlpath};
529	$perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;	654	$perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;
		655	}
		656
		657	$PERL = $perl;
530	}	658	}
531		659
532	require Proc::FastSpawn;	660	require Proc::FastSpawn;
533		661
534	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;	662	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
…		…
542	#local $ENV{PERL5LIB} = join ":", grep !ref, @INC;	670	#local $ENV{PERL5LIB} = join ":", grep !ref, @INC;
543	my %env = %ENV;	671	my %env = %ENV;
544	$env{PERL5LIB} = join +($^O eq "MSWin32" ? ";" : ":"), grep !ref, @INC;	672	$env{PERL5LIB} = join +($^O eq "MSWin32" ? ";" : ":"), grep !ref, @INC;
545		673
546	my $pid = Proc::FastSpawn::spawn (	674	my $pid = Proc::FastSpawn::spawn (
547	$perl,	675	$PERL,
548	["perl", "-MAnyEvent::Fork::Serve", "-e", "AnyEvent::Fork::Serve::me", fileno $slave, $$],	676	["perl", "-MAnyEvent::Fork::Serve", "-e", "AnyEvent::Fork::Serve::me", fileno $slave, $$],
549	[map "$_=$env{$_}", keys %env],	677	[map "$_=$env{$_}", keys %env],
550	) or die "unable to spawn AnyEvent::Fork server: $!";	678	) or die "unable to spawn AnyEvent::Fork server: $!";
551		679
552	$self->_new ($fh, $pid)	680	$self->_new ($fh, $pid)
553	}	681	}
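
To see which interpreter paths pass the C<$^X> check in C<new_exec>, the test can be pulled out into a small standalone function. This is a sketch under assumptions: C<plausible_perl> is a hypothetical name, the operating system is passed in as an argument instead of being read from C<$^O>, and the sample paths are made up. The pattern itself is the one used above, written with brace delimiters.

```perl
use strict;
use warnings;

# Same pattern as in new_exec: the path must end in "perl", optionally
# followed by a dotted version number and/or ".exe".
my $looks_like_perl = qr{[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$}i;

# Hypothetical helper: returns 1 if $path would be accepted as-is,
# 0 if new_exec would fall back to $Config::Config{perlpath}.
sub plausible_perl {
    my ($path, $os) = @_;

    # on win32 any path form is accepted, elsewhere it must be absolute
    ($os eq "MSWin32" || $path =~ m{^/}) && $path =~ $looks_like_perl ? 1 : 0
}

for my $case (
    ["/usr/bin/perl",                        "linux"],
    ["/usr/bin/perl5.16.3",                  "linux"],
    ["/opt/app/bin/myapp",                   "linux"],
    ["perl",                                 "linux"],
    ["C:\\strawberry\\perl\\bin\\perl.exe",  "MSWin32"],
) {
    printf "%-40s %s\n", $case->[0], plausible_perl (@$case) ? "use as-is" : "fall back to Config";
}
```

A packed binary such as the C</opt/app/bin/myapp> sample fails the check, which is the situation the C<$Config::Config{perlpath}> fallback and the C<$AnyEvent::Fork::PERL> override are there for.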
554		682
555	=item $pid = $proc->pid	683	=item $pid = $proc->pid
556		684
557	Returns the process id of the process I<iff it is a direct child of the	685	Returns the process id of the process I<iff it is a direct child of the
558	process> running AnyEvent::Fork, and C<undef> otherwise.	686	process running AnyEvent::Fork>, and C<undef> otherwise. As a general
		687	rule (that you cannot rely upon), processes created via C<new_exec>,
		688	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template> are direct
		689	children, while all other processes are not.
559		690
560	Normally, only processes created via C<< AnyEvent::Fork->new_exec >> and	691	Or in other words, you do not normally have to take care of zombies for
561	L<AnyEvent::Fork::Template> are direct children, and you are responsible	692	processes created via C<new>, but when in doubt, or zombies are a problem,
562	to clean up their zombies when they die.	693	you need to check whether a process is a direct child by calling this
563		694	method, and possibly creating a child watcher or reaping it manually.
564	All other processes are not direct children, and will be cleaned up by
565	AnyEvent::Fork.
566		695
567	=cut	696	=cut
568		697
569	sub pid {	698	sub pid {
570	$_[0][0]	699	$_[0][PID]
571	}	700	}
572		701
573	=item $proc = $proc->eval ($perlcode, @args)	702	=item $proc = $proc->eval ($perlcode, @args)
574		703
575	Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to	704	Evaluates the given C<$perlcode> as ... Perl code, while setting C<@_> to
576	the strings specified by C<@args>, in the "main" package.	705	the strings specified by C<@args>, in the "main" package.
577		706
578	This call is meant to do any custom initialisation that might be required	707	This call is meant to do any custom initialisation that might be required
579	(for example, the C<require> method uses it). It's not supposed to be used	708	(for example, the C<require> method uses it). It's not supposed to be used
580	to completely take over the process, use C<run> for that.	709	to completely take over the process, use C<run> for that.
581		710
582	The code will usually be executed after this call returns, and there is no	711	The code will usually be executed after this call returns, and there is no
583	way to pass anything back to the calling process. Any evaluation errors	712	way to pass anything back to the calling process. Any evaluation errors
584	will be reported to stderr and cause the process to exit.	713	will be reported to stderr and cause the process to exit.
585		714
586	If you want to execute some code to take over the process (see the	715	If you want to execute some code (that isn't in a module) to take over the
587	"fork+exec" example in the SYNOPSIS), you should compile a function via	716	process, you should compile a function via C<eval> first, and then call
588	C<eval> first, and then call it via C<run>. This also gives you access to	717	it via C<run>. This also gives you access to any arguments passed via the
589	any arguments passed via the C<send_xxx> methods, such as file handles.	718	C<send_xxx> methods, such as file handles. See the L<use AnyEvent::Fork as
		719	a faster fork+exec> example to see it in action.
590		720
591	Returns the process object for easy chaining of method calls.	721	Returns the process object for easy chaining of method calls.
592		722
593	=cut	723	=cut
594		724
…		…
620	=item $proc = $proc->send_fh ($handle, ...)	750	=item $proc = $proc->send_fh ($handle, ...)
621		751
622	Send one or more file handles (I<not> file descriptors) to the process,	752	Send one or more file handles (I<not> file descriptors) to the process,
623	to prepare a call to C<run>.	753	to prepare a call to C<run>.
624		754
625	The process object keeps a reference to the handles until this is done,	755	The process object keeps a reference to the handles until they have
626	so you must not explicitly close the handles. This is most easily	756	been passed over to the process, so you must not explicitly close the
627	accomplished by simply not storing the file handles anywhere after passing	757	handles. This is most easily accomplished by simply not storing the file
628	them to this method.	758	handles anywhere after passing them to this method - when AnyEvent::Fork
		759	is finished using them, perl will automatically close them.
629		760
630	Returns the process object for easy chaining of method calls.	761	Returns the process object for easy chaining of method calls.
631		762
632	Example: pass a file handle to a process, and release it without	763	Example: pass a file handle to a process, and release it without
633	closing. It will be closed automatically when it is no longer used.	764	closing. It will be closed automatically when it is no longer used.
…		…
640	sub send_fh {	771	sub send_fh {
641	my ($self, @fh) = @_;	772	my ($self, @fh) = @_;
642		773
643	for my $fh (@fh) {	774	for my $fh (@fh) {
644	$self->_cmd ("h");	775	$self->_cmd ("h");
645	push @{ $self->[2] }, \$fh;	776	push @{ $self->[QUEUE] }, \$fh;
646	}	777	}
647		778
648	$self	779	$self
649	}	780	}
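
The write queue holds two kinds of entries: packed command strings, and references to file handles that still have to be passed over. The following sketch shows how the two can be told apart with C<ref>, as the watcher in C<_cmd> does. It is an illustration only: it uses a plain array and C<STDIN> as a stand-in handle, and it prints the entries instead of sending anything.

```perl
use strict;
use warnings;

my @queue;

# a command string is queued as a plain scalar ...
push @queue, pack "a L/a*", "h", "";

# ... followed by a reference to the handle it announces
my $fh = \*STDIN;    # stand-in for a handle given to send_fh
push @queue, \$fh;

my @seen;
for my $item (@queue) {
    if (ref $item) {
        # a reference: dereference once to get the handle back
        push @seen, sprintf "handle, fd %d", fileno $$item;
    } else {
        push @seen, sprintf "string, %d bytes", length $item;
    }
}

print "$_\n" for @seen;
```

Holding the handle by reference in the queue is also what keeps it open until it has been sent, which is why callers are told not to close it themselves.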
650		781
651	=item $proc = $proc->send_arg ($string, ...)	782	=item $proc = $proc->send_arg ($string, ...)
652		783
653	Send one or more argument strings to the process, to prepare a call to	784	Send one or more argument strings to the process, to prepare a call to
654	C<run>. The strings can be any octet string.	785	C<run>. The strings can be any octet strings.
655		786
656	The protocol is optimised to pass a moderate number of relatively short	787	The protocol is optimised to pass a moderate number of relatively short
657	strings - while you can pass up to 4GB of data in one go, this is more	788	strings - while you can pass up to 4GB of data in one go, this is more
658	meant to pass some ID information or other startup info, not big chunks of	789	meant to pass some ID information or other startup info, not big chunks of
659	data.	790	data.
…		…
675	Enter the function specified by the function name in C<$func> in the	806	Enter the function specified by the function name in C<$func> in the
676	process. The function is called with the communication socket as first	807	process. The function is called with the communication socket as first
677	argument, followed by all file handles and string arguments sent earlier	808	argument, followed by all file handles and string arguments sent earlier
678	via C<send_fh> and C<send_arg> methods, in the order they were called.	809	via C<send_fh> and C<send_arg> methods, in the order they were called.
679		810
		811	The process object becomes unusable on return from this function - any
		812	further method calls result in undefined behaviour.
		813
680	The function name should be fully qualified, but if it isn't, it will be	814	The function name should be fully qualified, but if it isn't, it will be
681	looked up in the main package.	815	looked up in the C<main> package.
682		816
683	If the called function returns, doesn't exist, or any error occurs, the	817	If the called function returns, doesn't exist, or any error occurs, the
684	process exits.	818	process exits.
685		819
686	Preparing the process is done in the background - when all commands have	820	Preparing the process is done in the background - when all commands have
687	been sent, the callback is invoked with the local communications socket	821	been sent, the callback is invoked with the local communications socket
688	as argument. At this point you can start using the socket in any way you	822	as argument. At this point you can start using the socket in any way you
689	like.	823	like.
690		824
691	The process object becomes unusable on return from this function - any
692	further method calls result in undefined behaviour.
693
694	If the communication socket isn't used, it should be closed on both sides,	825	If the communication socket isn't used, it should be closed on both sides,
695	to save on kernel memory.	826	to save on kernel memory.
696		827
697	The socket is non-blocking in the parent, and blocking in the newly	828	The socket is non-blocking in the parent, and blocking in the newly
698	created process. The close-on-exec flag is set in both.	829	created process. The close-on-exec flag is set in both.
699		830
700	Even if not used otherwise, the socket can be a good indicator for the	831	Even if not used otherwise, the socket can be a good indicator for the
701	existence of the process - if the other process exits, you get a readable	832	existence of the process - if the other process exits, you get a readable
702	event on it, because exiting the process closes the socket (if it didn't	833	event on it, because exiting the process closes the socket (if it didn't
703	create any children using fork).	834	create any children using fork).
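As a rough illustration of the "socket as liveness indicator" idea above, here is a minimal sketch that uses only core Perl (a plain C<socketpair> plus C<fork>) rather than AnyEvent::Fork itself. The helper name C<peer_gone> is hypothetical and not part of this module. In an event-driven program you would register a read watcher on the socket instead of polling with a timeout.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;
use POSIX ();

# Hypothetical helper: true if the other end of $fh was closed, i.e. the
# socket became readable and the read hit end-of-file.
# Note that it consumes one byte if the peer actually sent data.
sub peer_gone {
   my ($fh, $timeout) = @_;

   return 0 unless IO::Select->new ($fh)->can_read ($timeout);

   my $n = sysread ($fh, my $buf, 1);
   defined $n && $n == 0
}

socketpair (my $parent, my $child, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
   or die "socketpair: $!";

my $pid = fork;
defined $pid or die "fork: $!";

if ($pid == 0) {
   # child: never touches the socket, exiting closes its end implicitly
   close $parent;
   POSIX::_exit (0);
}

close $child; # the parent must not keep the child's end open

my $gone = peer_gone ($parent, 10);
waitpid $pid, 0;

print $gone ? "child exited\n" : "child still running\n";
```

The parent closes its copy of the child's end first, otherwise the socket would never signal end-of-file. The same caveat as in the text applies: if the child forks further processes that inherit the socket, the readable event is delayed until all of them have closed it.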
		835
		836	=over 4
		837
		838	=item Compatibility to L<AnyEvent::Fork::Remote>
		839
		840	If you want to write code that works with both this module and
		841	L<AnyEvent::Fork::Remote>, you need to write your code so that it assumes
		842	there are two file handles for communications, which might not be unix
		843	domain sockets. The C<run> function should start like this:
		844
		845	sub run {
		846	my ($rfh, @args) = @_; # @args is your normal arguments
		847	my $wfh = fileno $rfh ? $rfh : *STDOUT;
		848
		849	# now use $rfh for reading and $wfh for writing
		850	}
		851
		852	This checks whether the passed file handle is, in fact, the process
		853	C<STDIN> handle. If it is, then the function was invoked via
		854	L<AnyEvent::Fork::Remote>, so STDIN should be used for reading and
		855	C<STDOUT> should be used for writing.
		856
		857	In all other cases, the function was called via this module, and there is
		858	only one file handle that should be used for reading and writing.
		859
		860	=back
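To make the handle selection above concrete, here is a small self-contained sketch using only core Perl. The helper C<write_handle_for> and the echo-style C<run> body are hypothetical and exist only for illustration. The socket pair stands in for the single bidirectional handle this module passes, while a handle with file descriptor 0 stands in for the C<STDIN> case of L<AnyEvent::Fork::Remote>.

```perl
use strict;
use warnings;
use Socket;

# Hypothetical helper: pick the handle to write to, given the read handle.
# File descriptor 0 means we were handed STDIN, so replies go to STDOUT.
sub write_handle_for {
   my ($rfh) = @_;

   fileno ($rfh) ? $rfh : \*STDOUT
}

# Hypothetical run function: reads one line and echoes it back uppercased.
sub run {
   my ($rfh, @args) = @_; # @args is your normal arguments
   my $wfh = write_handle_for ($rfh);

   defined (my $line = readline $rfh)
      or return;

   syswrite $wfh, uc $line;
}

# Simulate the single-socket case within one process.
socketpair (my $ours, my $theirs, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
   or die "socketpair: $!";

syswrite $ours, "hello\n";
run ($theirs);

my $reply = readline $ours;
print $reply; # prints "HELLO"
```

Keeping the selection in one place means the rest of the function only ever deals with a read handle and a write handle, regardless of which module started it.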
704		861
705	Example: create a template for a process pool, pass a few strings, some	862	Example: create a template for a process pool, pass a few strings, some
706	file handles, then fork, pass one more string, and run some code.	863	file handles, then fork, pass one more string, and run some code.
707		864
708	my $pool = AnyEvent::Fork	865	my $pool = AnyEvent::Fork
…		…
736	=cut	893	=cut
737		894
738	sub run {	895	sub run {
739	my ($self, $func, $cb) = @_;	896	my ($self, $func, $cb) = @_;
740		897
741	$self->[4] = $cb;	898	$self->[CB] = $cb;
742	$self->_cmd (r => $func);	899	$self->_cmd (r => $func);
		900	}
		901
		902	=back
		903
		904	=head2 EXPERIMENTAL METHODS
		905
		906	These methods might go away completely or change behaviour, at any time.
		907
		908	=over 4
		909
		910	=item $proc->to_fh ($cb->($fh)) # EXPERIMENTAL, MIGHT BE REMOVED
		911
		912	Flushes all commands out to the process and then calls the callback with
		913	the communications socket.
		914
		915	The process object becomes unusable on return from this function - any
		916	further method calls result in undefined behaviour.
		917
		918	The point of this method is to give you a file handle that you can pass
		919	to another process. In that other process, you can call C<new_from_fh
		920	AnyEvent::Fork $fh> to create a new C<AnyEvent::Fork> object from it,
		921	thereby effectively passing a fork object to another process.
		922
		923	=cut
		924
		925	sub to_fh {
		926	my ($self, $cb) = @_;
		927
		928	$self->[CB] = $cb;
		929
		930	unless ($self->[WW]) {
		931	$self->[CB]->($self->[FH]);
		932	@$self = ();
		933	}
		934	}
		935
		936	=item new_from_fh AnyEvent::Fork $fh # EXPERIMENTAL, MIGHT BE REMOVED
		937
		938	Takes a file handle originally received by the C<to_fh> method and creates
		939	a new C<AnyEvent::Fork> object. The child process itself will not change in
		940	any way, i.e. it will keep all the modifications done to it before calling
		941	C<to_fh>.
		942
		943	The new object is very much like the original object, except that the
		944	C<pid> method will return C<undef> even if the process is a direct child.
		945
		946	=cut
		947
		948	sub new_from_fh {
		949	my ($class, $fh) = @_;
		950
		951	$class->_new ($fh)
743	}	952	}
744		953
745	=back	954	=back
746		955
747	=head1 PERFORMANCE	956	=head1 PERFORMANCE
…		…
757		966
758	2079 new processes per second, using manual socketpair + fork	967	2079 new processes per second, using manual socketpair + fork
759		968
760	Then I did the same thing, but instead of calling fork, I called	969	Then I did the same thing, but instead of calling fork, I called
761	AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the	970	AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
762	socket form the child to close on exit. This does the same thing as manual	971	socket from the child to close on exit. This does the same thing as manual
763	socket pair + fork, except that what is forked is the template process	972	socket pair + fork, except that what is forked is the template process
764	(2440kB), and the socket needs to be passed to the server at the other end	973	(2440kB), and the socket needs to be passed to the server at the other end
765	of the socket first.	974	of the socket first.
766		975
767	2307 new processes per second, using AnyEvent::Fork->new	976	2307 new processes per second, using AnyEvent::Fork->new
…		…
772	479 vfork+execs per second, using AnyEvent::Fork->new_exec	981	479 vfork+execs per second, using AnyEvent::Fork->new_exec
773		982
774	So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even	983	So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even
775	though it uses the same operations, but adds a lot of overhead?	984	though it uses the same operations, but adds a lot of overhead?
776		985
777	The difference is simply the process size: forking the 6MB process takes	986	The difference is simply the process size: forking the 5MB process takes
778	so much longer than forking the 2.5MB template process that the overhead	987	so much longer than forking the 2.5MB template process that the extra
779	introduced is canceled out.	988	overhead is canceled out.
780		989
781	If the benchmark process grows, the normal fork becomes even slower:	990	If the benchmark process grows, the normal fork becomes even slower:
782		991
783	1340 new processes, manual fork in a 20MB process	992	1340 new processes, manual fork of a 20MB process
784	731 new processes, manual fork in a 200MB process	993	731 new processes, manual fork of a 200MB process
785	235 new processes, manual fork in a 2000MB process	994	235 new processes, manual fork of a 2000MB process
786		995
787	What that means (to me) is that I can use this module without having a	996	What that means (to me) is that I can use this module without having a bad
788	very bad conscience because of the extra overhead required to start new	997	conscience because of the extra overhead required to start new processes.
789	processes.
790		998
791	=head1 TYPICAL PROBLEMS	999	=head1 TYPICAL PROBLEMS
792		1000
793	This section lists typical problems that remain. I hope by recognising	1001	This section lists typical problems that remain. I hope by recognising
794	them, most can be avoided.	1002	them, most can be avoided.
795		1003
796	=over 4	1004	=over 4
797		1005
798	=item "leaked" file descriptors for exec'ed processes	1006	=item leaked file descriptors for exec'ed processes
799		1007
800	POSIX systems inherit file descriptors by default when exec'ing a new	1008	POSIX systems inherit file descriptors by default when exec'ing a new
801	process. While perl itself laudably sets the close-on-exec flags on new	1009	process. While perl itself laudably sets the close-on-exec flags on new
802	file handles, most C libraries don't care, and even if all cared, it's	1010	file handles, most C libraries don't care, and even if all cared, it's
803	often not possible to set the flag in a race-free manner.	1011	often not possible to set the flag in a race-free manner.
…		…
823	libraries or the code that leaks those file descriptors.	1031	libraries or the code that leaks those file descriptors.
824		1032
825	Fortunately, most of these leaked descriptors do no harm, other than	1033	Fortunately, most of these leaked descriptors do no harm, other than
826	sitting on some resources.	1034	sitting on some resources.
827		1035
828	=item "leaked" file descriptors for fork'ed processes	1036	=item leaked file descriptors for fork'ed processes
829		1037
830	Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,	1038	Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
831	which closes file descriptors not marked for being inherited.	1039	which closes file descriptors not marked for being inherited.
832		1040
833	However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer	1041	However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
…		…
842		1050
843	The solution is to either not load these modules before use'ing	1051	The solution is to either not load these modules before use'ing
844	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay	1052	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
845	initialising them, for example, by calling C<init Gtk2> manually.	1053	initialising them, for example, by calling C<init Gtk2> manually.
846		1054
847	=item exit runs destructors	1055	=item exiting calls object destructors
848		1056
849	This only applies to users of Lc<AnyEvent::Fork:Early> and	1057	This only applies to users of L<AnyEvent::Fork::Early> and
850	L<AnyEvent::Fork::Template>.	1058	L<AnyEvent::Fork::Template>, or when initialising code creates objects
		1059	that reference external resources.
851		1060
852	When a process created by AnyEvent::Fork exits, it might do so by calling	1061	When a process created by AnyEvent::Fork exits, it might do so by calling
853	exit, or simply letting perl reach the end of the program. At which point	1062	exit, or simply letting perl reach the end of the program. At which point
854	Perl runs all destructors.	1063	Perl runs all destructors.
855		1064
…		…
874	to make it so, mostly due to the bloody broken perl that nobody seems to	1083	to make it so, mostly due to the bloody broken perl that nobody seems to
875	care about. The fork emulation is a bad joke - I have yet to see something	1084	care about. The fork emulation is a bad joke - I have yet to see something
876	useful that you can do with it without running into memory corruption	1085	useful that you can do with it without running into memory corruption
877	issues or other braindamage. Hrrrr.	1086	issues or other braindamage. Hrrrr.
878		1087
879	Cygwin perl is not supported at the moment, as it should implement fd	1088	Since fork is endlessly broken on win32 perls (it doesn't even remotely
880	passing, but doesn't, and rolling my own is hard, as cygwin doesn't	1089	work within its documented limits) and quite obviously it's not getting
881	support enough functionality to do it.	1090	improved any time soon, the best way to proceed on windows would be to
		1091	always use C<new_exec> and thus never rely on perl's fork "emulation".
		1092
		1093	Cygwin perl is not supported at the moment due to some hilarious
		1094	shortcomings of its API - see L<IO::FDPoll> for more details. If you never
		1095	use C<send_fh> and always use C<new_exec> to create processes, it should
		1096	work though.
		1097
		1098	=head1 USING AnyEvent::Fork IN SUBPROCESSES
		1099
		1100	AnyEvent::Fork itself cannot generally be used in subprocesses. As long as
		1101	only one process ever forks new processes, sharing the template processes
		1102	is possible (you could use a pipe as a lock by writing a byte into it to
		1103	unlock, and reading the byte to lock for example).
		1104
		1105	To make concurrent calls possible after fork, you should get rid of the
		1106	template and early fork processes. AnyEvent::Fork will create a new
		1107	template process as needed.
		1108
		1109	undef $AnyEvent::Fork::EARLY;
		1110	undef $AnyEvent::Fork::TEMPLATE;
		1111
		1112	It doesn't matter whether you get rid of them in the parent or child after
		1113	a fork.
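The pipe-as-lock idea mentioned in this section can be sketched with core Perl alone. This is only an illustration of the locking technique under simple assumptions (a single token byte, cooperating processes that all inherited the pipe), not code taken from this module. The names C<lock_pipe> and C<unlock_pipe> are hypothetical, and the critical section here just writes two brackets where a real program would talk to the shared template process.

```perl
use strict;
use warnings;
use POSIX ();

pipe (my $lock_r, my $lock_w) or die "pipe: $!"; # holds the lock token
pipe (my $out_r,  my $out_w)  or die "pipe: $!"; # collects the demo output

# Reading the single token byte takes the lock (blocks while it is held),
# writing it back releases the lock.
sub lock_pipe   { sysread  ($lock_r, my $token, 1) or die "lock: $!"   }
sub unlock_pipe { syswrite ($lock_w, "x")          or die "unlock: $!" }

unlock_pipe (); # the lock starts out free

my @pids;

for my $n (1 .. 3) {
   my $pid = fork;
   defined $pid or die "fork: $!";

   if ($pid == 0) {
      lock_pipe ();

      # critical section: only one process at a time gets here
      syswrite $out_w, "[";
      select undef, undef, undef, 0.05; # pretend to do some work
      syswrite $out_w, "]";

      unlock_pipe ();
      POSIX::_exit (0);
   }

   push @pids, $pid;
}

waitpid $_, 0 for @pids;
close $out_w;

my $log = do { local $/; readline $out_r };
print "$log\n"; # brackets never nest: [][][]
```

Because exactly one byte circulates in the pipe, at most one process can hold it at any time. A process that dies inside the critical section takes the token with it, so a production version would need some recovery strategy.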
882		1114
883	=head1 SEE ALSO	1115	=head1 SEE ALSO
884		1116
885	L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),	1117	L<AnyEvent::Fork::Early>, to avoid executing a perl interpreter at all
		1118	(part of this distribution).
		1119
886	L<AnyEvent::Fork::Template> (to create a process by forking the main	1120	L<AnyEvent::Fork::Template>, to create a process by forking the main
887	program at a convenient time).	1121	program at a convenient time (part of this distribution).
888		1122
889	=head1 AUTHOR	1123	L<AnyEvent::Fork::Remote>, for another way to create processes that is
		1124	mostly compatible to this module and modules building on top of it, but
		1125	works better with remote processes.
		1126
		1127	L<AnyEvent::Fork::RPC>, for simple RPC to child processes (on CPAN).
		1128
		1129	L<AnyEvent::Fork::Pool>, for simple worker process pool (on CPAN).
		1130
		1131	=head1 AUTHOR AND CONTACT INFORMATION
890		1132
891	Marc Lehmann <schmorp@schmorp.de>	1133	Marc Lehmann <schmorp@schmorp.de>
892	http://home.schmorp.de/	1134	http://software.schmorp.de/pkg/AnyEvent-Fork
893		1135
894	=cut	1136	=cut
895		1137
896	1	1138	1
897		1139
