[ViewVC] Diff of: cvs/AnyEvent-Fork/Fork.pm

Comparing AnyEvent-Fork/Fork.pm (file contents):
Revision 1.3 by root, Tue Apr 2 18:00:04 2013 UTC vs.
Revision 1.23 by root, Sat Apr 6 08:29:43 2013 UTC

…		…
1	=head1 NAME	1	=head1 NAME
2		2
3	AnyEvent::ProcessPool - manage pools of perl worker processes, exec'ed or fork'ed	3	AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
4		4
5	=head1 SYNOPSIS	5	=head1 SYNOPSIS
6		6
7	use AnyEvent::ProcessPool;	7	use AnyEvent::Fork;
		8
		9	##################################################################
		10	# create a single new process, tell it to run your worker function
		11
		12	AnyEvent::Fork
		13	->new
		14	->require ("MyModule")
		15	->run ("MyModule::worker", sub {
		16	my ($master_filehandle) = @_;
		17
		18	# now $master_filehandle is connected to the
		19	# $slave_filehandle in the new process.
		20	});
		21
		22	# MyModule::worker might look like this
		23	sub MyModule::worker {
		24	my ($slave_filehandle) = @_;
		25
		26	# now $slave_filehandle is connected to the $master_filehandle
		27	# in the original process. have fun!
		28	}
		29
		30	##################################################################
		31	# create a pool of server processes all accepting on the same socket
		32
		33	# create listener socket
		34	my $listener = ...;
		35
		36	# create a pool template, initialise it and give it the socket
		37	my $pool = AnyEvent::Fork
		38	->new
		39	->require ("Some::Stuff", "My::Server")
		40	->send_fh ($listener);
		41
		42	# now create 10 identical workers
		43	for my $id (1..10) {
		44	$pool
		45	->fork
		46	->send_arg ($id)
		47	->run ("My::Server::run");
		48	}
		49
		50	# now do other things - maybe use the filehandle provided by run
		51	# to wait for the processes to die. or whatever.
		52
		53	# My::Server::run might look like this
		54	sub My::Server::run {
		55	my ($slave, $listener, $id) = @_;
		56
		57	close $slave; # we do not use the socket, so close it to save resources
		58
		59	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
		60	# or anything we usually couldn't do in a process forked normally.
		61	while (my $socket = $listener->accept) {
		62	# do sth. with new socket
		63	}
		64	}
		65
		66	##################################################################
		67	# use AnyEvent::Fork as a faster fork+exec
		68
		69	# this runs /bin/echo hi, with stdout redirected to /tmp/log
		70	# and stderr to the communications socket. it is usually faster
		71	# than fork+exec, but still lets you prepare the environment.
		72
		73	open my $output, ">/tmp/log" or die "$!";
		74
		75	AnyEvent::Fork
		76	->new
		77	->eval ('
		78	sub run {
		79	my ($fh, $output, @cmd) = @_;
		80
		81	# perl will clear close-on-exec on STDOUT/STDERR
		82	open STDOUT, ">&", $output or die;
		83	open STDERR, ">&", $fh or die;
		84
		85	exec @cmd;
		86	}
		87	')
		88	->send_fh ($output)
		89	->send_arg ("/bin/echo", "hi")
		90	->run ("run", my $cv = AE::cv);
		91
		92	my $stderr = $cv->recv;
8		93
9	=head1 DESCRIPTION	94	=head1 DESCRIPTION
10		95
11	This module allows you to create single worker processes but also worker	96	This module allows you to create new processes, without actually forking
12	pool that share memory, by forking from the main program, or exec'ing new	97	them from your current process (avoiding the problems of forking), but
13	perl interpreters from a module.	98	preserving most of the advantages of fork.
14		99
15	You create a new processes in a pool by specifying a function to call	100	It can be used to create new worker processes or new independent
16	with any combination of string values and file handles.	101	subprocesses for short- and long-running jobs, process pools (e.g. for use
		102	in pre-forked servers) but also to spawn new external processes (such as
		103	CGI scripts from a web server), which can be faster (and more well behaved)
		104	than using fork+exec in big processes.
17		105
18	A pool can have initialisation code which is executed before forking. The	106	Special care has been taken to make this module useful from other modules,
19	initialisation code is only executed once and the resulting process is	107	while still supporting specialised environments such as L<App::Staticperl>
20	cached, to be used as a template.	108	or L<PAR::Packer>.
21		109
22	Pools without such initialisation code don't cache an extra process.	110	=head1 WHAT THIS MODULE IS NOT
		111
		112	This module only creates processes and lets you pass file handles and
		113	strings to it, and run perl code. It does not implement any kind of RPC -
		114	there is no back channel from the process back to you, and there is no RPC
		115	or message passing going on.
		116
		117	If you need some form of RPC, you can either implement it yourself
		118	in whatever way you like, use some message-passing module such
		119	as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use
		120	L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
		121	and so on.
23		122
24	=head1 PROBLEM STATEMENT	123	=head1 PROBLEM STATEMENT
25		124
26	There are two ways to implement parallel processing on UNIX like operating	125	There are two ways to implement parallel processing on UNIX like operating
27	systems - fork and process, and fork+exec and process. They have different	126	systems - fork and process, and fork+exec and process. They have different
39	or fork+exec instead.	138	or fork+exec instead.
40		139
41	=item Forking usually creates a copy-on-write copy of the parent	140	=item Forking usually creates a copy-on-write copy of the parent
42	process. Memory (for example, modules or data files that have been	141	process. Memory (for example, modules or data files that have been
43	will not take additional memory). When exec'ing a new process, modules	142	will not take additional memory). When exec'ing a new process, modules
44	and data files might need to be loaded again, at extra cpu and memory	143	and data files might need to be loaded again, at extra CPU and memory
45	cost. Likewise when forking, all data structures are copied as well - if	144	cost. Likewise when forking, all data structures are copied as well - if
46	the program frees them and replaces them by new data, the child processes	145	the program frees them and replaces them by new data, the child processes
47	will retain the memory even if it isn't used.	146	will retain the memory even if it isn't used.
48		147
49	This module allows the main program to do a controlled fork, and allows	148	This module allows the main program to do a controlled fork, and allows
…		…
61	as template, and also tries hard to identify the correct path to the perl	160	as template, and also tries hard to identify the correct path to the perl
62	interpreter. With a cooperative main program, exec'ing the interpreter	161	interpreter. With a cooperative main program, exec'ing the interpreter
63	might not even be necessary.	162	might not even be necessary.
64		163
65	=item Forking might be impossible when a program is running. For example,	164	=item Forking might be impossible when a program is running. For example,
66	POSIX makes it almost impossible to fork from a multithreaded program and	165	POSIX makes it almost impossible to fork from a multi-threaded program and
67	do anything useful in the child - strictly speaking, if your perl program	166	do anything useful in the child - strictly speaking, if your perl program
68	uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),	167	uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),
69	you cannot call fork on the perl level anymore, at all.	168	you cannot call fork on the perl level anymore, at all.
70		169
71	This module can safely fork helper processes at any time, by caling	170	This module can safely fork helper processes at any time, by calling
72	fork+exec in C, in a POSIX-compatible way.	171	fork+exec in C, in a POSIX-compatible way.
73		172
74	=item Parallel processing with fork might be inconvenient or difficult	173	=item Parallel processing with fork might be inconvenient or difficult
75	to implement. For example, when a program uses an event loop and creates	174	to implement. For example, when a program uses an event loop and creates
76	watchers it becomes very hard to use the event loop from a child	175	watchers it becomes very hard to use the event loop from a child
…		…
108	needed the first time. Forking from this process shares the memory used	207	needed the first time. Forking from this process shares the memory used
109	for the perl interpreter with the new process, but loading modules takes	208	for the perl interpreter with the new process, but loading modules takes
110	time, and the memory is not shared with anything else.	209	time, and the memory is not shared with anything else.
111		210
112	This is ideal for when you only need one extra process of a kind, with the	211	This is ideal for when you only need one extra process of a kind, with the
113	option of starting and stipping it on demand.	212	option of starting and stopping it on demand.
		213
		214	Example:
		215
		216	AnyEvent::Fork
		217	->new
		218	->require ("Some::Module")
		219	->run ("Some::Module::run", sub {
		220	my ($fork_fh) = @_;
		221	});
114		222
115	=item fork a new template process, load code, then fork processes off of	223	=item fork a new template process, load code, then fork processes off of
116	it and run the code	224	it and run the code
117		225
118	When you need to have a bunch of processes that all execute the same (or	226	When you need to have a bunch of processes that all execute the same (or
…		…
124	modules you loaded) is shared between the processes, and each new process	232	modules you loaded) is shared between the processes, and each new process
125	consumes relatively little memory of its own.	233	consumes relatively little memory of its own.
126		234
127	The disadvantage of this approach is that you need to create a template	235	The disadvantage of this approach is that you need to create a template
128	process for the sole purpose of forking new processes from it, but if you	236	process for the sole purpose of forking new processes from it, but if you
129	only need a fixed number of proceses you can create them, and then destroy	237	only need a fixed number of processes you can create them, and then destroy
130	the template process.	238	the template process.
		239
		240	Example:
		241
		242	my $template = AnyEvent::Fork->new->require ("Some::Module");
		243
		244	for (1..10) {
		245	$template->fork->run ("Some::Module::run", sub {
		246	my ($fork_fh) = @_;
		247	});
		248	}
		249
		250	# at this point, you can keep $template around to fork new processes
		251	# later, or you can destroy it, which causes it to vanish.
131		252
132	=item execute a new perl interpreter, load some code, run it	253	=item execute a new perl interpreter, load some code, run it
133		254
134	This is relatively slow, and doesn't allow you to share memory between	255	This is relatively slow, and doesn't allow you to share memory between
135	multiple processes.	256	multiple processes.
…		…
137	The only advantage is that you don't have to have a template process	258	The only advantage is that you don't have to have a template process
138	hanging around all the time to fork off some new processes, which might be	259	hanging around all the time to fork off some new processes, which might be
139	an advantage when there are long time spans where no extra processes are	260	an advantage when there are long time spans where no extra processes are
140	needed.	261	needed.
141		262
		263	Example:
		264
		265	AnyEvent::Fork
		266	->new_exec
		267	->require ("Some::Module")
		268	->run ("Some::Module::run", sub {
		269	my ($fork_fh) = @_;
		270	});
		271
142	=back	272	=back
143		273
144	=head1 FUNCTIONS	274	=head1 FUNCTIONS
145		275
146	=over 4	276	=over 4
147		277
148	=cut	278	=cut
149		279
150	package AnyEvent::ProcessPool;	280	package AnyEvent::Fork;
151		281
152	use common::sense;	282	use common::sense;
153		283
154	use Socket ();	284	use Errno ();
155		285
156	use Proc::FastSpawn;
157	use AnyEvent;	286	use AnyEvent;
158	use AnyEvent::ProcessPool::Util;
159	use AnyEvent::Util ();	287	use AnyEvent::Util ();
160		288
161	BEGIN {	289	use IO::FDPass;
162	# require Exporter;
163	}
164		290
		291	our $VERSION = 0.5;
		292
		293	our $PERL; # the path to the perl interpreter, deduced with various forms of magic
		294
165	=item my $pool = new AnyEvent::ProcessPool key => value...	295	=item my $pool = new AnyEvent::Fork key => value...
166		296
167	Create a new process pool. The following named parameters are supported:	297	Create a new process pool. The following named parameters are supported:
168		298
169	=over 4	299	=over 4
170		300
171	=back	301	=back
172		302
173	=cut	303	=cut
174		304
		305	# the early fork template process
		306	our $EARLY;
		307
175	# the template process	308	# the empty template process
176	our $template;	309	our $TEMPLATE;
177		310
178	sub _queue {	311	sub _cmd {
179	my ($pid, $fh) = @_;	312	my $self = shift;
180		313
181	[	314	# ideally, we would want to use "a (w/a)*" as format string, but perl
		315	# versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack
		316	# it.
		317	push @{ $self->[2] }, pack "a L/a*", $_[0], $_[1];
		318
		319	$self->[3] ||= AE::io $self->[1], 1, sub {
		320	do {
		321	# send the next "thing" in the queue - either a reference to an fh,
		322	# or a plain string.
		323
		324	if (ref $self->[2][0]) {
		325	# send fh
		326	unless (IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }) {
		327	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
		328	undef $self->[3];
		329	die "AnyEvent::Fork: file descriptor send failure: $!";
		330	}
		331
		332	shift @{ $self->[2] };
		333
		334	} else {
		335	# send string
		336	my $len = syswrite $self->[1], $self->[2][0];
		337
		338	unless ($len) {
		339	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
		340	undef $self->[3];
		341	die "AnyEvent::Fork: command write failure: $!";
		342	}
		343
		344	substr $self->[2][0], 0, $len, "";
		345	shift @{ $self->[2] } unless length $self->[2][0];
		346	}
		347	} while @{ $self->[2] };
		348
		349	# everything written
		350	undef $self->[3];
		351
		352	# invoke run callback, if any
		353	$self->[4]->($self->[1]) if $self->[4];
		354	};
		355
		356	() # make sure we don't leak the watcher
		357	}
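The framing used by C<_cmd> above can be tried in isolation. The following is a minimal sketch, not code from the module: C<frame_cmd> and C<parse_cmd> are hypothetical helper names, and the receiving side shown here is only an assumption about how a peer could split the stream again. It relies solely on the C<"a L/a*"> pack template visible in the diff (one command byte, a native 32-bit length, then the payload).

```perl
use strict;
use warnings;

# Frame one command the way the pack template in _cmd does:
# a single command byte, a native 32-bit length, then the payload.
sub frame_cmd {
    my ($type, $payload) = @_;
    pack "a L/a*", $type, $payload;
}

# Hypothetical receiver-side helper (an assumption, not from the module):
# split one complete frame off the front of a buffer, or return an
# empty list if the buffer does not yet hold a complete frame.
sub parse_cmd {
    my ($buf) = @_;
    return if length ($$buf) < 5;             # 1 type byte + 4 length bytes
    my ($type, $len) = unpack "a L", $$buf;
    return if length ($$buf) < 5 + $len;
    my $payload = substr $$buf, 5, $len;
    substr $$buf, 0, 5 + $len, "";            # consume the frame
    ($type, $payload);
}

# two queued commands arriving back to back in one buffer
my $buf = frame_cmd ("e", "print 1") . frame_cmd ("r", "main::run");

while (my ($type, $payload) = parse_cmd (\$buf)) {
    print "$type: $payload\n";
}
```

Because the length travels in native byte order, this kind of framing only suits two processes on the same machine, which is the case for a socketpair.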
		358
		359	sub _new {
		360	my ($self, $fh, $pid) = @_;
		361
		362	AnyEvent::Util::fh_nonblocking $fh, 1;
		363
		364	$self = bless [
182	$pid,	365	$pid,
183	$fh,	366	$fh,
184	[],	367	[], # write queue - strings or fd's
185	undef	368	undef, # AE watcher
186	]	369	], $self;
187	}
188		370
189	sub queue_cmd {	371	$self
190	my $queue = shift;
191
192	push @{ $queue->[2] }, pack "N/a", pack "a (w/a)*", @_;
193
194	$queue->[3] ||= AE::io $queue->[1], 1, sub {
195	if (ref $queue->[2][0]) {
196	AnyEvent::ProcessPool::Util::fd_send fileno $queue->[1], fileno ${ $queue->[2][0] }
197	and shift @{ $queue->[2] };
198	} else {
199	my $len = syswrite $queue->[1], $queue->[2][0]
200	or do { undef $queue->[3]; die "AnyEvent::ProcessPool::queue write failure: $!" };
201	substr $queue->[2][0], 0, $len, "";
202	shift @{ $queue->[2] } unless length $queue->[2][0];
203	}
204
205	undef $queue->[3] unless @{ $queue->[2] };
206	};
207	}	372	}
208		373
209	sub run_template {	374	# fork template from current process, used by AnyEvent::Fork::Early/Template
210	return if $template;	375	sub _new_fork {
211
212	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;	376	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
213	AnyEvent::Util::fh_nonblocking $fh, 1;	377	my $parent = $$;
214	fd_inherit fileno $slave;
215		378
216	my %env = %ENV;	379	my $pid = fork;
217	$env{PERL5LIB} = join ":", grep !ref, @INC;
218		380
219	my $pid = spawn	381	if ($pid eq 0) {
220	$^X,	382	require AnyEvent::Fork::Serve;
221	["perl", "-MAnyEvent::ProcessPool::Serve", "-e", "AnyEvent::ProcessPool::Serve::me", fileno $slave],	383	$AnyEvent::Fork::Serve::OWNER = $parent;
222	[map "$_=$env{$_}", keys %env],	384	close $fh;
223	or die "unable to spawn AnyEvent::ProcessPool server: $!";	385	$0 = "$_[1] of $parent";
		386	$SIG{CHLD} = 'IGNORE';
		387	AnyEvent::Fork::Serve::serve ($slave);
		388	exit 0;
		389	} elsif (!$pid) {
		390	die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
		391	}
224		392
225	close $slave;	393	AnyEvent::Fork->_new ($fh, $pid)
226
227	$template = _queue $pid, $fh;
228
229	my ($a, $b) = AnyEvent::Util::portable_socketpair;
230
231	queue_cmd $template, "Iabc";
232	push @{ $template->[2] }, \$b;
233
234	use Coro::AnyEvent; Coro::AnyEvent::sleep 1;
235	undef $b;
236	die "x" . <$a>;
237	}	394	}
		395
		396	=item my $proc = new AnyEvent::Fork
		397
		398	Create a new "empty" perl interpreter process and returns its process
		399	object for further manipulation.
		400
		401	The new process is forked from a template process that is kept around
		402	for this purpose. When it doesn't exist yet, it is created by a call to
		403	C<new_exec> and kept around for future calls.
		404
		405	When the process object is destroyed, it will release the file handle
		406	that connects it with the new process. When the new process has not yet
		407	called C<run>, then the process will exit. Otherwise, what happens depends
		408	entirely on the code that is executed.
		409
		410	=cut
238		411
239	sub new {	412	sub new {
240	my $class = shift;	413	my $class = shift;
241		414
242	my $self = bless {	415	$TEMPLATE ||= $class->new_exec;
243	@_	416	$TEMPLATE->fork
244	}, $class;	417	}
245		418
246	run_template;	419	=item $new_proc = $proc->fork
		420
		421	Forks C<$proc>, creating a new process, and returns the process object
		422	of the new process.
		423
		424	If any of the C<send_> functions have been called before fork, then they
		425	will be cloned in the child. For example, in a pre-forked server, you
		426	might C<send_fh> the listening socket into the template process, and then
		427	keep calling C<fork> and C<run>.
		428
		429	=cut
		430
		431	sub fork {
		432	my ($self) = @_;
		433
		434	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
		435
		436	$self->send_fh ($slave);
		437	$self->_cmd ("f");
		438
		439	AnyEvent::Fork->_new ($fh)
		440	}
		441
		442	=item my $proc = new_exec AnyEvent::Fork
		443
		444	Create a new "empty" perl interpreter process and returns its process
		445	object for further manipulation.
		446
		447	Unlike the C<new> method, this method I<always> spawns a new perl process
		448	(except in some cases, see L<AnyEvent::Fork::Early> for details). This
		449	reduces the amount of memory sharing that is possible, and is also slower.
		450
		451	You should use C<new> whenever possible, except when having a template
		452	process around is unacceptable.
		453
		454	The path to the perl interpreter is divined using various methods - first
		455	C<$^X> is investigated to see if the path ends with something that sounds
		456	as if it were the perl interpreter. Failing this, the module falls back to
		457	using C<$Config::Config{perlpath}>.
		458
		459	=cut
		460
		461	sub new_exec {
		462	my ($self) = @_;
		463
		464	return $EARLY->fork
		465	if $EARLY;
		466
		467	# first find path of perl
		468	my $perl = $^X;
		469
		470	# first we try $^X, but the path must be absolute (always on win32), and end in sth.
		471	# that looks like perl. this obviously only works for posix and win32
		472	unless (
		473	($^O eq "MSWin32" || $perl =~ m%^/%)
		474	&& $perl =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
		475	) {
		476	# if it doesn't look perlish enough, try Config
		477	require Config;
		478	$perl = $Config::Config{perlpath};
		479	$perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;
		480	}
		481
		482	require Proc::FastSpawn;
		483
		484	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
		485	Proc::FastSpawn::fd_inherit (fileno $slave);
		486
		487	# new fh's should always be set cloexec (due to $^F),
		488	# but hey, not on win32, so we always clear the inherit flag.
		489	Proc::FastSpawn::fd_inherit (fileno $fh, 0);
		490
		491	# quick. also doesn't work in win32. of course. what did you expect
		492	#local $ENV{PERL5LIB} = join ":", grep !ref, @INC;
		493	my %env = %ENV;
		494	$env{PERL5LIB} = join +($^O eq "MSWin32" ? ";" : ":"), grep !ref, @INC;
		495
		496	my $pid = Proc::FastSpawn::spawn (
		497	$perl,
		498	["perl", "-MAnyEvent::Fork::Serve", "-e", "AnyEvent::Fork::Serve::me", fileno $slave, $$],
		499	[map "$_=$env{$_}", keys %env],
		500	) or die "unable to spawn AnyEvent::Fork server: $!";
		501
		502	$self->_new ($fh, $pid)
		503	}
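The interpreter-path heuristic in C<new_exec> above can be explored on its own. This is a sketch under stated assumptions: C<looks_like_perl> is a hypothetical helper name, the operating system name is passed in as a parameter purely so the check can be exercised anywhere, and the sample paths are made up. The two patterns are copied from the diff.

```perl
use strict;
use warnings;
use Config ();

# Hypothetical helper mirroring the check in new_exec: accept the
# candidate only if it is absolute (taken for granted on win32) and
# ends in something that looks like a perl binary name.
sub looks_like_perl {
    my ($candidate, $os) = @_;

    ($os eq "MSWin32" || $candidate =~ m%^/%)
        && $candidate =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
        ? 1 : 0
}

# made-up sample values for $^X
for my $path ("/usr/bin/perl", "/opt/perl/bin/perl5.16.3", "perl", "/usr/sbin/apache2") {
    printf "%-26s %s\n", $path,
        looks_like_perl ($path, "linux") ? "use as is" : "fall back to Config";
}

# the fallback value the module would consult instead
print "Config fallback: $Config::Config{perlpath}\n";
```

A relative name such as plain C<perl> is rejected on purpose, as is an absolute path to an embedding program, so in both cases the C<Config> value would be used.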
		504
		505	=item $pid = $proc->pid
		506
		507	Returns the process id of the process I<iff it is a direct child of the
		508	process> running AnyEvent::Fork, and C<undef> otherwise.
		509
		510	Normally, only processes created via C<< AnyEvent::Fork->new_exec >> and
		511	L<AnyEvent::Fork::Template> are direct children, and you are responsible
		512	to clean up their zombies when they die.
		513
		514	All other processes are not direct children, and will be cleaned up by
		515	AnyEvent::Fork.
		516
		517	=cut
		518
		519	sub pid {
		520	$_[0][0]
		521	}
		522
		523	=item $proc = $proc->eval ($perlcode, @args)
		524
		525	Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to
		526	the strings specified by C<@args>, in the "main" package.
		527
		528	This call is meant to do any custom initialisation that might be required
		529	(for example, the C<require> method uses it). It's not supposed to be used
		530	to completely take over the process, use C<run> for that.
		531
		532	The code will usually be executed after this call returns, and there is no
		533	way to pass anything back to the calling process. Any evaluation errors
		534	will be reported to stderr and cause the process to exit.
		535
		536	If you want to execute some code to take over the process (see the
		537	"fork+exec" example in the SYNOPSIS), you should compile a function via
		538	C<eval> first, and then call it via C<run>. This also gives you access to
		539	any arguments passed via the C<send_xxx> methods, such as file handles.
		540
		541	Returns the process object for easy chaining of method calls.
		542
		543	=cut
		544
		545	sub eval {
		546	my ($self, $code, @args) = @_;
		547
		548	$self->_cmd (e => pack "(w/a)", $code, @args);
247		549
248	$self	550	$self
249	}	551	}
250		552
		553	=item $proc = $proc->require ($module, ...)
		554
		555	Tries to load the given module(s) into the process
		556
		557	Returns the process object for easy chaining of method calls.
		558
		559	=cut
		560
		561	sub require {
		562	my ($self, @modules) = @_;
		563
		564	s%::%/%g for @modules;
		565	$self->eval ('require "$_.pm" for @_', @modules);
		566
		567	$self
		568	}
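The C<require> method above rewrites module names into file names before sending them, because a string C<require> expects a relative path rather than a package name. A small local sketch of that conversion follows; C<module_file> is a hypothetical helper name, and C<File::Spec> is used only because it ships with perl.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a module name into the relative file name
# that a string require expects (Some::Module becomes Some/Module.pm).
sub module_file {
    my ($module) = @_;
    (my $file = $module) =~ s%::%/%g;
    "$file.pm"
}

print module_file ("Some::Module"), "\n";

# a string require searches @INC for the file, like the remote side does
my $file = module_file ("File::Spec");
require $file;

print "loaded $file from $INC{$file}\n";
```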
		569
		570	=item $proc = $proc->send_fh ($handle, ...)
		571
		572	Send one or more file handles (I<not> file descriptors) to the process,
		573	to prepare a call to C<run>.
		574
		575	The process object keeps a reference to the handles until this is done,
		576	so you must not explicitly close the handles. This is most easily
		577	accomplished by simply not storing the file handles anywhere after passing
		578	them to this method.
		579
		580	Returns the process object for easy chaining of method calls.
		581
		582	Example: pass a file handle to a process, and release it without
		583	closing. It will be closed automatically when it is no longer used.
		584
		585	$proc->send_fh ($my_fh);
		586	undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
		587
		588	=cut
		589
		590	sub send_fh {
		591	my ($self, @fh) = @_;
		592
		593	for my $fh (@fh) {
		594	$self->_cmd ("h");
		595	push @{ $self->[2] }, \$fh;
		596	}
		597
		598	$self
		599	}
		600
		601	=item $proc = $proc->send_arg ($string, ...)
		602
		603	Send one or more argument strings to the process, to prepare a call to
		604	C<run>. The strings can be any octet string.
		605
		606	The protocol is optimised to pass a moderate number of relatively short
		607	strings - while you can pass up to 4GB of data in one go, this is more
		608	meant to pass some ID information or other startup info, not big chunks of
		609	data.
		610
		611	Returns the process object for easy chaining of method calls.
		612
		613	=cut
		614
		615	sub send_arg {
		616	my ($self, @arg) = @_;
		617
		618	$self->_cmd (a => pack "(w/a)", @arg);
		619
		620	$self
		621	}
		622
		623	=item $proc->run ($func, $cb->($fh))
		624
		625	Enter the function specified by the function name in C<$func> in the
		626	process. The function is called with the communication socket as first
		627	argument, followed by all file handles and string arguments sent earlier
		628	via C<send_fh> and C<send_arg> methods, in the order they were called.
		629
		630	The function name should be fully qualified, but if it isn't, it will be
		631	looked up in the main package.
		632
		633	If the called function returns, doesn't exist, or any error occurs, the
		634	process exits.
		635
		636	Preparing the process is done in the background - when all commands have
		637	been sent, the callback is invoked with the local communications socket
		638	as argument. At this point you can start using the socket in any way you
		639	like.
		640
		641	The process object becomes unusable on return from this function - any
		642	further method calls result in undefined behaviour.
		643
		644	If the communication socket isn't used, it should be closed on both sides,
		645	to save on kernel memory.
		646
		647	The socket is non-blocking in the parent, and blocking in the newly
		648	created process. The close-on-exec flag is set in both.
		649
		650	Even if not used otherwise, the socket can be a good indicator for the
		651	existence of the process - if the other process exits, you get a readable
		652	event on it, because exiting the process closes the socket (if it didn't
		653	create any children using fork).
		654
		655	Example: create a template for a process pool, pass a few strings, some
		656	file handles, then fork, pass one more string, and run some code.
		657
		658	my $pool = AnyEvent::Fork
		659	->new
		660	->send_arg ("str1", "str2")
		661	->send_fh ($fh1, $fh2);
		662
		663	for (1..2) {
		664	$pool
		665	->fork
		666	->send_arg ("str3")
		667	->run ("Some::function", sub {
		668	my ($fh) = @_;
		669
		670	# fh is nonblocking, but we trust that the OS can accept these
		671	# few octets anyway.
		672	syswrite $fh, "hi #$_\n";
		673
		674	# $fh is being closed here, as we don't store it anywhere
		675	});
		676	}
		677
		678	# Some::function might look like this - all parameters passed before fork
		679	# and after will be passed, in order, after the communications socket.
		680	sub Some::function {
		681	my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
		682
		683	print scalar <$fh>; # prints "hi #1\n" and "hi #2\n" in any order
		684	}
		685
		686	=cut
		687
		688	sub run {
		689	my ($self, $func, $cb) = @_;
		690
		691	$self->[4] = $cb;
		692	$self->_cmd (r => $func);
		693	}
		694
251	=back	695	=back
		696
		697	=head1 PERFORMANCE
		698
		699	Now for some unscientific benchmark numbers (all done on an amd64
		700	GNU/Linux box). These are intended to give you an idea of the relative
		701	performance you can expect, they are not meant to be absolute performance
		702	numbers.
		703
		704	OK, so, I ran a simple benchmark that creates a socket pair, forks, calls
		705	exit in the child and waits for the socket to close in the parent. I did
		706	load AnyEvent, EV and AnyEvent::Fork, for a total process size of 5100kB.
		707
		708	2079 new processes per second, using manual socketpair + fork
		709
		710	Then I did the same thing, but instead of calling fork, I called
		711	AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
		712	socket from the child to close on exit. This does the same thing as manual
		713	socket pair + fork, except that what is forked is the template process
		714	(2440kB), and the socket needs to be passed to the server at the other end
		715	of the socket first.
		716
		717	2307 new processes per second, using AnyEvent::Fork->new
		718
		719	And finally, using C<new_exec> instead of C<new>, using vforks+execs to exec
		720	a new perl interpreter and compile the small server each time, I get:
		721
		722	479 vfork+execs per second, using AnyEvent::Fork->new_exec
		723
		724	So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even
		725	though it uses the same operations, but adds a lot of overhead?
		726
		727	The difference is simply the process size: forking the 6MB process takes
		728	so much longer than forking the 2.5MB template process that the overhead
		729	introduced is canceled out.
		730
		731	If the benchmark process grows, the normal fork becomes even slower:
		732
		733	1340 new processes per second, manual fork in a 20MB process
		734	731 new processes per second, manual fork in a 200MB process
		735	235 new processes per second, manual fork in a 2000MB process
		736
		737	What that means (to me) is that I can use this module without having a
		738	very bad conscience because of the extra overhead required to start new
		739	processes.
		740
		741	=head1 TYPICAL PROBLEMS
		742
		743	This section lists typical problems that remain. I hope by recognising
		744	them, most can be avoided.
		745
		746	=over 4
		747
		748	=item "leaked" file descriptors for exec'ed processes
		749
		750	POSIX systems inherit file descriptors by default when exec'ing a new
		751	process. While perl itself laudably sets the close-on-exec flags on new
		752	file handles, most C libraries don't care, and even if all cared, it's
		753	often not possible to set the flag in a race-free manner.
		754
		755	That means some file descriptors can leak through. And since it isn't
		756	possible to know which file descriptors are "good" and "necessary" (or
		757	even to know which file descriptors are open), there is no good way to
		758	close the ones that might harm.
		759
		760	As an example of what "harm" can be done consider a web server that
		761	accepts connections and afterwards some module uses AnyEvent::Fork for the
		762	first time, causing it to fork and exec a new process, which might inherit
		763	the network socket. When the server closes the socket, it is still open
		764	in the child (which doesn't even know that) and the client might conclude
		765	that the connection is still fine.
		766
		767	For the main program, there are multiple remedies available -
		768	L<AnyEvent::Fork::Early> is one, creating a process early and not using
		769	C<new_exec> is another, as in both cases, the first process can be exec'ed
		770	well before many random file descriptors are open.
		771
		772	In general, the solution for these kinds of problems is to fix the
		773	libraries or the code that leaks those file descriptors.
		774
		775	Fortunately, most of these leaked descriptors do no harm, other than
		776	sitting on some resources.
		777
		778	=item "leaked" file descriptors for fork'ed processes
		779
		780	Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
		781	which closes file descriptors not marked for being inherited.
		782
		783	However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
		784	a way to create these processes by forking, and this leaks more file
		785	descriptors than exec'ing them, as there is no way to mark descriptors as
		786	"close on fork".
		787
		788	An example would be modules like L<EV>, L<IO::AIO> or L<Gtk2>. Both create
		789	pipes for internal uses, and L<Gtk2> might open a connection to the X
		790	server. L<EV> and L<IO::AIO> can deal with fork, but Gtk2 might have
		791	trouble with a fork.
		792
		793	The solution is to either not load these modules before use'ing
		794	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
		795	initialising them, for example, by calling C<init Gtk2> manually.
		796
		797	=item exit runs destructors
		798
		799	This only applies to users of L<AnyEvent::Fork::Early> and
		800	L<AnyEvent::Fork::Template>.
		801
		802	When a process created by AnyEvent::Fork exits, it might do so by calling
		803	exit, or simply letting perl reach the end of the program. At which point
		804	Perl runs all destructors.
		805
		806	Not all destructors are fork-safe - for example, an object that represents
		807	the connection to an X display might tell the X server to free resources,
		808	which is inconvenient when the "real" object in the parent still needs to
		809	use them.
		810
		811	This is obviously not a problem for L<AnyEvent::Fork::Early>, as you used
		812	it as the very first thing, right?
		813
		814	It is a problem for L<AnyEvent::Fork::Template> though - and the solution
		815	is to not create objects with nontrivial destructors that might have an
		816	effect outside of Perl.
		817
		818	=back
		819
		820	=head1 PORTABILITY NOTES
		821
		822	Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a nop,
		823	and ::Template is not going to work), and it cost a lot of blood and sweat
		824	to make it so, mostly due to the bloody broken perl that nobody seems to
		825	care about. The fork emulation is a bad joke - I have yet to see something
		826	useful that you can do with it without running into memory corruption
		827	issues or other braindamage. Hrrrr.
		828
		829	Cygwin perl is not supported at the moment, as it should implement fd
		830	passing, but doesn't, and rolling my own is hard, as cygwin doesn't
		831	support enough functionality to do it.
		832
		833	=head1 SEE ALSO
		834
		835	L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),
		836	L<AnyEvent::Fork::Template> (to create a process by forking the main
		837	program at a convenient time).
252		838
253	=head1 AUTHOR	839	=head1 AUTHOR
254		840
255	Marc Lehmann <schmorp@schmorp.de>	841	Marc Lehmann <schmorp@schmorp.de>
256	http://home.schmorp.de/	842	http://home.schmorp.de/
