[ViewVC] Diff of: cvs/AnyEvent-Fork/Fork.pm

Comparing AnyEvent-Fork/Fork.pm (file contents):
Revision 1.1 by root, Sun Mar 31 03:21:27 2013 UTC vs.
Revision 1.18 by root, Sat Apr 6 01:33:56 2013 UTC

…		…
1	=head1 NAME	1	=head1 NAME
2		2
3	AnyEvent::ProcessPool - manage pools of perl worker processes, exec'ed or fork'ed	3	AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
4		4
5	=head1 SYNOPSIS	5	=head1 SYNOPSIS
6		6
7	use AnyEvent::ProcessPool;	7	use AnyEvent::Fork;
		8
		9	##################################################################
		10	# create a single new process, tell it to run your worker function
		11
		12	AnyEvent::Fork
		13	->new
		14	->require ("MyModule")
		15	->run ("MyModule::worker", sub {
		16	my ($master_filehandle) = @_;
		17
		18	# now $master_filehandle is connected to the
		19	# $slave_filehandle in the new process.
		20	});
		21
		22	# MyModule::worker might look like this
		23	sub MyModule::worker {
		24	my ($slave_filehandle) = @_;
		25
		26	# now $slave_filehandle is connected to the $master_filehandle
		27	# in the original process. have fun!
		28	}
		29
		30	##################################################################
		31	# create a pool of server processes all accepting on the same socket
		32
		33	# create listener socket
		34	my $listener = ...;
		35
		36	# create a pool template, initialise it and give it the socket
		37	my $pool = AnyEvent::Fork
		38	->new
		39	->require ("Some::Stuff", "My::Server")
		40	->send_fh ($listener);
		41
		42	# now create 10 identical workers
		43	for my $id (1..10) {
		44	$pool
		45	->fork
		46	->send_arg ($id)
		47	->run ("My::Server::run");
		48	}
		49
		50	# now do other things - maybe use the filehandle provided by run
		51	# to wait for the processes to die. or whatever.
		52
		53	# My::Server::run might look like this
		54	sub My::Server::run {
		55	my ($slave, $listener, $id) = @_;
		56
		57	close $slave; # we do not use the socket, so close it to save resources
		58
		59	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
		60	# or anything we usually couldn't do in a process forked normally.
		61	while (my $socket = $listener->accept) {
		62	# do sth. with new socket
		63	}
		64	}
8		65
9	=head1 DESCRIPTION	66	=head1 DESCRIPTION
10		67
11	This module allows you to create single worker processes but also worker	68	This module allows you to create new processes, without actually forking
12	pool that share memory, by forking from the main program, or exec'ing new	69	them from your current process (avoiding the problems of forking), but
13	perl interpreters from a module.	70	preserving most of the advantages of fork.
14		71
15	You create a new processes in a pool by specifying a function to call	72	It can be used to create new worker processes or new independent
16	with any combination of string values and file handles.	73	subprocesses for short- and long-running jobs, process pools (e.g. for use
		74	in pre-forked servers) but also to spawn new external processes (such as
		75	CGI scripts from a web server), which can be faster (and more well behaved)
		76	than using fork+exec in big processes.
17		77
18	A pool can have initialisation code which is executed before forking. The	78	Special care has been taken to make this module useful from other modules,
19	initialisation code is only executed once and the resulting process is	79	while still supporting specialised environments such as L<App::Staticperl>
20	cached, to be used as a template.	80	or L<PAR::Packer>.
21		81
22	Pools without such initialisation code don't cache an extra process.	82	=head1 WHAT THIS MODULE IS NOT
		83
		84	This module only creates processes and lets you pass file handles and
		85	strings to it, and run perl code. It does not implement any kind of RPC -
		86	there is no back channel from the process back to you, and there is no RPC
		87	or message passing going on.
		88
		89	If you need some form of RPC, you can either implement it yourself
		90	in whatever way you like, use some message-passing module such
		91	as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use
		92	L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
		93	and so on.
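
As an illustration of the "implement it yourself" option, here is a minimal sketch of length-prefixed JSON messages over a socket pair. The helper names C<send_msg>, C<read_exact> and C<recv_msg> are hypothetical and not part of AnyEvent::Fork, and the sketch uses blocking core-perl I/O rather than L<AnyEvent::Handle>; it only shows the framing idea.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use JSON::PP ();

# hypothetical helper: write one message as 32-bit length + JSON payload
sub send_msg {
   my ($fh, $data) = @_;

   my $json  = JSON::PP->new->utf8->canonical->encode ($data);
   my $frame = pack "N/a", $json;

   my $off = 0;
   while ($off < length ($frame)) {
      my $len = syswrite $fh, $frame, length ($frame) - $off, $off;
      defined $len or die "write failure: $!";
      $off += $len;
   }
}

# hypothetical helper: read exactly $want octets or die
sub read_exact {
   my ($fh, $want) = @_;

   my $buf = "";
   while (length ($buf) < $want) {
      my $len = sysread $fh, $buf, $want - length ($buf), length ($buf);
      defined $len or die "read failure: $!";
      $len or die "unexpected end of file";
   }

   $buf
}

# hypothetical helper: read one framed message and decode it
sub recv_msg {
   my ($fh) = @_;

   my $len = unpack "N", read_exact ($fh, 4);
   JSON::PP->new->utf8->decode (read_exact ($fh, $len))
}

# both ends live in one process here, purely for demonstration
socketpair (my $master, my $slave, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
   or die "socketpair: $!";

send_msg ($master, { job => "resize", id => 7 });
my $msg = recv_msg ($slave);
```

In a real program the two handles would be the ones connected by C<run>, and the blocking loops on the parent side would normally be replaced by event-driven I/O.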
23		94
24	=head1 PROBLEM STATEMENT	95	=head1 PROBLEM STATEMENT
25		96
26	There are two ways to implement parallel processing on UNIX like operating	97	There are two ways to implement parallel processing on UNIX like operating
27	systems - fork and process, and fork+exec and process. They have different	98	systems - fork and process, and fork+exec and process. They have different
39	or fork+exec instead.	110	or fork+exec instead.
40		111
41	=item Forking usually creates a copy-on-write copy of the parent	112	=item Forking usually creates a copy-on-write copy of the parent
42	process. Memory (for example, modules or data files that have been	113	process. Memory (for example, modules or data files that have been
43	will not take additional memory). When exec'ing a new process, modules	114	will not take additional memory). When exec'ing a new process, modules
44	and data files might need to be loaded again, at extra cpu and memory	115	and data files might need to be loaded again, at extra CPU and memory
45	cost. Likewise when forking, all data structures are copied as well - if	116	cost. Likewise when forking, all data structures are copied as well - if
46	the program frees them and replaces them by new data, the child processes	117	the program frees them and replaces them by new data, the child processes
47	will retain the memory even if it isn't used.	118	will retain the memory even if it isn't used.
48		119
49	This module allows the main program to do a controlled fork, and allows	120	This module allows the main program to do a controlled fork, and allows
…		…
61	as template, and also tries hard to identify the correct path to the perl	132	as template, and also tries hard to identify the correct path to the perl
62	interpreter. With a cooperative main program, exec'ing the interpreter	133	interpreter. With a cooperative main program, exec'ing the interpreter
63	might not even be necessary.	134	might not even be necessary.
64		135
65	=item Forking might be impossible when a program is running. For example,	136	=item Forking might be impossible when a program is running. For example,
66	POSIX makes it almost impossible to fork from a multithreaded program and	137	POSIX makes it almost impossible to fork from a multi-threaded program and
67	do anything useful in the child - strictly speaking, if your perl program	138	do anything useful in the child - strictly speaking, if your perl program
68	uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),	139	uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),
69	you cannot call fork on the perl level anymore, at all.	140	you cannot call fork on the perl level anymore, at all.
70		141
71	This module can safely fork helper processes at any time, by caling	142	This module can safely fork helper processes at any time, by calling
72	fork+exec in C, in a POSIX-compatible way.	143	fork+exec in C, in a POSIX-compatible way.
73		144
74	=item Parallel processing with fork might be inconvenient or difficult	145	=item Parallel processing with fork might be inconvenient or difficult
75	to implement. For example, when a program uses an event loop and creates	146	to implement. For example, when a program uses an event loop and creates
76	watchers it becomes very hard to use the event loop from a child	147	watchers it becomes very hard to use the event loop from a child
…		…
83	pools are created by fork+exec, after which such modules can again be	154	pools are created by fork+exec, after which such modules can again be
84	loaded.	155	loaded.
85		156
86	=back	157	=back
87		158
		159	=head1 CONCEPTS
		160
		161	This module can create new processes either by executing a new perl
		162	process, or by forking from an existing "template" process.
		163
		164	Each such process comes with its own file handle that can be used to
		165	communicate with it (it's actually a socket - one end in the new process,
		166	one end in the main process), and among the things you can do in it are
		167	load modules, fork new processes, send file handles to it, and execute
		168	functions.
		169
		170	There are multiple ways to create additional processes to execute some
		171	jobs:
		172
88	=over 4	173	=over 4
89		174
90	=cut	175	=item fork a new process from the "default" template process, load code,
		176	run it
91		177
		178	This module has a "default" template process which it executes when it is
		179	needed the first time. Forking from this process shares the memory used
		180	for the perl interpreter with the new process, but loading modules takes
		181	time, and the memory is not shared with anything else.
		182
		183	This is ideal for when you only need one extra process of a kind, with the
		184	option of starting and stopping it on demand.
		185
		186	Example:
		187
		188	AnyEvent::Fork
		189	->new
		190	->require ("Some::Module")
		191	->run ("Some::Module::run", sub {
		192	my ($fork_fh) = @_;
		193	});
		194
		195	=item fork a new template process, load code, then fork processes off of
		196	it and run the code
		197
		198	When you need to have a bunch of processes that all execute the same (or
		199	very similar) tasks, then a good way is to create a new template process
		200	for them, loading all the modules you need, and then create your worker
		201	processes from this new template process.
		202
		203	This way, all code (and data structures) that can be shared (e.g. the
		204	modules you loaded) is shared between the processes, and each new process
		205	consumes relatively little memory of its own.
		206
		207	The disadvantage of this approach is that you need to create a template
		208	process for the sole purpose of forking new processes from it, but if you
		209	only need a fixed number of processes you can create them, and then destroy
		210	the template process.
		211
		212	Example:
		213
		214	my $template = AnyEvent::Fork->new->require ("Some::Module");
		215
		216	for (1..10) {
		217	$template->fork->run ("Some::Module::run", sub {
		218	my ($fork_fh) = @_;
		219	});
		220	}
		221
		222	# at this point, you can keep $template around to fork new processes
		223	# later, or you can destroy it, which causes it to vanish.
		224
		225	=item execute a new perl interpreter, load some code, run it
		226
		227	This is relatively slow, and doesn't allow you to share memory between
		228	multiple processes.
		229
		230	The only advantage is that you don't have to have a template process
		231	hanging around all the time to fork off some new processes, which might be
		232	an advantage when there are long time spans where no extra processes are
		233	needed.
		234
		235	Example:
		236
		237	AnyEvent::Fork
		238	->new_exec
		239	->require ("Some::Module")
		240	->run ("Some::Module::run", sub {
		241	my ($fork_fh) = @_;
		242	});
		243
		244	=back
		245
		246	=head1 FUNCTIONS
		247
		248	=over 4
		249
		250	=cut
		251
92	package AnyEvent::ProcessPool;	252	package AnyEvent::Fork;
93		253
94	use common::sense;	254	use common::sense;
95		255
96	use Socket ();	256	use Errno ();
97		257
98	use Proc::FastSpawn;
99	use AnyEvent;	258	use AnyEvent;
100	use AnyEvent::ProcessPool::Util;
101	use AnyEvent::Util ();	259	use AnyEvent::Util ();
102		260
103	BEGIN {	261	use IO::FDPass;
104	# require Exporter;
105	}
106		262
		263	our $VERSION = 0.2;
		264
		265	our $PERL; # the path to the perl interpreter, deduced with various forms of magic
		266
107	=item my $pool = new AnyEvent::ProcessPool key => value...	267	=item my $pool = new AnyEvent::Fork key => value...
108		268
109	Create a new process pool. The following named parameters are supported:	269	Create a new process pool. The following named parameters are supported:
110		270
111	=over 4	271	=over 4
112		272
113	=back	273	=back
114		274
115	=cut	275	=cut
116		276
		277	# the early fork template process
		278	our $EARLY;
		279
117	# the template process	280	# the empty template process
118	our $template;	281	our $TEMPLATE;
119		282
120	sub _queue {	283	sub _cmd {
		284	my $self = shift;
		285
		286	# ideally, we would want to use "a (w/a)*" as format string, but perl
		287	# versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack
		288	# it.
		289	push @{ $self->[2] }, pack "L/a", pack "(w/a)*", @_;
		290
		291	unless ($self->[3]) {
		292	my $wcb = sub {
		293	do {
		294	# send the next "thing" in the queue - either a reference to an fh,
		295	# or a plain string.
		296
		297	if (ref $self->[2][0]) {
		298	# send fh
		299	unless (IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }) {
		300	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
		301	undef $self->[3];
		302	die "AnyEvent::Fork: file descriptor send failure: $!";
		303	}
		304
		305	shift @{ $self->[2] };
		306
		307	} else {
		308	# send string
		309	my $len = syswrite $self->[1], $self->[2][0];
		310
		311	unless ($len) {
		312	return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
		313	undef $self->[3];
		314	die "AnyEvent::Fork: command write failure: $!";
		315	}
		316
		317	substr $self->[2][0], 0, $len, "";
		318	shift @{ $self->[2] } unless length $self->[2][0];
		319	}
		320	} while @{ $self->[2] };
		321
		322	# everything written
		323	undef $self->[3];
		324	# invoke run callback
		325	$self->[0]->($self->[1]) if $self->[0];
		326	};
		327
		328	$wcb->();
		329
		330	$self->[3] ||= AE::io $self->[1], 1, $wcb
		331	if @{ $self->[2] };
		332	}
		333
		334	() # make sure we don't leak the watcher
		335	}
		336
		337	sub _new {
121	my ($pid, $fh) = @_;	338	my ($self, $fh) = @_;
122		339
123	[	340	AnyEvent::Util::fh_nonblocking $fh, 1;
124	$pid,	341
		342	$self = bless [
		343	undef, # run callback
125	$fh,	344	$fh,
126	[],	345	[], # write queue - strings or fd's
127	undef	346	undef, # AE watcher
128	]	347	], $self;
129	}
130		348
131	sub queue_cmd {	349	$self
132	my ($queue, $cmd) = @_;
133
134	push @{ $queue->[2] }, pack "N/a", $cmd;
135
136	$queue->[3] ||= AE::io $queue->[1], 1, sub {
137	warn "oopl0 ", scalar @{ $queue->[2] };
138	if (ref $queue->[2][0]) {
139	warn "oopla2\n";#d#
140	AnyEvent::ProcessPool::Util::fd_send fileno $queue->[1], fileno ${ $queue->[2][0] }
141	and shift @{ $queue->[2] };
142	} else {
143	warn "write ", length $queue->[2][0];#d#
144	my $len = syswrite $queue->[1], $queue->[2][0]
145	or die "AnyEvent::ProcessPool::queue write failure: $!";
146	substr $queue->[2][0], 0, $len, "";
147	shift @{ $queue->[2] } unless length $queue->[2][0];
148	}
149
150	undef $queue->[3] unless @{ $queue->[2] };
151	warn "oopl3 ", scalar @{ $queue->[2] };
152	warn "oopl4 $queue->[3]\n";#d#
153	};
154	}	350	}
155		351
156	sub run_template {	352	# fork template from current process, used by AnyEvent::Fork::Early/Template
157	return if $template;	353	sub _new_fork {
158
159	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;	354	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
160	AnyEvent::Util::fh_nonblocking $fh, 1;	355	my $parent = $$;
161	fd_inherit fileno $slave;
162		356
163	my %env = %ENV;	357	my $pid = fork;
164	$env{PERL5LIB} = join ":", grep !ref, @INC;
165		358
166	my $pid = spawn	359	if ($pid eq 0) {
167	$^X,	360	require AnyEvent::Fork::Serve;
168	["perl", "-MAnyEvent::ProcessPool::Serve", "-e", "AnyEvent::ProcessPool::Serve::me", fileno $slave],	361	$AnyEvent::Fork::Serve::OWNER = $parent;
169	[map "$_=$env{$_}", keys %env],	362	close $fh;
170	or die "unable to spawn AnyEvent::ProcessPool server: $!";	363	$0 = "$_[1] of $parent";
		364	$SIG{CHLD} = 'IGNORE';
		365	AnyEvent::Fork::Serve::serve ($slave);
		366	exit 0;
		367	} elsif (!$pid) {
		368	die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
		369	}
171		370
172	close $slave;	371	AnyEvent::Fork->_new ($fh)
173
174	$template = _queue $pid, $fh;
175
176	my ($a, $b) = AnyEvent::Util::portable_socketpair;
177
178	queue_cmd $template, "Iabc";
179	push @{ $template->[2] }, \$b;
180
181	use Coro::AnyEvent; Coro::AnyEvent::sleep 1;
182	}	372	}
		373
		374	=item my $proc = new AnyEvent::Fork
		375
		376	Creates a new "empty" perl interpreter process and returns its process
		377	object for further manipulation.
		378
		379	The new process is forked from a template process that is kept around
		380	for this purpose. When it doesn't exist yet, it is created by a call to
		381	C<new_exec> and kept around for future calls.
		382
		383	When the process object is destroyed, it will release the file handle
		384	that connects it with the new process. When the new process has not yet
		385	called C<run>, then the process will exit. Otherwise, what happens depends
		386	entirely on the code that is executed.
		387
		388	=cut
183		389
184	sub new {	390	sub new {
185	my $class = shift;	391	my $class = shift;
186		392
187	my $self = bless {	393	$TEMPLATE ||= $class->new_exec;
188	@_	394	$TEMPLATE->fork
189	}, $class;	395	}
190		396
191	run_template;	397	=item $new_proc = $proc->fork
		398
		399	Forks C<$proc>, creating a new process, and returns the process object
		400	of the new process.
		401
		402	If any of the C<send_> functions have been called before fork, then they
		403	will be cloned in the child. For example, in a pre-forked server, you
		404	might C<send_fh> the listening socket into the template process, and then
		405	keep calling C<fork> and C<run>.
		406
		407	=cut
		408
		409	sub fork {
		410	my ($self) = @_;
		411
		412	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
		413
		414	$self->send_fh ($slave);
		415	$self->_cmd ("f");
		416
		417	AnyEvent::Fork->_new ($fh)
		418	}
		419
		420	=item my $proc = new_exec AnyEvent::Fork
		421
		422	Creates a new "empty" perl interpreter process and returns its process
		423	object for further manipulation.
		424
		425	Unlike the C<new> method, this method I<always> spawns a new perl process
		426	(except in some cases, see L<AnyEvent::Fork::Early> for details). This
		427	reduces the amount of memory sharing that is possible, and is also slower.
		428
		429	You should use C<new> whenever possible, except when having a template
		430	process around is unacceptable.
		431
		432	The path to the perl interpreter is divined using various methods - first
		433	C<$^X> is investigated to see if the path ends with something that sounds
		434	as if it were the perl interpreter. Failing this, the module falls back to
		435	using C<$Config::Config{perlpath}>.
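
The lookup described above can be sketched in isolation. The function names C<looks_like_perl> and C<find_perl> are hypothetical; the regular expressions follow the ones in C<new_exec>, and passing the OS name as a parameter is an assumption made only so the check can be exercised on any platform.

```perl
use strict;
use warnings;
use Config ();

# hypothetical helper: does $path look like an absolute path to perl?
sub looks_like_perl {
   my ($path, $os) = @_;

   # must be absolute (always assumed on win32) and end in perl,
   # perlX.Y[.Z] or the same with an .exe suffix
   ($os eq "MSWin32" || $path =~ m%^/%)
      && $path =~ m%[/\\]perl(?:[0-9]+(?:\.[0-9]+)+)?(?:\.exe)?$%i
}

# hypothetical helper: pick $candidate if it looks perlish, else ask Config
sub find_perl {
   my ($candidate, $os) = @_;

   return $candidate
      if looks_like_perl ($candidate, $os);

   my $perl = $Config::Config{perlpath};
   my $exe  = $Config::Config{_exe};

   # make sure the executable suffix is present exactly once
   $perl =~ s/(?:\Q$exe\E)?$/$exe/;

   $perl
}

my $from_x      = find_perl ("/usr/bin/perl5.16.3", "linux");
my $from_config = find_perl ("/opt/app/bin/myapp", "linux");
```

A packed or embedded interpreter typically has a C<$^X> that does not end in something perl-like, which is the case the Config fallback is meant to cover.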
		436
		437	=cut
		438
		439	sub new_exec {
		440	my ($self) = @_;
		441
		442	return $EARLY->fork
		443	if $EARLY;
		444
		445	# first find path of perl
		446	my $perl = $^X;
		447
		448	# first we try $^X, but the path must be absolute (always on win32), and end in sth.
		449	# that looks like perl. this obviously only works for posix and win32
		450	unless (
		451	($^O eq "MSWin32" || $perl =~ m%^/%)
		452	&& $perl =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
		453	) {
		454	# if it doesn't look perlish enough, try Config
		455	require Config;
		456	$perl = $Config::Config{perlpath};
		457	$perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;
		458	}
		459
		460	require Proc::FastSpawn;
		461
		462	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
		463	Proc::FastSpawn::fd_inherit (fileno $slave);
		464
		465	# new fh's should always be set cloexec (due to $^F),
		466	# but hey, not on win32, so we always clear the inherit flag.
		467	Proc::FastSpawn::fd_inherit (fileno $fh, 0);
		468
		469	# quick. also doesn't work in win32. of course. what did you expect
		470	#local $ENV{PERL5LIB} = join ":", grep !ref, @INC;
		471	my %env = %ENV;
		472	$env{PERL5LIB} = join +($^O eq "MSWin32" ? ";" : ":"), grep !ref, @INC;
		473
		474	Proc::FastSpawn::spawn (
		475	$perl,
		476	["perl", "-MAnyEvent::Fork::Serve", "-e", "AnyEvent::Fork::Serve::me", fileno $slave, $$],
		477	[map "$_=$env{$_}", keys %env],
		478	) or die "unable to spawn AnyEvent::Fork server: $!";
		479
		480	$self->_new ($fh)
		481	}
		482
		483	=item $proc = $proc->eval ($perlcode, @args)
		484
		485	Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to
		486	the strings specified by C<@args>.
		487
		488	This call is meant to do any custom initialisation that might be required
		489	(for example, the C<require> method uses it). It's not supposed to be used
		490	to completely take over the process, use C<run> for that.
		491
		492	The code will usually be executed after this call returns, and there is no
		493	way to pass anything back to the calling process. Any evaluation errors
		494	will be reported to stderr and cause the process to exit.
		495
		496	Returns the process object for easy chaining of method calls.
		497
		498	=cut
		499
		500	sub eval {
		501	my ($self, $code, @args) = @_;
		502
		503	$self->_cmd (e => $code, @args);
192		504
193	$self	505	$self
194	}	506	}
195		507
		508	=item $proc = $proc->require ($module, ...)
		509
		510	Tries to load the given module(s) into the process
		511
		512	Returns the process object for easy chaining of method calls.
		513
		514	=cut
		515
		516	sub require {
		517	my ($self, @modules) = @_;
		518
		519	s%::%/%g for @modules;
		520	$self->eval ('require "$_.pm" for @_', @modules);
		521
		522	$self
		523	}
		524
		525	=item $proc = $proc->send_fh ($handle, ...)
		526
		527	Send one or more file handles (I<not> file descriptors) to the process,
		528	to prepare a call to C<run>.
		529
		530	The process object keeps a reference to the handles until this is done,
		531	so you must not explicitly close the handles. This is most easily
		532	accomplished by simply not storing the file handles anywhere after passing
		533	them to this method.
		534
		535	Returns the process object for easy chaining of method calls.
		536
		537	Example: pass a file handle to a process, and release it without
		538	closing. It will be closed automatically when it is no longer used.
		539
		540	$proc->send_fh ($my_fh);
		541	undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
		542
		543	=cut
		544
		545	sub send_fh {
		546	my ($self, @fh) = @_;
		547
		548	for my $fh (@fh) {
		549	$self->_cmd ("h");
		550	push @{ $self->[2] }, \$fh;
		551	}
		552
		553	$self
		554	}
		555
		556	=item $proc = $proc->send_arg ($string, ...)
		557
		558	Send one or more argument strings to the process, to prepare a call to
		559	C<run>. The strings can be any octet string.
		560
		561	The protocol is optimised to pass a moderate number of relatively short
		562	strings - while you can pass up to 4GB of data in one go, this is more
		563	meant to pass some ID information or other startup info, not big chunks of
		564	data.
		565
		566	Returns the process object for easy chaining of method calls.
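
The size limit comes from the framing used by C<_cmd>: each command is a list of strings, each prefixed with a BER-compressed length (C<w/a>), and the whole list is prefixed with a 32-bit length (C<L/a>). The following sketch round-trips one command with those format strings; the names C<encode_cmd> and C<decode_cmd> are hypothetical, and the decoder is only an assumption about how the receiving side could unpack a frame.

```perl
use strict;
use warnings;

# hypothetical helper: same pack expression as in _cmd
sub encode_cmd {
   pack "L/a", pack "(w/a)*", @_
}

# hypothetical helper: undo encode_cmd in two steps, outer frame first
sub decode_cmd {
   my ($frame) = @_;

   my ($payload) = unpack "L/a", $frame;
   unpack "(w/a)*", $payload
}

# an "a" (send_arg) command carrying one short string and one empty string
my $frame  = encode_cmd (a => "worker-1", "");
my @fields = decode_cmd ($frame);
```

Short strings cost a single extra octet each, which is why the format suits a moderate number of small arguments better than bulk data.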
		567
		568	=cut
		569
		570	sub send_arg {
		571	my ($self, @arg) = @_;
		572
		573	$self->_cmd (a => @arg);
		574
		575	$self
		576	}
		577
		578	=item $proc->run ($func, $cb->($fh))
		579
		580	Enter the function specified by the fully qualified name in C<$func> in
		581	the process. The function is called with the communication socket as first
		582	argument, followed by all file handles and string arguments sent earlier
		583	via C<send_fh> and C<send_arg> methods, in the order they were called.
		584
		585	If the called function returns, the process exits.
		586
		587	Preparing the process can take time - when the process is ready, the
		588	callback is invoked with the local communications socket as argument.
		589
		590	The process object becomes unusable on return from this function.
		591
		592	If the communication socket isn't used, it should be closed on both sides,
		593	to save on kernel memory.
		594
		595	The socket is non-blocking in the parent, and blocking in the newly
		596	created process. The close-on-exec flag is set on both. Even if not used
		597	otherwise, the socket can be a good indicator for the existence of the
		598	process - if the other process exits, you get a readable event on it,
		599	because exiting the process closes the socket (if it didn't create any
		600	children using fork).
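
The "readable event on exit" behaviour can be observed with plain core perl. This sketch deliberately uses a manual socketpair and C<fork> instead of AnyEvent::Fork, and L<IO::Select> instead of an event loop, so it only demonstrates the end-of-file signal itself.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Select ();
use POSIX ();

socketpair (my $master, my $slave, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
   or die "socketpair: $!";

my $pid = fork;
defined $pid or die "fork: $!";

if ($pid == 0) {
   # child: keep only the slave end, write nothing, then exit
   close $master;
   POSIX::_exit (0);
}

# parent: drop our copy of the slave end, otherwise EOF never arrives
close $slave;

# the socket becomes readable once the child is gone
my @ready = IO::Select->new ($master)->can_read (10);

# a read of zero octets means end of file, i.e. the other side closed
my $got = @ready ? sysread ($master, my $buf, 1) : undef;

waitpid $pid, 0;
```

If the child had forked further processes that still hold the socket, the end of file would only be seen after all of them closed it, which is the caveat noted in the parenthesis above.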
		601
		602	Example: create a template for a process pool, pass a few strings, some
		603	file handles, then fork, pass one more string, and run some code.
		604
		605	my $pool = AnyEvent::Fork
		606	->new
		607	->send_arg ("str1", "str2")
		608	->send_fh ($fh1, $fh2);
		609
		610	for (1..2) {
		611	$pool
		612	->fork
		613	->send_arg ("str3")
		614	->run ("Some::function", sub {
		615	my ($fh) = @_;
		616
		617	# fh is nonblocking, but we trust that the OS can accept these
		618	# few octets anyway.
		619	syswrite $fh, "hi #$_\n";
		620
		621	# $fh is being closed here, as we don't store it anywhere
		622	});
		623	}
		624
		625	# Some::function might look like this - all parameters passed before fork
		626	# and after will be passed, in order, after the communications socket.
		627	sub Some::function {
		628	my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
		629
		630	print scalar <$fh>; # prints "hi #1\n" and "hi #2\n"
		631	}
		632
		633	=cut
		634
		635	sub run {
		636	my ($self, $func, $cb) = @_;
		637
		638	$self->[0] = $cb;
		639	$self->_cmd (r => $func);
		640	}
		641
196	=back	642	=back
		643
		644	=head1 PERFORMANCE
		645
		646	Now for some unscientific benchmark numbers (all done on an amd64
		647	GNU/Linux box). These are intended to give you an idea of the relative
		648	performance you can expect, they are not meant to be absolute performance
		649	numbers.
		650
		651	OK, so, I ran a simple benchmark that creates a socket pair, forks, calls
		652	exit in the child and waits for the socket to close in the parent. I did
		653	load AnyEvent, EV and AnyEvent::Fork, for a total process size of 5100kB.
		654
		655	2079 new processes per second, using manual socketpair + fork
		656
		657	Then I did the same thing, but instead of calling fork, I called
		658	AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
		659	socket form the child to close on exit. This does the same thing as manual
		660	socket pair + fork, except that what is forked is the template process
		661	(2440kB), and the socket needs to be passed to the server at the other end
		662	of the socket first.
		663
		664	2307 new processes per second, using AnyEvent::Fork->new
		665
		666	And finally, using C<new_exec> instead C<new>, using vforks+execs to exec
		667	a new perl interpreter and compile the small server each time, I get:
		668
		669	479 vfork+execs per second, using AnyEvent::Fork->new_exec
		670
		671	So how can C<< AnyEvent->new >> be faster than a standard fork, even
		672	though it uses the same operations, but adds a lot of overhead?
		673
		674	The difference is simply the process size: forking the 6MB process takes
		675	so much longer than forking the 2.5MB template process that the overhead
		676	introduced is canceled out.
		677
		678	If the benchmark process grows, the normal fork becomes even slower:
		679
		680	1340 new processes, manual fork in a 20MB process
		681	731 new processes, manual fork in a 200MB process
		682	235 new processes, manual fork in a 2000MB process
		683
		684	What that means (to me) is that I can use this module without having a
		685	very bad conscience because of the extra overhead required to start new
		686	processes.
		687
		688	=head1 TYPICAL PROBLEMS
		689
		690	This section lists typical problems that remain. I hope by recognising
		691	them, most can be avoided.
		692
		693	=over 4
		694
		695	=item exit runs destructors
		696
		697	=item "leaked" file descriptors for exec'ed processes
		698
		699	POSIX systems inherit file descriptors by default when exec'ing a new
		700	process. While perl itself laudably sets the close-on-exec flags on new
		701	file handles, most C libraries don't care, and even if all cared, it's
		702	often not possible to set the flag in a race-free manner.
		703
		704	That means some file descriptors can leak through. And since it isn't
		705	possible to know which file descriptors are "good" and "necessary" (or
		706	even to know which file descriptors are open), there is no good way to
		707	close the ones that might do harm.
		708
		709	As an example of what "harm" can be done consider a web server that
		710	accepts connections and afterwards some module uses AnyEvent::Fork for the
		711	first time, causing it to fork and exec a new process, which might inherit
		712	the network socket. When the server closes the socket, it is still open
		713	in the child (which doesn't even know that) and the client might conclude
		714	that the connection is still fine.
		715
		716	For the main program, there are multiple remedies available -
		717	L<AnyEvent::Fork::Early> is one, creating a process early and not using
		718	C<new_exec> is another, as in both cases, the first process can be exec'ed
		719	well before many random file descriptors are open.
		720
		721	In general, the solution for these kind of problems is to fix the
		722	libraries or the code that leaks those file descriptors.
		723
		724	Fortunately, most of these leaked descriptors do no harm, other than
		725	sitting on some resources.
		726
		727	=item "leaked" file descriptors for fork'ed processes
		728
		729	Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
		730	which closes file descriptors not marked for being inherited.
		731
		732	However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
		733	a way to create these processes by forking, and this leaks more file
		734	descriptors than exec'ing them, as there is no way to mark descriptors as
		735	"close on fork".
		736
		737	An example would be modules like L<EV>, L<IO::AIO> or L<Gtk2>. The first two create
		738	pipes for internal uses, and L<Gtk2> might open a connection to the X
		739	server. L<EV> and L<IO::AIO> can deal with fork, but Gtk2 might have
		740	trouble with a fork.
		741
		742	The solution is to either not load these modules before use'ing
		743	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
		744	initialising them, for example, by calling C<init Gtk2> manually.
		745
		746	=back
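
For code that controls its own handles, the close-on-exec flag discussed in the list above can be inspected and changed with C<fcntl>. This is a minimal core-perl sketch; the helper name C<set_cloexec> is hypothetical, and it does nothing about descriptors opened by C libraries, which remain the real problem.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# hypothetical helper: switch the close-on-exec flag of $fh on or off
sub set_cloexec {
   my ($fh, $on) = @_;

   my $flags = fcntl ($fh, F_GETFD, 0);
   defined $flags or die "F_GETFD: $!";
   $flags += 0; # fcntl reports zero as "0 but true"

   $flags = $on ? $flags | FD_CLOEXEC : $flags & ~FD_CLOEXEC;

   fcntl ($fh, F_SETFD, $flags)
      or die "F_SETFD: $!";
}

pipe (my $r, my $w) or die "pipe: $!";

# mark the read end as inheritable, then protect it again
set_cloexec ($r, 0);
my $cleared = fcntl ($r, F_GETFD, 0) & FD_CLOEXEC;

set_cloexec ($r, 1);
my $set = fcntl ($r, F_GETFD, 0) & FD_CLOEXEC;
```

Note that the flag only matters for exec: a descriptor with close-on-exec set is still inherited by a plain fork, which is why forked templates leak more descriptors than exec'ed ones.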
		747
		748	=head1 PORTABILITY NOTES
		749
		750	Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a nop,
		751	and ::Template is not going to work), and it cost a lot of blood and sweat
		752	to make it so, mostly due to the bloody broken perl that nobody seems to
		753	care about. The fork emulation is a bad joke - I have yet to see something
		754	useful that you can do with it without running into memory corruption
		755	issues or other braindamage. Hrrrr.
		756
		757	Cygwin perl is not supported at the moment, as it should implement fd
		758	passing, but doesn't, and rolling my own is hard, as cygwin doesn't
		759	support enough functionality to do it.
		760
		761	=head1 SEE ALSO
		762
		763	L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),
		764	L<AnyEvent::Fork::Template> (to create a process by forking the main
		765	program at a convenient time).
197		766
198	=head1 AUTHOR	767	=head1 AUTHOR
199		768
200	Marc Lehmann <schmorp@schmorp.de>	769	Marc Lehmann <schmorp@schmorp.de>
201	http://home.schmorp.de/	770	http://home.schmorp.de/
