[ViewVC] Diff of: cvs/AnyEvent-Fork/Fork.pm

Comparing AnyEvent-Fork/Fork.pm (file contents):
Revision 1.1 by root, Sun Mar 31 03:21:27 2013 UTC vs.
Revision 1.17 by root, Fri Apr 5 23:42:24 2013 UTC

…		…
1	=head1 NAME	1	=head1 NAME
2		2
3	AnyEvent::ProcessPool - manage pools of perl worker processes, exec'ed or fork'ed	3	AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
4		4
5	=head1 SYNOPSIS	5	=head1 SYNOPSIS
6		6
7	use AnyEvent::ProcessPool;	7	use AnyEvent::Fork;
		8
		9	##################################################################
		10	# create a single new process, tell it to run your worker function
		11
		12	AnyEvent::Fork
		13	->new
		14	->require ("MyModule")
		15	->run ("MyModule::worker", sub {
		16	my ($master_filehandle) = @_;
		17
		18	# now $master_filehandle is connected to the
		19	# $slave_filehandle in the new process.
		20	});
		21
		22	# MyModule::worker might look like this
		23	sub MyModule::worker {
		24	my ($slave_filehandle) = @_;
		25
		26	# now $slave_filehandle is connected to the $master_filehandle
		27	# in the original process. have fun!
		28	}
		29
		30	##################################################################
		31	# create a pool of server processes all accepting on the same socket
		32
		33	# create listener socket
		34	my $listener = ...;
		35
		36	# create a pool template, initialise it and give it the socket
		37	my $pool = AnyEvent::Fork
		38	->new
		39	->require ("Some::Stuff", "My::Server")
		40	->send_fh ($listener);
		41
		42	# now create 10 identical workers
		43	for my $id (1..10) {
		44	$pool
		45	->fork
		46	->send_arg ($id)
		47	->run ("My::Server::run");
		48	}
		49
		50	# now do other things - maybe use the filehandle provided by run
		51	# to wait for the processes to die. or whatever.
		52
		53	# My::Server::run might look like this
		54	sub My::Server::run {
		55	my ($slave, $listener, $id) = @_;
		56
		57	close $slave; # we do not use the socket, so close it to save resources
		58
		59	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
		60	# or anything we usually couldn't do in a process forked normally.
		61	while (my $socket = $listener->accept) {
		62	# do sth. with new socket
		63	}
		64	}
8		65
9	=head1 DESCRIPTION	66	=head1 DESCRIPTION
10		67
11	This module allows you to create single worker processes but also worker	68	This module allows you to create new processes, without actually forking
12	pool that share memory, by forking from the main program, or exec'ing new	69	them from your current process (avoiding the problems of forking), but
13	perl interpreters from a module.	70	preserving most of the advantages of fork.
14		71
15	You create a new processes in a pool by specifying a function to call	72	It can be used to create new worker processes or new independent
16	with any combination of string values and file handles.	73	subprocesses for short- and long-running jobs, process pools (e.g. for use
		74	in pre-forked servers) but also to spawn new external processes (such as
		75	CGI scripts from a web server), which can be faster (and more well behaved)
		76	than using fork+exec in big processes.
17		77
18	A pool can have initialisation code which is executed before forking. The	78	Special care has been taken to make this module useful from other modules,
19	initialisation code is only executed once and the resulting process is	79	while still supporting specialised environments such as L<App::Staticperl>
20	cached, to be used as a template.	80	or L<PAR::Packer>.
21		81
22	Pools without such initialisation code don't cache an extra process.	82	=head1 WHAT THIS MODULE IS NOT
		83
		84	This module only creates processes and lets you pass file handles and
		85	strings to it, and run perl code. It does not implement any kind of RPC -
		86	there is no back channel from the process back to you, and there is no RPC
		87	or message passing going on.
		88
		89	If you need some form of RPC, you can either implement it yourself
		90	in whatever way you like, use some message-passing module such
		91	as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use
		92	L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
		93	and so on.
23		94
24	=head1 PROBLEM STATEMENT	95	=head1 PROBLEM STATEMENT
25		96
26	There are two ways to implement parallel processing on UNIX like operating	97	There are two ways to implement parallel processing on UNIX like operating
27	systems - fork and process, and fork+exec and process. They have different	98	systems - fork and process, and fork+exec and process. They have different
39	or fork+exec instead.	110	or fork+exec instead.
40		111
41	=item Forking usually creates a copy-on-write copy of the parent	112	=item Forking usually creates a copy-on-write copy of the parent
42	process. Memory (for example, modules or data files that have been	113	process. Memory (for example, modules or data files that have been
43	will not take additional memory). When exec'ing a new process, modules	114	will not take additional memory). When exec'ing a new process, modules
44	and data files might need to be loaded again, at extra cpu and memory	115	and data files might need to be loaded again, at extra CPU and memory
45	cost. Likewise when forking, all data structures are copied as well - if	116	cost. Likewise when forking, all data structures are copied as well - if
46	the program frees them and replaces them by new data, the child processes	117	the program frees them and replaces them by new data, the child processes
47	will retain the memory even if it isn't used.	118	will retain the memory even if it isn't used.
48		119
49	This module allows the main program to do a controlled fork, and allows	120	This module allows the main program to do a controlled fork, and allows
…		…
61	as template, and also tries hard to identify the correct path to the perl	132	as template, and also tries hard to identify the correct path to the perl
62	interpreter. With a cooperative main program, exec'ing the interpreter	133	interpreter. With a cooperative main program, exec'ing the interpreter
63	might not even be necessary.	134	might not even be necessary.
64		135
65	=item Forking might be impossible when a program is running. For example,	136	=item Forking might be impossible when a program is running. For example,
66	POSIX makes it almost impossible to fork from a multithreaded program and	137	POSIX makes it almost impossible to fork from a multi-threaded program and
67	do anything useful in the child - strictly speaking, if your perl program	138	do anything useful in the child - strictly speaking, if your perl program
68	uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),	139	uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),
69	you cannot call fork on the perl level anymore, at all.	140	you cannot call fork on the perl level anymore, at all.
70		141
71	This module can safely fork helper processes at any time, by caling	142	This module can safely fork helper processes at any time, by calling
72	fork+exec in C, in a POSIX-compatible way.	143	fork+exec in C, in a POSIX-compatible way.
73		144
74	=item Parallel processing with fork might be inconvenient or difficult	145	=item Parallel processing with fork might be inconvenient or difficult
75	to implement. For example, when a program uses an event loop and creates	146	to implement. For example, when a program uses an event loop and creates
76	watchers it becomes very hard to use the event loop from a child	147	watchers it becomes very hard to use the event loop from a child
…		…
83	pools are created by fork+exec, after which such modules can again be	154	pools are created by fork+exec, after which such modules can again be
84	loaded.	155	loaded.
85		156
86	=back	157	=back
87		158
		159	=head1 CONCEPTS
		160
		161	This module can create new processes either by executing a new perl
		162	process, or by forking from an existing "template" process.
		163
		164	Each such process comes with its own file handle that can be used to
		165	communicate with it (it's actually a socket - one end in the new process,
		166	one end in the main process), and among the things you can do in it are
		167	load modules, fork new processes, send file handles to it, and execute
		168	functions.
		169
		170	There are multiple ways to create additional processes to execute some
		171	jobs:
		172
88	=over 4	173	=over 4
89		174
90	=cut	175	=item fork a new process from the "default" template process, load code,
		176	run it
91		177
		178	This module has a "default" template process which it executes when it is
		179	needed the first time. Forking from this process shares the memory used
		180	for the perl interpreter with the new process, but loading modules takes
		181	time, and the memory is not shared with anything else.
		182
		183	This is ideal for when you only need one extra process of a kind, with the
		184	option of starting and stopping it on demand.
		185
		186	Example:
		187
		188	AnyEvent::Fork
		189	->new
		190	->require ("Some::Module")
		191	->run ("Some::Module::run", sub {
		192	my ($fork_fh) = @_;
		193	});
		194
		195	=item fork a new template process, load code, then fork processes off of
		196	it and run the code
		197
		198	When you need to have a bunch of processes that all execute the same (or
		199	very similar) tasks, then a good way is to create a new template process
		200	for them, loading all the modules you need, and then create your worker
		201	processes from this new template process.
		202
		203	This way, all code (and data structures) that can be shared (e.g. the
		204	modules you loaded) is shared between the processes, and each new process
		205	consumes relatively little memory of its own.
		206
		207	The disadvantage of this approach is that you need to create a template
		208	process for the sole purpose of forking new processes from it, but if you
		209	only need a fixed number of processes you can create them, and then destroy
		210	the template process.
		211
		212	Example:
		213
		214	my $template = AnyEvent::Fork->new->require ("Some::Module");
		215
		216	for (1..10) {
		217	$template->fork->run ("Some::Module::run", sub {
		218	my ($fork_fh) = @_;
		219	});
		220	}
		221
		222	# at this point, you can keep $template around to fork new processes
		223	# later, or you can destroy it, which causes it to vanish.
		224
		225	=item execute a new perl interpreter, load some code, run it
		226
		227	This is relatively slow, and doesn't allow you to share memory between
		228	multiple processes.
		229
		230	The only advantage is that you don't have to have a template process
		231	hanging around all the time to fork off some new processes, which might be
		232	an advantage when there are long time spans where no extra processes are
		233	needed.
		234
		235	Example:
		236
		237	AnyEvent::Fork
		238	->new_exec
		239	->require ("Some::Module")
		240	->run ("Some::Module::run", sub {
		241	my ($fork_fh) = @_;
		242	});
		243
		244	=back
		245
		246	=head1 FUNCTIONS
		247
		248	=over 4
		249
		250	=cut
		251
92	package AnyEvent::ProcessPool;	252	package AnyEvent::Fork;
93		253
94	use common::sense;	254	use common::sense;
95		255
96	use Socket ();	256	use Socket ();
97		257
98	use Proc::FastSpawn;
99	use AnyEvent;	258	use AnyEvent;
100	use AnyEvent::ProcessPool::Util;
101	use AnyEvent::Util ();	259	use AnyEvent::Util ();
102		260
103	BEGIN {	261	use IO::FDPass;
104	# require Exporter;
105	}
106		262
		263	our $VERSION = 0.2;
		264
		265	our $PERL; # the path to the perl interpreter, deduced with various forms of magic
		266
107	=item my $pool = new AnyEvent::ProcessPool key => value...	267	=item my $pool = new AnyEvent::Fork key => value...
108		268
109	Create a new process pool. The following named parameters are supported:	269	Create a new process pool. The following named parameters are supported:
110		270
111	=over 4	271	=over 4
112		272
113	=back	273	=back
114		274
115	=cut	275	=cut
116		276
		277	# the early fork template process
		278	our $EARLY;
		279
117	# the template process	280	# the empty template process
118	our $template;	281	our $TEMPLATE;
119		282
120	sub _queue {	283	sub _cmd {
		284	my $self = shift;
		285
		286	#TODO: maybe append the packet to any existing string command already in the queue
		287
		288	# ideally, we would want to use "a (w/a)*" as format string, but perl versions
		289	# from at least 5.8.9 to 5.16.3 are all buggy and can't unpack it.
		290	push @{ $self->[2] }, pack "L/a", pack "(w/a)*", @_;
		291
		292	$self->[3] ||= AE::io $self->[1], 1, sub {
		293	# send the next "thing" in the queue - either a reference to an fh,
		294	# or a plain string.
		295
		296	if (ref $self->[2][0]) {
		297	# send fh
		298	IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }
		299	and shift @{ $self->[2] };
		300
		301	} else {
		302	# send string
		303	my $len = syswrite $self->[1], $self->[2][0]
		304	or do { undef $self->[3]; die "AnyEvent::Fork: command write failure: $!" };
		305
		306	substr $self->[2][0], 0, $len, "";
		307	shift @{ $self->[2] } unless length $self->[2][0];
		308	}
		309
		310	unless (@{ $self->[2] }) {
		311	undef $self->[3];
		312	# invoke run callback
		313	$self->[0]->($self->[1]) if $self->[0];
		314	}
		315	};
		316
		317	() # make sure we don't leak the watcher
		318	}
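
The framing that C<_cmd> applies to string commands can be tried in isolation with core C<pack>/C<unpack>. This is a minimal sketch: C<encode_cmd> and C<decode_cmd> are hypothetical helper names, and the decoder is only an assumption about how a receiver could parse such a packet, not the code used by AnyEvent::Fork::Serve.

```perl
use strict;
use warnings;

# hypothetical helper: frame a command the way _cmd does - every field is
# stored as BER-encoded length + bytes, and the resulting payload is
# prefixed with its length as a native 32 bit unsigned integer
sub encode_cmd {
   pack "L/a", pack "(w/a)*", @_
}

# hypothetical helper (an assumption, not the module's server code):
# strip the outer length prefix, then split the payload into its fields
sub decode_cmd {
   my ($packet) = @_;

   my ($payload) = unpack "L/a", $packet;

   unpack "(w/a)*", $payload
}

my $packet = encode_cmd (e => 'require "$_.pm" for @_', "Some/Module");
my @fields = decode_cmd ($packet);
```

Because every field carries its own length, arguments can be arbitrary octet strings, including empty ones.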
		319
		320	sub _new {
121	my ($pid, $fh) = @_;	321	my ($self, $fh) = @_;
122		322
123	[	323	AnyEvent::Util::fh_nonblocking $fh, 1;
124	$pid,	324
		325	$self = bless [
		326	undef, # run callback
125	$fh,	327	$fh,
126	[],	328	[], # write queue - strings or fd's
127	undef	329	undef, # AE watcher
128	]	330	], $self;
129	}
130		331
131	sub queue_cmd {	332	$self
132	my ($queue, $cmd) = @_;
133
134	push @{ $queue->[2] }, pack "N/a", $cmd;
135
136	$queue->[3] ||= AE::io $queue->[1], 1, sub {
137	warn "oopl0 ", scalar @{ $queue->[2] };
138	if (ref $queue->[2][0]) {
139	warn "oopla2\n";#d#
140	AnyEvent::ProcessPool::Util::fd_send fileno $queue->[1], fileno ${ $queue->[2][0] }
141	and shift @{ $queue->[2] };
142	} else {
143	warn "write ", length $queue->[2][0];#d#
144	my $len = syswrite $queue->[1], $queue->[2][0]
145	or die "AnyEvent::ProcessPool::queue write failure: $!";
146	substr $queue->[2][0], 0, $len, "";
147	shift @{ $queue->[2] } unless length $queue->[2][0];
148	}
149
150	undef $queue->[3] unless @{ $queue->[2] };
151	warn "oopl3 ", scalar @{ $queue->[2] };
152	warn "oopl4 $queue->[3]\n";#d#
153	};
154	}	333	}
155		334
156	sub run_template {	335	# fork template from current process, used by AnyEvent::Fork::Early/Template
157	return if $template;	336	sub _new_fork {
158
159	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;	337	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
160	AnyEvent::Util::fh_nonblocking $fh, 1;	338	my $parent = $$;
161	fd_inherit fileno $slave;
162		339
163	my %env = %ENV;	340	my $pid = fork;
164	$env{PERL5LIB} = join ":", grep !ref, @INC;
165		341
166	my $pid = spawn	342	if ($pid eq 0) {
167	$^X,	343	require AnyEvent::Fork::Serve;
168	["perl", "-MAnyEvent::ProcessPool::Serve", "-e", "AnyEvent::ProcessPool::Serve::me", fileno $slave],	344	$AnyEvent::Fork::Serve::OWNER = $parent;
169	[map "$_=$env{$_}", keys %env],	345	close $fh;
170	or die "unable to spawn AnyEvent::ProcessPool server: $!";	346	$0 = "$_[1] of $parent";
		347	$SIG{CHLD} = 'IGNORE';
		348	AnyEvent::Fork::Serve::serve ($slave);
		349	exit 0;
		350	} elsif (!$pid) {
		351	die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
		352	}
171		353
172	close $slave;	354	AnyEvent::Fork->_new ($fh)
173
174	$template = _queue $pid, $fh;
175
176	my ($a, $b) = AnyEvent::Util::portable_socketpair;
177
178	queue_cmd $template, "Iabc";
179	push @{ $template->[2] }, \$b;
180
181	use Coro::AnyEvent; Coro::AnyEvent::sleep 1;
182	}	355	}
		356
		357	=item my $proc = new AnyEvent::Fork
		358
		359	Creates a new "empty" perl interpreter process and returns its process
		360	object for further manipulation.
		361
		362	The new process is forked from a template process that is kept around
		363	for this purpose. When it doesn't exist yet, it is created by a call to
		364	C<new_exec> and kept around for future calls.
		365
		366	When the process object is destroyed, it will release the file handle
		367	that connects it with the new process. When the new process has not yet
		368	called C<run>, then the process will exit. Otherwise, what happens depends
		369	entirely on the code that is executed.
		370
		371	=cut
183		372
184	sub new {	373	sub new {
185	my $class = shift;	374	my $class = shift;
186		375
187	my $self = bless {	376	$TEMPLATE ||= $class->new_exec;
188	@_	377	$TEMPLATE->fork
189	}, $class;	378	}
190		379
191	run_template;	380	=item $new_proc = $proc->fork
		381
		382	Forks C<$proc>, creating a new process, and returns the process object
		383	of the new process.
		384
		385	If any of the C<send_> functions have been called before fork, then they
		386	will be cloned in the child. For example, in a pre-forked server, you
		387	might C<send_fh> the listening socket into the template process, and then
		388	keep calling C<fork> and C<run>.
		389
		390	=cut
		391
		392	sub fork {
		393	my ($self) = @_;
		394
		395	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
		396
		397	$self->send_fh ($slave);
		398	$self->_cmd ("f");
		399
		400	AnyEvent::Fork->_new ($fh)
		401	}
		402
		403	=item my $proc = new_exec AnyEvent::Fork
		404
		405	Creates a new "empty" perl interpreter process and returns its process
		406	object for further manipulation.
		407
		408	Unlike the C<new> method, this method I<always> spawns a new perl process
		409	(except in some cases, see L<AnyEvent::Fork::Early> for details). This
		410	reduces the amount of memory sharing that is possible, and is also slower.
		411
		412	You should use C<new> whenever possible, except when having a template
		413	process around is unacceptable.
		414
		415	The path to the perl interpreter is divined using various methods - first
		416	C<$^X> is investigated to see if the path ends with something that sounds
		417	as if it were the perl interpreter. Failing this, the module falls back to
		418	using C<$Config::Config{perlpath}>.
		419
		420	=cut
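
The "sounds like perl" check described for C<new_exec> can be exercised on its own. The sketch below wraps the same two conditions (absolute path unless on win32, basename resembling a perl binary) in a hypothetical helper called C<looks_like_perl>; passing the OS name as a parameter is an assumption made only to keep the example testable.

```perl
use strict;
use warnings;

# hypothetical helper: true if $path is acceptable as interpreter path,
# using the same test new_exec applies to $^X
sub looks_like_perl {
   my ($path, $os) = @_;

   # the path must be absolute (taken for granted on win32) ...
   return ($os eq "MSWin32" || $path =~ m%^/%)
      # ... and must end in perl, perl5.16.3, perl.exe and so on
      && $path =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
      ? 1 : 0;
}

print "$_: ", looks_like_perl ($_, "linux"), "\n"
   for "/usr/bin/perl", "/usr/local/bin/perl5.16.3", "bin/perl", "/opt/app/bin/myapp";
```

When the check fails (for example inside a packaged executable), falling back to C<$Config::Config{perlpath}> is what the documentation above describes.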
		421
		422	sub new_exec {
		423	my ($self) = @_;
		424
		425	return $EARLY->fork
		426	if $EARLY;
		427
		428	# first find path of perl
		429	my $perl = $;
		430
		431	# first we try $^X, but the path must be absolute (always on win32), and end in sth.
		432	# that looks like perl. this obviously only works for posix and win32
		433	unless (
		434	($^O eq "MSWin32" \|\| $perl =~ m%^/%)
		435	&& $perl =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
		436	) {
		437	# if it doesn't look perlish enough, try Config
		438	require Config;
		439	$perl = $Config::Config{perlpath};
		440	$perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;
		441	}
		442
		443	require Proc::FastSpawn;
		444
		445	my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
		446	Proc::FastSpawn::fd_inherit (fileno $slave);
		447
		448	# new fh's should always be set cloexec (due to $^F),
		449	# but hey, not on win32, so we always clear the inherit flag.
		450	Proc::FastSpawn::fd_inherit (fileno $fh, 0);
		451
		452	# quick. also doesn't work in win32. of course. what did you expect
		453	#local $ENV{PERL5LIB} = join ":", grep !ref, @INC;
		454	my %env = %ENV;
		455	$env{PERL5LIB} = join +($^O eq "MSWin32" ? ";" : ":"), grep !ref, @INC;
		456
		457	Proc::FastSpawn::spawn (
		458	$perl,
		459	["perl", "-MAnyEvent::Fork::Serve", "-e", "AnyEvent::Fork::Serve::me", fileno $slave, $$],
		460	[map "$_=$env{$_}", keys %env],
		461	) or die "unable to spawn AnyEvent::Fork server: $!";
		462
		463	$self->_new ($fh)
		464	}
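
The environment handed to the spawned interpreter is worth a closer look: C<@INC> may contain code references (hooks), which cannot be expressed in C<PERL5LIB>, and the path separator differs on win32. A minimal sketch of that step follows, with C<spawn_env> as a hypothetical helper name and the OS name passed in explicitly (an assumption for testability; C<new_exec> itself uses C<$^O>, C<%ENV> and C<@INC> directly).

```perl
use strict;
use warnings;

# hypothetical helper: build the "KEY=value" list for the new process
sub spawn_env {
   my ($env, $inc, $os) = @_;

   my %env = %$env; # copy, so the caller's environment stays untouched

   # only plain directory names survive, @INC hooks are dropped
   my $sep = $os eq "MSWin32" ? ";" : ":";
   $env{PERL5LIB} = join $sep, grep !ref, @$inc;

   # sorted here only to get a stable order for the example
   [map "$_=$env{$_}", sort keys %env]
}

my $env = spawn_env ({ HOME => "/root" }, ["/a", sub { }, "/b"], "linux");
print "$_\n" for @$env;
```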
		465
		466	=item $proc = $proc->eval ($perlcode, @args)
		467
		468	Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to
		469	the strings specified by C<@args>.
		470
		471	This call is meant to do any custom initialisation that might be required
		472	(for example, the C<require> method uses it). It's not supposed to be used
		473	to completely take over the process, use C<run> for that.
		474
		475	The code will usually be executed after this call returns, and there is no
		476	way to pass anything back to the calling process. Any evaluation errors
		477	will be reported to stderr and cause the process to exit.
		478
		479	Returns the process object for easy chaining of method calls.
		480
		481	=cut
		482
		483	sub eval {
		484	my ($self, $code, @args) = @_;
		485
		486	$self->_cmd (e => $code, @args);
192		487
193	$self	488	$self
194	}	489	}
195		490
		491	=item $proc = $proc->require ($module, ...)
		492
		493	Tries to load the given module(s) into the process
		494
		495	Returns the process object for easy chaining of method calls.
		496
		497	=cut
		498
		499	sub require {
		500	my ($self, @modules) = @_;
		501
		502	s%::%/%g for @modules;
		503	$self->eval ('require "$_.pm" for @_', @modules);
		504
		505	$self
		506	}
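
The C<require> method relies on the usual mapping from module names to file names before handing them to C<eval>. As a small illustration, here is that mapping in a hypothetical helper named C<module_files>; it only shows the string conversion and does not load anything.

```perl
use strict;
use warnings;

# hypothetical helper: turn module names into the file names that
# 'require "$_.pm"' would be asked to load in the new process
sub module_files {
   my @modules = @_; # copy, so the caller's strings are not modified

   s%::%/%g for @modules;

   map "$_.pm", @modules
}

print join (", ", module_files ("Some::Stuff", "My::Server")), "\n";
```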
		507
		508	=item $proc = $proc->send_fh ($handle, ...)
		509
		510	Send one or more file handles (I<not> file descriptors) to the process,
		511	to prepare a call to C<run>.
		512
		513	The process object keeps a reference to the handles until this is done,
		514	so you must not explicitly close the handles. This is most easily
		515	accomplished by simply not storing the file handles anywhere after passing
		516	them to this method.
		517
		518	Returns the process object for easy chaining of method calls.
		519
		520	Example: pass a file handle to a process, and release it without
		521	closing. It will be closed automatically when it is no longer used.
		522
		523	$proc->send_fh ($my_fh);
		524	undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
		525
		526	=cut
		527
		528	sub send_fh {
		529	my ($self, @fh) = @_;
		530
		531	for my $fh (@fh) {
		532	$self->_cmd ("h");
		533	push @{ $self->[2] }, \$fh;
		534	}
		535
		536	$self
		537	}
		538
		539	=item $proc = $proc->send_arg ($string, ...)
		540
		541	Send one or more argument strings to the process, to prepare a call to
		542	C<run>. The strings can be any octet string.
		543
		544	Returns the process object for easy chaining of method calls.
		545
		546	=cut
		547
		548	sub send_arg {
		549	my ($self, @arg) = @_;
		550
		551	$self->_cmd (a => @arg);
		552
		553	$self
		554	}
		555
		556	=item $proc->run ($func, $cb->($fh))
		557
		558	Enter the function specified by the fully qualified name in C<$func> in
		559	the process. The function is called with the communication socket as first
		560	argument, followed by all file handles and string arguments sent earlier
		561	via C<send_fh> and C<send_arg> methods, in the order they were called.
		562
		563	If the called function returns, the process exits.
		564
		565	Preparing the process can take time - when the process is ready, the
		566	callback is invoked with the local communications socket as argument.
		567
		568	The process object becomes unusable on return from this function.
		569
		570	If the communication socket isn't used, it should be closed on both sides,
		571	to save on kernel memory.
		572
		573	The socket is non-blocking in the parent, and blocking in the newly
		574	created process. The close-on-exec flag is set on both. Even if not used
		575	otherwise, the socket can be a good indicator for the existence of the
		576	process - if the other process exits, you get a readable event on it,
		577	because exiting the process closes the socket (if it didn't create any
		578	children using fork).
		579
		580	Example: create a template for a process pool, pass a few strings, some
		581	file handles, then fork, pass one more string, and run some code.
		582
		583	my $pool = AnyEvent::Fork
		584	->new
		585	->send_arg ("str1", "str2")
		586	->send_fh ($fh1, $fh2);
		587
		588	for my $id (1..2) {
		589	$pool
		590	->fork
		591	->send_arg ("str3")
		592	->run ("Some::function", sub {
		593	my ($fh) = @_;
		594
		595	# fh is nonblocking, but we trust that the OS can accept these
		596	# few octets anyway.
		597	syswrite $fh, "hi #$id\n";
		598
		599	# $fh is being closed here, as we don't store it anywhere
		600	});
		601	}
		602
		603	# Some::function might look like this - all parameters passed before fork
		604	# and after will be passed, in order, after the communications socket.
		605	sub Some::function {
		606	my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
		607
		608	print scalar <$fh>; # prints "hi #1\n" and "hi #2\n"
		609	}
		610
		611	=cut
		612
		613	sub run {
		614	my ($self, $func, $cb) = @_;
		615
		616	$self->[0] = $cb;
		617	$self->_cmd (r => $func);
		618	}
		619
196	=back	620	=back
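
The remark that the communication socket doubles as a liveness indicator can be demonstrated with core Perl alone. The following is a sketch under stated assumptions: it uses a plain C<socketpair> and C<fork> rather than AnyEvent::Fork, blocks in C<sysread> instead of using an event loop watcher, and C<wait_for_child_exit> is a hypothetical name.

```perl
use strict;
use warnings;
use Socket;
use POSIX ();

# hypothetical helper: returns what sysread reports on the parent's end
# once the child has gone away - 0 means end of file
sub wait_for_child_exit {
   socketpair (my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
      or die "socketpair: $!";

   my $pid = fork;
   die "fork: $!" unless defined $pid;

   if ($pid == 0) {
      close $parent_fh;
      # a real worker would do its job here. _exit skips destructors
      # and END blocks, and the kernel closes $child_fh for us
      POSIX::_exit (0);
   }

   # the parent must drop its copy of the child's end, or EOF never arrives
   close $child_fh;

   my $n = sysread $parent_fh, my $buf, 1; # blocks until the child is gone
   waitpid $pid, 0;

   $n
}

my $n = wait_for_child_exit ();
print "child exited\n" if defined $n && $n == 0;
```

In an event-driven program the blocking C<sysread> would be replaced by a read watcher on the handle passed to the C<run> callback.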
		621
		622	=head1 PERFORMANCE
		623
		624	Now for some unscientific benchmark numbers (all done on an amd64
		625	GNU/Linux box). These are intended to give you an idea of the relative
		626	performance you can expect.
		627
		628	OK, so, I ran a simple benchmark that creates a socket pair, forks, calls
		629	exit in the child and waits for the socket to close in the parent. I did
		630	load AnyEvent, EV and AnyEvent::Fork, for a total process size of 6312kB.
		631
		632	2079 new processes per second, using socketpair + fork manually
		633
		634	Then I did the same thing, but instead of calling fork, I called
		635	AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
		636	socket from the child to close on exit. This does the same thing as manual
		637	socket pair + fork, except that what is forked is the template process
		638	(2440kB), and the socket needs to be passed to the server at the other end
		639	of the socket first.
		640
		641	2307 new processes per second, using AnyEvent::Fork->new
		642
		643	And finally, using C<new_exec> instead of C<new>, using vforks+execs to exec
		644	a new perl interpreter and compile the small server each time, I get:
		645
		646	479 vfork+execs per second, using AnyEvent::Fork->new_exec
		647
		648	So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even
		649	though it uses the same operations, but adds a lot of overhead?
		650
		651	The difference is simply the process size: forking the 6MB process takes
		652	so much longer than forking the 2.5MB template process that the overhead
		653	introduced is canceled out.
		654
		655	If the benchmark process grows, the normal fork becomes even slower:
		656
		657	1340 new processes, manual fork in a 20MB process
		658	731 new processes, manual fork in a 200MB process
		659	235 new processes, manual fork in a 2000MB process
		660
		661	What that means (to me) is that I can use this module without having a
		662	very bad conscience because of the extra overhead required to start new
		663	processes.
		664
		665	=head1 TYPICAL PROBLEMS
		666
		667	This section lists typical problems that remain. I hope by recognising
		668	them, most can be avoided.
		669
		670	=over 4
		671
		672	=item exit runs destructors
		673
		674	=item "leaked" file descriptors for exec'ed processes
		675
		676	POSIX systems inherit file descriptors by default when exec'ing a new
		677	process. While perl itself laudably sets the close-on-exec flags on new
		678	file handles, most C libraries don't care, and even if all cared, it's
		679	often not possible to set the flag in a race-free manner.
		680
		681	That means some file descriptors can leak through. And since it isn't
		682	possible to know which file descriptors are "good" and "necessary" (or
		683	even to know which file descriptors are open), there is no good way to
		684	close the ones that might harm.
		685
		686	As an example of what "harm" can be done consider a web server that
		687	accepts connections and afterwards some module uses AnyEvent::Fork for the
		688	first time, causing it to fork and exec a new process, which might inherit
		689	the network socket. When the server closes the socket, it is still open
		690	in the child (which doesn't even know that) and the client might conclude
		691	that the connection is still fine.
		692
		693	For the main program, there are multiple remedies available -
		694	L<AnyEvent::Fork::Early> is one, creating a process early and not using
		695	C<new_exec> is another, as in both cases, the first process can be exec'ed
		696	well before many random file descriptors are open.
		697
		698	In general, the solution for these kinds of problems is to fix the
		699	libraries or the code that leaks those file descriptors.
		700
		701	Fortunately, most of these leaked descriptors do no harm, other than
		702	sitting on some resources.
		703
		704	=item "leaked" file descriptors for fork'ed processes
		705
		706	Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
		707	which closes file descriptors not marked for being inherited.
		708
		709	However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
		710	a way to create these processes by forking, and this leaks more file
		711	descriptors than exec'ing them, as there is no way to mark descriptors as
		712	"close on fork".
		713
		714	Examples are modules like L<EV>, L<IO::AIO> or L<Gtk2>. The first two create
		715	pipes for internal uses, and L<Gtk2> might open a connection to the X
		716	server. L<EV> and L<IO::AIO> can deal with fork, but Gtk2 might have
		717	trouble with a fork.
		718
		719	The solution is to either not load these modules before use'ing
		720	L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
		721	initialising them, for example, by calling C<init Gtk2> manually.
		722
		723	=back
		724
		725	=head1 PORTABILITY NOTES
		726
		727	Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a nop,
		728	and ::Template is not going to work), and it cost a lot of blood and sweat
		729	to make it so, mostly due to the bloody broken perl that nobody seems to
		730	care about. The fork emulation is a bad joke - I have yet to see something
		731	useful that you can do with it without running into memory corruption
		732	issues or other braindamage. Hrrrr.
		733
		734	Cygwin perl is not supported at the moment, as it should implement fd
		735	passing, but doesn't, and rolling my own is hard, as cygwin doesn't
		736	support enough functionality to do it.
		737
		738	=head1 SEE ALSO
		739
		740	L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),
		741	L<AnyEvent::Fork::Template> (to create a process by forking the main
		742	program at a convenient time).
197		743
198	=head1 AUTHOR	744	=head1 AUTHOR
199		745
200	Marc Lehmann <schmorp@schmorp.de>	746	Marc Lehmann <schmorp@schmorp.de>
201	http://home.schmorp.de/	747	http://home.schmorp.de/
