Comparing AnyEvent-Fork/README (file contents):
Revision 1.1 by root, Sun Mar 31 03:21:27 2013 UTC vs.
Revision 1.12 by root, Wed Jan 26 16:44:16 2022 UTC

		1	NAME
		2	AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
		3
		4	SYNOPSIS
		5	use AnyEvent::Fork;
		6
		7	AnyEvent::Fork
		8	->new
		9	->require ("MyModule")
		10	->run ("MyModule::server", my $cv = AE::cv);
		11
		12	my $fh = $cv->recv;
		13
		14	DESCRIPTION
		15	This module allows you to create new processes, without actually forking
		16	them from your current process (avoiding the problems of forking), but
		17	preserving most of the advantages of fork.
		18
		19	It can be used to create new worker processes or new independent
		20	subprocesses for short- and long-running jobs, process pools (e.g. for
		21	use in pre-forked servers) but also to spawn new external processes
		22	(such as CGI scripts from a web server), which can be faster (and more
		23	well behaved) than using fork+exec in big processes.
		24
		25	Special care has been taken to make this module useful from other
		26	modules, while still supporting specialised environments such as
		27	App::Staticperl or PAR::Packer.
		28
		29	WHAT THIS MODULE IS NOT
		30	This module only creates processes and lets you pass file handles and
		31	strings to it, and run perl code. It does not implement any kind of RPC
		32	- there is no back channel from the process back to you, and there is no
		33	RPC or message passing going on.
		34
		35	If you need some form of RPC, you could use the AnyEvent::Fork::RPC
		36	companion module, which adds simple RPC/job queueing to a process
		37	created by this module.
		38
		39	And if you need some automatic process pool management on top of
		40	AnyEvent::Fork::RPC, you can look at the AnyEvent::Fork::Pool companion
		41	module.
		42
		43	Or you can implement it yourself in whatever way you like: use some
		44	message-passing module such as AnyEvent::MP, some pipe such as
		45	AnyEvent::ZeroMQ, use AnyEvent::Handle on both sides to send e.g. JSON
		46	or Storable messages, and so on.
		47
		48	COMPARISON TO OTHER MODULES
		49	There is an abundance of modules on CPAN that do "something fork", such
		50	as Parallel::ForkManager, AnyEvent::ForkManager, AnyEvent::Worker or
		51	AnyEvent::Subprocess. There are modules that implement their own process
		52	management, such as AnyEvent::DBI.
		53
		54	The problems that all these modules try to solve are real, however, none
		55	of them (from what I have seen) tackle the very real problems of
		56	unwanted memory sharing, efficiency or not being able to use event
		57	processing, GUI toolkits or similar modules in the processes they
		58	create.
		59
		60	This module doesn't try to replace any of them - instead it tries to
		61	solve the problem of creating processes with a minimum of fuss and
		62	overhead (and also luxury). Ideally, most of these would use
		63	AnyEvent::Fork internally, except they were written before AnyEvent::Fork
		64	was available, so obviously had to roll their own.
		65
		66	PROBLEM STATEMENT
		67	There are two traditional ways to implement parallel processing on UNIX
		68	like operating systems - fork and process, and fork+exec and process.
		69	They have different advantages and disadvantages that I describe below,
		70	together with how this module tries to mitigate the disadvantages.
		71
		72	Forking from a big process can be very slow.
		73	A 5GB process needs 0.05s to fork on my 3.6GHz amd64 GNU/Linux box.
		74	This overhead is often shared with exec (because you have to fork
		75	first), but in some circumstances (e.g. when vfork is used),
		76	fork+exec can be much faster.
		77
		78	This module can help here by telling a small(er) helper process to
		79	fork, which is faster than forking the main process, and also uses
		80	vfork where possible. This gives the speed of vfork, with the
		81	flexibility of fork.
		82
		83	Forking usually creates a copy-on-write copy of the parent process.
		84	For example, modules or data files that are loaded will not use
		85	additional memory after a fork. Exec'ing a new process, in contrast,
		86	means modules and data files might need to be loaded again, at extra
		87	CPU and memory cost.
		88
		89	But when forking, you still create a copy of your data structures -
		90	if the program frees them and replaces them by new data, the child
		91	processes will retain the old version even if it isn't used, which
		92	can suddenly and unexpectedly increase memory usage when freeing
		93	memory.
		94
		95	For example, Gtk2::CV is an image viewer optimised for large
		96	directories (millions of pictures). It also forks subprocesses for
		97	thumbnail generation, which inherit the data structure that stores
		98	all file information. If the user changes the directory, it gets
		99	freed in the main process, leaving a copy in the thumbnailer
		100	processes. This can lead to many times the memory usage that would
		101	actually be required. The solution is to fork early (and be
		102	unable to dynamically generate more subprocesses or do this from a
		103	module)... or to use AnyEvent::Fork.
		104
		105	There is a trade-off between more sharing with fork (which can be
		106	good or bad), and no sharing with exec.
		107
		108	This module allows the main program to do a controlled fork, and
		109	allows modules to exec processes safely at any time. When creating a
		110	custom process pool you can take advantage of data sharing via fork
		111	without risking sharing large dynamic data structures that will
		112	blow up child memory usage.
		113
		114	In other words, this module puts you in control over what is being
		115	shared and what isn't, at all times.
		116
		117	Exec'ing a new perl process might be difficult.
		118	For example, it is not easy to find the correct path to the perl
		119	interpreter - $^X might not be a perl interpreter at all. Worse,
		120	there might not even be a perl binary installed on the system.
		121
		122	This module tries hard to identify the correct path to the perl
		123	interpreter. With a cooperative main program, exec'ing the
		124	interpreter might not even be necessary, but even without help from
		125	the main program, it will still work when used from a module.
		126
		127	Exec'ing a new perl process might be slow, as all necessary modules have
		128	to be loaded from disk again, with no guarantees of success.
		129	Long running processes might run into problems when perl is upgraded
		130	and modules are no longer loadable because they refer to a different
		131	perl version, or parts of a distribution are newer than the ones
		132	already loaded.
		133
		134	This module supports creating pre-initialised perl processes to be
		135	used as a template for new processes at a later time, e.g. for use
		136	in a process pool.
		137
		138	Forking might be impossible when a program is running.
		139	For example, POSIX makes it almost impossible to fork from a
		140	multi-threaded program while doing anything useful in the child - in
		141	fact, if your perl program uses POSIX threads (even indirectly via
		142	e.g. IO::AIO or threads), you cannot call fork on the perl level
		143	anymore without risking memory corruption or worse on a number of
		144	operating systems.
		145
		146	This module can safely fork helper processes at any time, by calling
		147	fork+exec in C, in a POSIX-compatible way (via Proc::FastSpawn).
		148
		149	Parallel processing with fork might be inconvenient or difficult to
		150	implement. Modules might not work in both parent and child.
		151	For example, when a program uses an event loop and creates watchers
		152	it becomes very hard to use the event loop from a child program, as
		153	the watchers already exist but are only meaningful in the parent.
		154	Worse, a module might want to use such a module, not knowing whether
		155	another module or the main program also does, leading to problems.
		156
		157	Apart from event loops, graphical toolkits also commonly fall into
		158	the "unsafe module" category, or just about anything that
		159	communicates with the external world, such as network libraries and
		160	file I/O modules, which usually don't like being copied and then
		161	allowed to continue in two processes.
		162
		163	With this module only the main program is allowed to create new
		164	processes by forking (because only the main program can know when it
		165	is still safe to do so) - all other processes are created via
		166	fork+exec, which makes it possible to use modules such as event
		167	loops or window interfaces safely.
		168
		169	EXAMPLES
		170	This is where the wall of text ends and code speaks.
		171
		172	Create a single new process, tell it to run your worker function.
		173	AnyEvent::Fork
		174	->new
		175	->require ("MyModule")
		176	->run ("MyModule::worker", sub {
		177	my ($master_filehandle) = @_;
		178
		179	# now $master_filehandle is connected to the
		180	# $slave_filehandle in the new process.
		181	});
		182
		183	"MyModule" might look like this:
		184
		185	package MyModule;
		186
		187	sub worker {
		188	my ($slave_filehandle) = @_;
		189
		190	# now $slave_filehandle is connected to the $master_filehandle
		191	# in the original process. have fun!
		192	}
		193
		194	Create a pool of server processes all accepting on the same socket.
		195	# create listener socket
		196	my $listener = ...;
		197
		198	# create a pool template, initialise it and give it the socket
		199	my $pool = AnyEvent::Fork
		200	->new
		201	->require ("Some::Stuff", "My::Server")
		202	->send_fh ($listener);
		203
		204	# now create 10 identical workers
		205	for my $id (1..10) {
		206	$pool
		207	->fork
		208	->send_arg ($id)
		209	->run ("My::Server::run");
		210	}
		211
		212	# now do other things - maybe use the filehandle provided by run
		213	# to wait for the processes to die. or whatever.
		214
		215	"My::Server" might look like this:
		216
		217	package My::Server;
		218
		219	sub run {
		220	my ($slave, $listener, $id) = @_;
		221
		222	close $slave; # we do not use the socket, so close it to save resources
		223
		224	# we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
		225	# or anything we usually couldn't do in a process forked normally.
		226	while (my $socket = $listener->accept) {
		227	# do sth. with new socket
		228	}
		229	}
		230
		231	use AnyEvent::Fork as a faster fork+exec
		232	This runs "/bin/echo hi", with standard output redirected to /tmp/log
		233	and standard error redirected to the communications socket. It is
		234	usually faster than fork+exec, but still lets you prepare the
		235	environment.
		236
		237	open my $output, ">/tmp/log" or die "$!";
		238
		239	AnyEvent::Fork
		240	->new
		241	->eval ('
		242	# compile a helper function for later use
		243	sub run {
		244	my ($fh, $output, @cmd) = @_;
		245
		246	# perl will clear close-on-exec on STDOUT/STDERR
		247	open STDOUT, ">&", $output or die;
		248	open STDERR, ">&", $fh or die;
		249
		250	exec @cmd;
		251	}
		252	')
		253	->send_fh ($output)
		254	->send_arg ("/bin/echo", "hi")
		255	->run ("run", my $cv = AE::cv);
		256
		257	my $stderr = $cv->recv;
		258
		259	For stingy users: put the worker code into a "DATA" section.
		260	When you want to be stingy with files, you can put your code into the
		261	"DATA" section of your module (or program):
		262
		263	use AnyEvent::Fork;
		264
		265	AnyEvent::Fork
		266	->new
		267	->eval (do { local $/; <DATA> })
		268	->run ("doit", sub { ... });
		269
		270	__DATA__
		271
		272	sub doit {
		273	... do something!
		274	}
		275
		276	For stingy standalone programs: do not rely on external files at
		277	all.
		278	For single-file scripts it can be inconvenient to rely on external files
		279	- even when using a "DATA" section, you still need to "exec" an external
		280	perl interpreter, which might not be available when using
		281	App::Staticperl, Urlader or PAR::Packer for example.
		282
		283	Two modules help here - AnyEvent::Fork::Early forks a template process
		284	for all further calls to "new_exec", and AnyEvent::Fork::Template forks
		285	the main program as a template process.
		286
		287	Here is what your main program should look like:
		288
		289	#! perl
		290
		291	# optional, as the very first thing.
		292	# in case modules want to create their own processes.
		293	use AnyEvent::Fork::Early;
		294
		295	# next, load all modules you need in your template process
		296	use Example::My::Module;
		297	use Example::Whatever;
		298
		299	# next, put your run function definition and anything else you
		300	# need, but do not use code outside of BEGIN blocks.
		301	sub worker_run {
		302	my ($fh, @args) = @_;
		303	...
		304	}
		305
		306	# now preserve everything so far as AnyEvent::Fork object
		307	# in $TEMPLATE.
		308	use AnyEvent::Fork::Template;
		309
		310	# do not put code outside of BEGIN blocks until here
		311
		312	# now use the $TEMPLATE process in any way you like
		313
		314	# for example: create 10 worker processes
		315	my @worker;
		316	my $cv = AE::cv;
		317	for (1..10) {
		318	$cv->begin;
		319	$TEMPLATE->fork->send_arg ($_)->run ("worker_run", sub {
		320	push @worker, shift;
		321	$cv->end;
		322	});
		323	}
		324	$cv->recv;
		325
		326	CONCEPTS
		327	This module can create new processes either by executing a new perl
		328	process, or by forking from an existing "template" process.
		329
		330	All these processes are called "child processes" (whether they are
		331	direct children or not), while the process that manages them is called
		332	the "parent process".
		333
		334	Each such process comes with its own file handle that can be used to
		335	communicate with it (it's actually a socket - one end in the new
		336	process, one end in the main process), and among the things you can do
		337	in it are load modules, fork new processes, send file handles to it, and
		338	execute functions.
		339
		340	There are multiple ways to create additional processes to execute some
		341	jobs:
		342
		343	fork a new process from the "default" template process, load code, run
		344	it
		345	This module has a "default" template process which it executes when
		346	it is needed the first time. Forking from this process shares the
		347	memory used for the perl interpreter with the new process, but
		348	loading modules takes time, and the memory is not shared with
		349	anything else.
		350
		351	This is ideal for when you only need one extra process of a kind,
		352	with the option of starting and stopping it on demand.
		353
		354	Example:
		355
		356	AnyEvent::Fork
		357	->new
		358	->require ("Some::Module")
		359	->run ("Some::Module::run", sub {
		360	my ($fork_fh) = @_;
		361	});
		362
		363	fork a new template process, load code, then fork processes off of it
		364	and run the code
		365	When you need to have a bunch of processes that all execute the same
		366	(or very similar) tasks, then a good way is to create a new template
		367	process for them, loading all the modules you need, and then create
		368	your worker processes from this new template process.
		369
		370	This way, all code (and data structures) that can be shared (e.g.
		371	the modules you loaded) is shared between the processes, and each
		372	new process consumes relatively little memory of its own.
		373
		374	The disadvantage of this approach is that you need to create a
		375	template process for the sole purpose of forking new processes from
		376	it, but if you only need a fixed number of processes you can create
		377	them, and then destroy the template process.
		378
		379	Example:
		380
		381	my $template = AnyEvent::Fork->new->require ("Some::Module");
		382
		383	for (1..10) {
		384	$template->fork->run ("Some::Module::run", sub {
		385	my ($fork_fh) = @_;
		386	});
		387	}
		388
		389	# at this point, you can keep $template around to fork new processes
		390	# later, or you can destroy it, which causes it to vanish.
		391
		392	execute a new perl interpreter, load some code, run it
		393	This is relatively slow, and doesn't allow you to share memory
		394	between multiple processes.
		395
		396	The only advantage is that you don't have to have a template process
		397	hanging around all the time to fork off some new processes, which
		398	might be an advantage when there are long time spans where no extra
		399	processes are needed.
		400
		401	Example:
		402
		403	AnyEvent::Fork
		404	->new_exec
		405	->require ("Some::Module")
		406	->run ("Some::Module::run", sub {
		407	my ($fork_fh) = @_;
		408	});
		409
		410	THE "AnyEvent::Fork" CLASS
		411	This module exports nothing, and only implements a single class -
		412	"AnyEvent::Fork".
		413
		414	There are two class constructors that both create new processes - "new"
		415	and "new_exec". The "fork" method creates a new process by forking an
		416	existing one and could be considered a third constructor.
		417
		418	Most of the remaining methods deal with preparing the new process, by
		419	loading code, evaluating code and sending data to the new process. They
		420	usually return the process object, so you can chain method calls.
		421
		422	If a process object is destroyed before calling its "run" method, then
		423	the process simply exits. After "run" is called, all responsibility is
		424	passed to the specified function.
		425
		426	As long as there is any outstanding work to be done, process objects
		427	resist being destroyed, so there is no reason to store them unless you
		428	need them later - configure and forget works just fine.
		429
		430	my $proc = new AnyEvent::Fork
		431	Creates a new "empty" perl interpreter process and returns its
		432	process object for further manipulation.
		433
		434	The new process is forked from a template process that is kept
		435	around for this purpose. When it doesn't exist yet, it is created by
		436	a call to "new_exec" first and then stays around for future calls.
		437
		438	$new_proc = $proc->fork
		439	Forks $proc, creating a new process, and returns the process object
		440	of the new process.
		441
		442	If any of the "send_" functions have been called before fork, then
		443	they will be cloned in the child. For example, in a pre-forked
		444	server, you might "send_fh" the listening socket into the template
		445	process, and then keep calling "fork" and "run".
		446
		447	my $proc = new_exec AnyEvent::Fork
		448	Creates a new "empty" perl interpreter process and returns its
		449	process object for further manipulation.
		450
		451	Unlike the "new" method, this method always spawns a new perl
		452	process (except in some cases, see AnyEvent::Fork::Early for
		453	details). This reduces the amount of memory sharing that is
		454	possible, and is also slower.
		455
		456	You should use "new" whenever possible, except when having a
		457	template process around is unacceptable.
		458
		459	The path to the perl interpreter is divined using various methods -
		460	first $^X is investigated to see if the path ends with something
		461	that looks as if it were the perl interpreter. Failing this, the
		462	module falls back to using $Config::Config{perlpath}.
		463
		464	The path to perl can also be overridden by setting the global
		465	variable $AnyEvent::Fork::PERL - its value will be used for all
		466	subsequent invocations.
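
The lookup order described above can be sketched in plain Perl. This is only an illustration: the helper name guess_perl_path and the exact pattern are assumptions made for this example, not the code the module actually uses.

```perl
use strict;
use warnings;
use Config ();

# Hypothetical helper (not part of AnyEvent::Fork): accept the candidate
# if its last path component looks like a perl interpreter, otherwise
# fall back to the path recorded when perl was built.
sub guess_perl_path {
    my ($candidate) = @_;

    # assumed pattern: "perl", "perl5.36.0", "perl.exe" and similar
    return $candidate
        if defined $candidate
        && $candidate =~ m{(?:^|[/\\])perl(?:[0-9]+(?:\.[0-9]+)*)?(?:\.exe)?$}i;

    return $Config::Config{perlpath};
}

print guess_perl_path ($^X), "\n";
```

When such guessing is not good enough (for example in a bundled application), setting $AnyEvent::Fork::PERL before the first process is created avoids the guesswork entirely.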
		467
		468	$pid = $proc->pid
		469	Returns the process id of the process *iff it is a direct child of
		470	the process running AnyEvent::Fork*, and "undef" otherwise. As a
		471	general rule (that you cannot rely upon), processes created via
		472	"new_exec", AnyEvent::Fork::Early or AnyEvent::Fork::Template are
		473	direct children, while all other processes are not.
		474
		475	Or in other words, you do not normally have to take care of zombies
		476	for processes created via "new", but when in doubt, or zombies are a
		477	problem, you need to check whether a process is a direct child by
		478	calling this method, and possibly create a child watcher or reap
		479	it manually.
		480
		481	$proc = $proc->eval ($perlcode, @args)
		482	Evaluates the given $perlcode as ... Perl code, while setting @_ to
		483	the strings specified by @args, in the "main" package (so you can
		484	access the args using $_[0] and so on, but not using implicit "shift"
		485	as the latter works on @ARGV).
		486
		487	This call is meant to do any custom initialisation that might be
		488	required (for example, the "require" method uses it). It's not
		489	supposed to be used to completely take over the process, use "run"
		490	for that.
		491
		492	The code will usually be executed after this call returns, and there
		493	is no way to pass anything back to the calling process. Any
		494	evaluation errors will be reported to stderr and cause the process
		495	to exit.
		496
		497	If you want to execute some code (that isn't in a module) to take
		498	over the process, you should compile a function via "eval" first,
		499	and then call it via "run". This also gives you access to any
		500	arguments passed via the "send_xxx" methods, such as file handles.
		501	See the "use AnyEvent::Fork as a faster fork+exec" example to see it
		502	in action.
		503
		504	Returns the process object for easy chaining of method calls.
		505
		506	It's common to want to call an initialisation function with some
		507	arguments. Make sure you actually pass @_ to that function (for
		508	example by using &name syntax), and do not just specify a function
		509	name:
		510
		511	$proc->eval ('&MyModule::init', $string1, $string2);
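
Why the &name syntax matters can be tried out locally, without creating any process. The sketch below only imitates the situation in the child: eval_with_args and the Demo package are made up for this illustration, and the real implementation may differ in detail.

```perl
use strict;
use warnings;

package Demo;

our @seen;

# remembers what it was called with and returns the argument count
sub init { @seen = @_; scalar @_ }

package main;

# Hypothetical stand-in for the child side: the code string is evaluated
# in package main while @_ holds the argument strings.
sub eval_with_args {
    my $code = shift; # what remains in @_ are the "args"

    my $result = eval "package main; $code";
    die $@ if $@;

    $result
}

# &Demo::init (no parentheses) passes the current @_ along ...
my $with_amp    = eval_with_args ('&Demo::init', "one", "two");
# ... while a normal call with an empty list passes nothing
my $without_amp = eval_with_args ('Demo::init()', "one", "two");

print "&Demo::init  saw $with_amp argument(s)\n";
print "Demo::init() saw $without_amp argument(s)\n";
```

The first call sees both strings, the second sees none, which is the mistake the paragraph above warns about.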
		512
		513	$proc = $proc->require ($module, ...)
		514	Tries to load the given module(s) into the process.
		515
		516	Returns the process object for easy chaining of method calls.
		517
		518	$proc = $proc->send_fh ($handle, ...)
		519	Send one or more file handles (not file descriptors) to the
		520	process, to prepare a call to "run".
		521
		522	The process object keeps a reference to the handles until they have
		523	been passed over to the process, so you must not explicitly close
		524	the handles. This is most easily accomplished by simply not storing
		525	the file handles anywhere after passing them to this method - when
		526	AnyEvent::Fork is finished using them, perl will automatically close
		527	them.
		528
		529	Returns the process object for easy chaining of method calls.
		530
		531	Example: pass a file handle to a process, and release it without
		532	closing. It will be closed automatically when it is no longer used.
		533
		534	$proc->send_fh ($my_fh);
		535	undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
		536
		537	$proc = $proc->send_arg ($string, ...)
		538	Send one or more argument strings to the process, to prepare a call
		539	to "run". The strings can be any octet strings.
		540
		541	The protocol is optimised to pass a moderate number of relatively
		542	short strings - while you can pass up to 4GB of data in one go, this
		543	is more meant to pass some ID information or other startup info, not
		544	big chunks of data.
		545
		546	Returns the process object for easy chaining of method calls.
		547
		548	$proc->run ($func, $cb->($fh))
		549	Enter the function specified by the function name in $func in the
		550	process. The function is called with the communication socket as
		551	first argument, followed by all file handles and string arguments
		552	sent earlier via "send_fh" and "send_arg" methods, in the order they
		553	were called.
		554
		555	The process object becomes unusable on return from this function -
		556	any further method calls result in undefined behaviour.
		557
		558	The function name should be fully qualified, but if it isn't, it
		559	will be looked up in the "main" package.
		560
		561	If the called function returns, doesn't exist, or any error occurs,
		562	the process exits.
		563
		564	Preparing the process is done in the background - when all commands
		565	have been sent, the callback is invoked with the local
		566	communications socket as argument. At this point you can start using
		567	the socket in any way you like.
		568
		569	If the communication socket isn't used, it should be closed on both
		570	sides, to save on kernel memory.
		571
		572	The socket is non-blocking in the parent, and blocking in the newly
		573	created process. The close-on-exec flag is set in both.
		574
		575	Even if not used otherwise, the socket can be a good indicator for
		576	the existence of the process - if the other process exits, you get a
		577	readable event on it, because exiting the process closes the socket
		578	(if it didn't create any children using fork).
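
The "socket as process indicator" idea can be demonstrated with core Perl alone. To stay self-contained, the sketch below uses a plain socketpair and fork instead of AnyEvent::Fork, so it only shows the principle: once the peer has exited, the local end becomes readable and reading returns end-of-file. All variable names are chosen for this example.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

socketpair my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC
    or die "socketpair: $!";

my $pid = fork;
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    close $parent_fh;
    exit 0; # exiting closes the child's end of the socket
}

# the parent must close its copy of the child's end,
# otherwise end-of-file would never be signalled
close $child_fh;

my $exited = 0;
my @ready  = IO::Select->new ($parent_fh)->can_read (10);

if (@ready) {
    my $n = sysread $parent_fh, my $buf, 1;
    $exited = 1 if defined $n && $n == 0; # 0 octets read means EOF
}

waitpid $pid, 0;

print $exited ? "peer exited\n" : "peer still alive\n";
```

In an AnyEvent program the same check would typically be done from an I/O watcher (such as AE::io) on the handle passed to the "run" callback, rather than by blocking in select.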
		579
		580	Compatibility with AnyEvent::Fork::Remote
		581	If you want to write code that works with both this module and
		582	AnyEvent::Fork::Remote, you need to write your code so that it
		583	assumes there are two file handles for communications, which
		584	might not be unix domain sockets. The "run" function should
		585	start like this:
		586
		587	sub run {
		588	my ($rfh, @args) = @_; # @args is your normal arguments
		589	my $wfh = fileno $rfh ? $rfh : *STDOUT;
		590
		591	# now use $rfh for reading and $wfh for writing
		592	}
		593
		594	This checks whether the passed file handle is, in fact, the
		595	process "STDIN" handle. If it is, then the function was invoked
		596	via AnyEvent::Fork::Remote, so STDIN should be used for reading
		597	and "STDOUT" should be used for writing.
		598
		599	In all other cases, the function was called via this module, and
		600	there is only one file handle that should be used for reading
		601	and writing.
		602
		603	Example: create a template for a process pool, pass a few strings,
		604	some file handles, then fork, pass one more string, and run some
		605	code.
		606
		607	my $pool = AnyEvent::Fork
		608	->new
		609	->send_arg ("str1", "str2")
		610	->send_fh ($fh1, $fh2);
		611
		612	for my $i (1..2) {
		613	$pool
		614	->fork
		615	->send_arg ("str3")
		616	->run ("Some::function", sub {
		617	my ($fh) = @_;
		618
		619	# fh is nonblocking, but we trust that the OS can accept these
		620	# few octets anyway.
		621	syswrite $fh, "hi #$i\n";
		622
		623	# $fh is being closed here, as we don't store it anywhere
		624	});
		625	}
		626
		627	# Some::function might look like this - all parameters passed before fork
		628	# and after will be passed, in order, after the communications socket.
		629	sub Some::function {
		630	my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
		631
		632	print scalar <$fh>; # prints "hi #1\n" and "hi #2\n" in any order
		633	}
		634
		635	CHILD PROCESS INTERFACE
		636	This module has a limited API for use in child processes.
		637
		638	@args = AnyEvent::Fork::Serve::run_args
		639	This function, which only exists before the "run" method is called,
		640	returns the arguments that would be passed to the run function, and
		641	clears them.
		642
		643	This is mainly useful to get any file handles passed via "send_fh",
		644	but works for any arguments passed via "send_xxx" methods.
		645
		646	EXPERIMENTAL METHODS
		647	These methods might go away completely or change behaviour, at any time.
		648
		649	$proc->to_fh ($cb->($fh)) # EXPERIMENTAL, MIGHT BE REMOVED
		650	Flushes all commands out to the process and then calls the callback
		651	with the communications socket.
		652
		653	The process object becomes unusable on return from this function -
		654	any further method calls result in undefined behaviour.
		655
		656	The point of this method is to give you a file handle that you can
		657	pass to another process. In that other process, you can call
		658	"new_from_fh AnyEvent::Fork $fh" to create a new "AnyEvent::Fork"
		659	object from it, thereby effectively passing a fork object to another
		660	process.
		661
		662	new_from_fh AnyEvent::Fork $fh # EXPERIMENTAL, MIGHT BE REMOVED
		663	Takes a file handle originally received by the "to_fh" method and
		664	creates a new "AnyEvent::Fork" object. The child process itself will
		665	not change in any way, i.e. it will keep all the modifications done
		666	to it before calling "to_fh".
		667
		668	The new object is very much like the original object, except that
		669	the "pid" method will return "undef" even if the process is a direct
		670	child.
		671
		672	PERFORMANCE
		673	Now for some unscientific benchmark numbers (all done on an amd64
		674	GNU/Linux box). These are intended to give you an idea of the relative
		675	performance you can expect, they are not meant to be absolute
		676	performance numbers.
		677
		678	OK, so, I ran a simple benchmark that creates a socket pair, forks,
		679	calls exit in the child and waits for the socket to close in the parent.
		680	I did load AnyEvent, EV and AnyEvent::Fork, for a total process size of
		681	5100kB.
		682
		683	2079 new processes per second, using manual socketpair + fork
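
For readers who want to repeat the baseline measurement, here is a rough sketch of such a benchmark in core Perl. It is a reconstruction from the description above, not the original benchmark script; the iteration count and the helper name spawn_and_wait are arbitrary choices, and the resulting number depends heavily on the machine and on the size of the process.

```perl
use strict;
use warnings;
use Socket;
use Time::HiRes qw(time);

# create a socket pair, fork, exit in the child,
# and wait in the parent until the socket is closed
sub spawn_and_wait {
    socketpair my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC
        or die "socketpair: $!";

    my $pid = fork;
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $parent_fh;
        exit 0;
    }

    close $child_fh;
    sysread $parent_fh, my $buf, 1; # returns 0 (EOF) once the child is gone
    waitpid $pid, 0;                # reap the child, so no zombies pile up
}

my $count = 200; # arbitrary, raise it for more stable numbers
my $start = time;

spawn_and_wait () for 1 .. $count;

my $elapsed = time - $start;

printf "%.0f new processes per second, using manual socketpair + fork\n",
    $count / $elapsed;
```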
		684
		685	Then I did the same thing, but instead of calling fork, I called
		686	AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
		687	socket from the child to close on exit. This does the same thing as
		688	manual socket pair + fork, except that what is forked is the template
		689	process (2440kB), and the socket needs to be passed to the server at the
		690	other end of the socket first.
		691
		692	2307 new processes per second, using AnyEvent::Fork->new
		693
		694	And finally, using "new_exec" instead of "new", using vforks+execs to exec
		695	a new perl interpreter and compile the small server each time, I get:
		696
		697	479 vfork+execs per second, using AnyEvent::Fork->new_exec
		698
		699	So how can "AnyEvent::Fork->new" be faster than a standard fork, even though
		700	it uses the same operations, but adds a lot of overhead?
		701
		702	The difference is simply the process size: forking the 5MB process takes
		703	so much longer than forking the 2.5MB template process that the extra
		704	overhead is canceled out.
		705
		706	If the benchmark process grows, the normal fork becomes even slower:
		707
		708	1340 new processes per second, manual fork of a 20MB process
		709	731 new processes per second, manual fork of a 200MB process
		710	235 new processes per second, manual fork of a 2000MB process
		711
		712	What that means (to me) is that I can use this module without having a
		713	bad conscience because of the extra overhead required to start new
		714	processes.
		715
		716	TYPICAL PROBLEMS
		717	This section lists typical problems that remain. I hope by recognising
		718	them, most can be avoided.
		719
		720	leaked file descriptors for exec'ed processes
		721	POSIX systems inherit file descriptors by default when exec'ing a
		722	new process. While perl itself laudably sets the close-on-exec flags
		723	on new file handles, most C libraries don't care, and even if all
		724	cared, it's often not possible to set the flag in a race-free
		725	manner.
		726
		727	That means some file descriptors can leak through. And since it
		728	isn't possible to know which file descriptors are "good" and
		729	"necessary" (or even to know which file descriptors are open), there
		730	is no good way to close the ones that might harm.
		731
		732	As an example of what "harm" can be done consider a web server that
		733	accepts connections and afterwards some module uses AnyEvent::Fork
		734	for the first time, causing it to fork and exec a new process, which
		735	might inherit the network socket. When the server closes the socket,
		736	it is still open in the child (which doesn't even know that) and the
		737	client might conclude that the connection is still fine.
		738
		739	For the main program, there are multiple remedies available -
		740	AnyEvent::Fork::Early is one, creating a process early and not using
		741	"new_exec" is another, as in both cases, the first process can be
		742	exec'ed well before many random file descriptors are open.
		743
		744	In general, the solution for this kind of problem is to fix the
		745	libraries or the code that leaks those file descriptors.
		746
		747	Fortunately, most of these leaked descriptors do no harm, other than
		748	sitting on some resources.
		749
		750	leaked file descriptors for fork'ed processes
		751	Normally, AnyEvent::Fork starts new processes by exec'ing them,
		752	which closes file descriptors not marked for being inherited.
		753
		754	However, AnyEvent::Fork::Early and AnyEvent::Fork::Template offer a
		755	way to create these processes by forking, and this leaks more file
		756	descriptors than exec'ing them, as there is no way to mark
		757	descriptors as "close on fork".
		758
		759	An example would be modules like EV, IO::AIO or Gtk2. These modules
		760	create pipes for internal use, and Gtk2 might open a connection to the
		761	X server. EV and IO::AIO can deal with fork, but Gtk2 might have
		762	trouble with a fork.
		763
		764	The solution is to either not load these modules before use'ing
		765	AnyEvent::Fork::Early or AnyEvent::Fork::Template, or to delay
		766	initialising them, for example, by calling "init Gtk2" manually.
		767
		768	exiting calls object destructors
		769	This only applies to users of AnyEvent::Fork::Early and
		770	AnyEvent::Fork::Template, or when initialising code creates objects
		771	that reference external resources.
		772
		773	When a process created by AnyEvent::Fork exits, it might do so by
		774	calling exit, or simply by letting perl reach the end of the program,
		775	at which point perl runs all destructors.
		776
		777	Not all destructors are fork-safe - for example, an object that
		778	represents the connection to an X display might tell the X server to
		779	free resources, which is inconvenient when the "real" object in the
		780	parent still needs to use them.
		781
		782	This is obviously not a problem for AnyEvent::Fork::Early, as you
		783	used it as the very first thing, right?
		784
		785	It is a problem for AnyEvent::Fork::Template though - and the
		786	solution is to not create objects with nontrivial destructors that
		787	might have an effect outside of Perl.
		788
		789	PORTABILITY NOTES
		790	Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a
		791	nop, and ::Template is not going to work), and it cost a lot of blood
		792	and sweat to make it so, mostly due to the bloody broken perl that
		793	nobody seems to care about. The fork emulation is a bad joke - I have
		794	yet to see something useful that you can do with it without running into
		795	memory corruption issues or other braindamage. Hrrrr.
		796
		797	Since fork is endlessly broken on win32 perls (it doesn't even remotely
		798	work within its documented limits) and quite obviously it's not getting
		799	improved any time soon, the best way to proceed on windows would be to
		800	always use "new_exec" and thus never rely on perl's fork "emulation".
		801
		802	Cygwin perl is not supported at the moment due to some hilarious
		803	shortcomings of its API - see IO::FDPass for more details. If you never
		804	use "send_fh" and always use "new_exec" to create processes, it should
		805	work though.
		806
		807	USING AnyEvent::Fork IN SUBPROCESSES
		808	AnyEvent::Fork itself cannot generally be used in subprocesses. As long
		809	as only one process ever forks new processes, sharing the template
		810	processes is possible (you could use a pipe as a lock by writing a byte
		811	into it to unlock, and reading the byte to lock, for example).
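
The pipe-as-lock idea mentioned above could be sketched like this, using only perl builtins. The names lock_template and unlock_template are hypothetical and not part of AnyEvent::Fork, and the comments marking the critical section stand in for whatever code talks to the shared template process. The pipe has to be created before forking, so that all processes share it.

```perl
use strict;
use warnings;

# Create the pipe before any fork, so parent and children share it.
pipe my ($lock_r, $lock_w) or die "pipe: $!";

# Hypothetical helper: block until the byte is available, then take it.
# An empty pipe means "locked".
sub lock_template {
   my $n = sysread $lock_r, my $token, 1;
   die "lock: $!" unless $n;
}

# Hypothetical helper: put the byte back so another process can lock.
# One byte in the pipe means "unlocked".
sub unlock_template {
   my $n = syswrite $lock_w, "x";
   die "unlock: $!" unless $n;
}

# Start out unlocked.
unlock_template ();

my $pid = fork;
die "fork: $!" unless defined $pid;

unless ($pid) {
   # child: only touch the shared template process while holding the lock
   lock_template ();
   # ... use the shared template process here ...
   unlock_template ();
   exit 0;
}

# parent: same protocol
lock_template ();
# ... use the shared template process here ...
unlock_template ();

waitpid $pid, 0;
```

Because reading a single byte from a pipe is atomic, only one process at a time can hold the token; a process that dies while holding it leaves the lock taken, which this sketch does not handle.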
		812
		813	To make concurrent calls possible after fork, you should get rid of the
		814	template and early fork processes. AnyEvent::Fork will create a new
		815	template process as needed.
		816
		817	undef $AnyEvent::Fork::EARLY;
		818	undef $AnyEvent::Fork::TEMPLATE;
		819
		820	It doesn't matter whether you get rid of them in the parent or child
		821	after a fork.
		822
		823	SEE ALSO
		824	AnyEvent::Fork::Early, to avoid executing a perl interpreter at all
		825	(part of this distribution).
		826
		827	AnyEvent::Fork::Template, to create a process by forking the main
		828	program at a convenient time (part of this distribution).
		829
		830	AnyEvent::Fork::Remote, for another way to create processes that is
		831	mostly compatible with this module and modules building on top of it, but
		832	works better with remote processes.
		833
		834	AnyEvent::Fork::RPC, for simple RPC to child processes (on CPAN).
		835
		836	AnyEvent::Fork::Pool, for a simple worker process pool (on CPAN).
		837
		838	AUTHOR AND CONTACT INFORMATION
		839	Marc Lehmann <schmorp@schmorp.de>
		840	http://software.schmorp.de/pkg/AnyEvent-Fork
		841
