/cvs/AnyEvent-Fork/README
Revision: 1.7
Committed: Sun Apr 21 12:26:00 2013 UTC (11 years ago) by root
Branch: MAIN
CVS Tags: rel-1_0
Changes since 1.6: +108 -4 lines
Log Message:
1.0

File Contents

# User Rev Content
1 root 1.2 NAME
2     AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
3    
4     SYNOPSIS
5     use AnyEvent::Fork;
6    
7 root 1.5 AnyEvent::Fork
8     ->new
9     ->require ("MyModule")
10     ->run ("MyModule::server", my $cv = AE::cv);
11    
12     my $fh = $cv->recv;
13    
14     DESCRIPTION
15     This module allows you to create new processes, without actually forking
16     them from your current process (avoiding the problems of forking), but
17     preserving most of the advantages of fork.
18    
19     It can be used to create new worker processes or new independent
20     subprocesses for short- and long-running jobs, process pools (e.g. for
21     use in pre-forked servers) but also to spawn new external processes
22     (such as CGI scripts from a web server), which can be faster (and more
23     well behaved) than using fork+exec in big processes.
24    
25     Special care has been taken to make this module useful from other
26     modules, while still supporting specialised environments such as
27     App::Staticperl or PAR::Packer.
28    
29     WHAT THIS MODULE IS NOT
30     This module only creates processes and lets you pass file handles and
31     strings to it, and run perl code. It does not implement any kind of RPC
32     - there is no back channel from the process back to you, and there is no
33     RPC or message passing going on.
34    
35 root 1.6 If you need some form of RPC, you could use the AnyEvent::Fork::RPC
36     companion module, which adds simple RPC/job queueing to a process
37     created by this module.
38    
39 root 1.7 And if you need some automatic process pool management on top of
40     AnyEvent::Fork::RPC, you can look at the AnyEvent::Fork::Pool companion
41     module.
42    
43     Or you can implement it yourself in whatever way you like: use some
44 root 1.6 message-passing module such as AnyEvent::MP, some pipe such as
45     AnyEvent::ZeroMQ, use AnyEvent::Handle on both sides to send e.g. JSON
46     or Storable messages, and so on.
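
A minimal sketch of the "roll your own" option, using only core modules: one JSON document per line over a socket pair. The framing, the field names and the use of a plain fork as a stand-in for a process created by this module are all assumptions made for illustration - in real use the child end would be the socket passed to your "run" function.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;
use JSON::PP;
use POSIX ();

# one JSON document per line - an assumed framing, not something this module defines
socketpair (my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
   or die "socketpair: $!";
$_->autoflush (1) for $parent_fh, $child_fh;

my $json = JSON::PP->new->utf8->canonical;

my $pid = fork;
defined $pid or die "fork: $!";

if ($pid == 0) {
   # child: answer every request line with one reply line
   close $parent_fh;
   while (my $line = <$child_fh>) {
      my $req = $json->decode ($line);
      print $child_fh $json->encode ({ id => $req->{id}, sum => $req->{a} + $req->{b} }), "\n";
   }
   POSIX::_exit (0);
}

# parent: send one request, read one reply
close $child_fh;
print $parent_fh $json->encode ({ id => 1, a => 2, b => 3 }), "\n";
my $reply = $json->decode (scalar <$parent_fh>);

close $parent_fh;   # end-of-file lets the child leave its loop
waitpid $pid, 0;

print "reply $reply->{id}: $reply->{sum}\n";   # prints "reply 1: 5"
```

With an event loop you would do the same reads and writes through AnyEvent::Handle instead of blocking I/O.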
47 root 1.5
48     COMPARISON TO OTHER MODULES
49     There is an abundance of modules on CPAN that do "something fork", such
50     as Parallel::ForkManager, AnyEvent::ForkManager, AnyEvent::Worker or
51     AnyEvent::Subprocess. There are modules that implement their own process
52     management, such as AnyEvent::DBI.
53    
54     The problems that all these modules try to solve are real. However, none
55     of them (from what I have seen) tackles the very real problems of
56     unwanted memory sharing, efficiency, and not being able to use event
57     processing or similar modules in the processes they create.
58    
59     This module doesn't try to replace any of them - instead it tries to
60     solve the problem of creating processes with a minimum of fuss and
61     overhead (and also luxury). Ideally, most of these would use
62     AnyEvent::Fork internally, except they were written before AnyEvent::Fork
63     was available, so obviously had to roll their own.
64    
65     PROBLEM STATEMENT
66     There are two traditional ways to implement parallel processing on UNIX
67     like operating systems - fork and process, and fork+exec and process.
68     They have different advantages and disadvantages that I describe below,
69     together with how this module tries to mitigate the disadvantages.
70    
71     Forking from a big process can be very slow.
72     A 5GB process needs 0.05s to fork on my 3.6GHz amd64 GNU/Linux box.
73     This overhead is often shared with exec (because you have to fork
74     first), but in some circumstances (e.g. when vfork is used),
75     fork+exec can be much faster.
76    
77     This module can help here by telling a small(er) helper process to
78     fork, which is faster than forking the main process, and also uses
79     vfork where possible. This gives the speed of vfork, with the
80     flexibility of fork.
81 root 1.2
82 root 1.5 Forking usually creates a copy-on-write copy of the parent process.
83     For example, modules or data files that are loaded will not use
84     additional memory after a fork. When exec'ing a new process, modules
85     and data files might need to be loaded again, at extra CPU and
86     memory cost. But when forking, literally all data structures are
87     copied - if the program frees them and replaces them by new data,
88     the child processes will retain the old version even if it isn't
89     used, which can suddenly and unexpectedly increase memory usage when
90     freeing memory.
91    
92     The trade-off is between more sharing with fork (which can be good
93     or bad), and no sharing with exec.
94    
95     This module allows the main program to do a controlled fork, and
96     allows modules to exec processes safely at any time. When creating a
97     custom process pool you can take advantage of data sharing via fork
98     without the risk of sharing large dynamic data structures that will
99     blow up child memory usage.
100    
101     In other words, this module puts you in control over what is being
102     shared and what isn't, at all times.
103    
104     Exec'ing a new perl process might be difficult.
105     For example, it is not easy to find the correct path to the perl
106     interpreter - $^X might not be a perl interpreter at all.
107    
108     This module tries hard to identify the correct path to the perl
109     interpreter. With a cooperative main program, exec'ing the
110     interpreter might not even be necessary, but even without help from
111     the main program, it will still work when used from a module.
112    
113     Exec'ing a new perl process might be slow, as all necessary modules have
114     to be loaded from disk again, with no guarantees of success.
115     Long running processes might run into problems when perl is upgraded
116     and modules are no longer loadable because they refer to a different
117     perl version, or parts of a distribution are newer than the ones
118     already loaded.
119    
120     This module supports creating pre-initialised perl processes to be
121     used as a template for new processes.
122    
123     Forking might be impossible when a program is running.
124     For example, POSIX makes it almost impossible to fork from a
125     multi-threaded program while doing anything useful in the child - in
126     fact, if your perl program uses POSIX threads (even indirectly via
127     e.g. IO::AIO or threads), you cannot call fork on the perl level
128     anymore without risking corruption issues on a number of operating
129     systems.
130    
131     This module can safely fork helper processes at any time, by calling
132     fork+exec in C, in a POSIX-compatible way (via Proc::FastSpawn).
133    
134     Parallel processing with fork might be inconvenient or difficult to
135     implement. Modules might not work in both parent and child.
136     For example, when a program uses an event loop and creates watchers
137     it becomes very hard to use the event loop from a child program, as
138     the watchers already exist but are only meaningful in the parent.
139     Worse, a module might want to use such a module, not knowing whether
140     another module or the main program also does, leading to problems.
141    
142     Apart from event loops, graphical toolkits also commonly fall into
143     the "unsafe module" category, or just about anything that
144     communicates with the external world, such as network libraries and
145     file I/O modules, which usually don't like being copied and then
146     allowed to continue in two processes.
147    
148     With this module only the main program is allowed to create new
149     processes by forking (because only the main program can know when it
150     is still safe to do so) - all other processes are created via
151     fork+exec, which makes it possible to use modules such as event
152     loops or window interfaces safely.
153    
154     EXAMPLES
155     Create a single new process, tell it to run your worker function.
156 root 1.2 AnyEvent::Fork
157     ->new
158     ->require ("MyModule")
159     ->run ("MyModule::worker", sub {
160     my ($master_filehandle) = @_;
161    
162     # now $master_filehandle is connected to the
163     # $slave_filehandle in the new process.
164     });
165    
166 root 1.5 "MyModule" might look like this:
167    
168     package MyModule;
169    
170     sub worker {
171 root 1.2 my ($slave_filehandle) = @_;
172    
173     # now $slave_filehandle is connected to the $master_filehandle
174     # in the original process. have fun!
175     }
176    
177 root 1.5 Create a pool of server processes all accepting on the same socket.
178 root 1.2 # create listener socket
179     my $listener = ...;
180    
181     # create a pool template, initialise it and give it the socket
182     my $pool = AnyEvent::Fork
183     ->new
184     ->require ("Some::Stuff", "My::Server")
185     ->send_fh ($listener);
186    
187     # now create 10 identical workers
188     for my $id (1..10) {
189     $pool
190     ->fork
191     ->send_arg ($id)
192     ->run ("My::Server::run");
193     }
194    
195     # now do other things - maybe use the filehandle provided by run
196     # to wait for the processes to die. or whatever.
197    
198 root 1.5 "My::Server" might look like this:
199    
200     package My::Server;
201    
202     sub run {
203 root 1.2 my ($slave, $listener, $id) = @_;
204    
205     close $slave; # we do not use the socket, so close it to save resources
206    
207     # we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
208     # or anything we usually couldn't do in a process forked normally.
209     while (my $socket = $listener->accept) {
210     # do sth. with new socket
211     }
212     }
213    
214 root 1.5 use AnyEvent::Fork as a faster fork+exec
215 root 1.6 This runs "/bin/echo hi", with standard output redirected to /tmp/log
216 root 1.5 and standard error redirected to the communications socket. It is
217     usually faster than fork+exec, but still lets you prepare the
218     environment.
219 root 1.2
220 root 1.5 open my $output, ">/tmp/log" or die "$!";
221 root 1.2
222 root 1.5 AnyEvent::Fork
223     ->new
224     ->eval ('
225     # compile a helper function for later use
226     sub run {
227     my ($fh, $output, @cmd) = @_;
228    
229     # perl will clear close-on-exec on STDOUT/STDERR
230     open STDOUT, ">&", $output or die;
231     open STDERR, ">&", $fh or die;
232    
233     exec @cmd;
234     }
235     ')
236     ->send_fh ($output)
237     ->send_arg ("/bin/echo", "hi")
238     ->run ("run", my $cv = AE::cv);
239 root 1.4
240 root 1.5 my $stderr = $cv->recv;
241 root 1.2
242 root 1.7 For stingy users: put the worker code into a "DATA" section.
243     When you want to be stingy with files, you can put your code into the
244     "DATA" section of your module (or program):
245    
246     use AnyEvent::Fork;
247    
248     AnyEvent::Fork
249     ->new
250     ->eval (do { local $/; <DATA> })
251     ->run ("doit", sub { ... });
252    
253     __DATA__
254    
255     sub doit {
256     ... do something!
257     }
258    
259     For stingy standalone programs: do not rely on external files at
260     all.
261     For single-file scripts it can be inconvenient to rely on external files
262     - even when using a "DATA" section, you still need to "exec" an external
263     perl interpreter, which might not be available when using
264     App::Staticperl, Urlader or PAR::Packer for example.
265    
266     Two modules help here - AnyEvent::Fork::Early forks a template process
267     for all further calls to "new_exec", and AnyEvent::Fork::Template forks
268     the main program as a template process.
269    
270     Here is what your main program should look like:
271    
272     #! perl
273    
274     # optional, as the very first thing.
275     # in case modules want to create their own processes.
276     use AnyEvent::Fork::Early;
277    
278     # next, load all modules you need in your template process
279     use Example::My::Module;
280     use Example::Whatever;
281    
282     # next, put your run function definition and anything else you
283     # need, but do not use code outside of BEGIN blocks.
284     sub worker_run {
285     my ($fh, @args) = @_;
286     ...
287     }
288    
289     # now preserve everything so far as AnyEvent::Fork object
290     # in $TEMPLATE.
291     use AnyEvent::Fork::Template;
292    
293     # do not put code outside of BEGIN blocks until here
294    
295     # now use the $TEMPLATE process in any way you like
296    
297     # for example: create 10 worker processes
298     my @worker;
299     my $cv = AE::cv;
300     for (1..10) {
301     $cv->begin;
302     $TEMPLATE->fork->send_arg ($_)->run ("worker_run", sub {
303     push @worker, shift;
304     $cv->end;
305     });
306     }
307     $cv->recv;
308    
309     CONCEPTS
310    
311 root 1.2 This module can create new processes either by executing a new perl
312     process, or by forking from an existing "template" process.
313    
314 root 1.6 All these processes are called "child processes" (whether they are
315     direct children or not), while the process that manages them is called
316     the "parent process".
317    
318 root 1.2 Each such process comes with its own file handle that can be used to
319     communicate with it (it's actually a socket - one end in the new
320     process, one end in the main process), and among the things you can do
321     in it are load modules, fork new processes, send file handles to it, and
322     execute functions.
323    
324     There are multiple ways to create additional processes to execute some
325     jobs:
326    
327     fork a new process from the "default" template process, load code, run
328     it
329     This module has a "default" template process which it executes when
330     it is needed the first time. Forking from this process shares the
331     memory used for the perl interpreter with the new process, but
332     loading modules takes time, and the memory is not shared with
333     anything else.
334    
335     This is ideal for when you only need one extra process of a kind,
336 root 1.4 with the option of starting and stopping it on demand.
337 root 1.2
338     Example:
339    
340     AnyEvent::Fork
341     ->new
342     ->require ("Some::Module")
343     ->run ("Some::Module::run", sub {
344     my ($fork_fh) = @_;
345     });
346    
347     fork a new template process, load code, then fork processes off of it
348     and run the code
349     When you need to have a bunch of processes that all execute the same
350     (or very similar) tasks, then a good way is to create a new template
351     process for them, loading all the modules you need, and then create
352     your worker processes from this new template process.
353    
354     This way, all code (and data structures) that can be shared (e.g.
355     the modules you loaded) is shared between the processes, and each
356     new process consumes relatively little memory of its own.
357    
358     The disadvantage of this approach is that you need to create a
359     template process for the sole purpose of forking new processes from
360 root 1.4 it, but if you only need a fixed number of processes you can create
361 root 1.2 them, and then destroy the template process.
362    
363     Example:
364    
365     my $template = AnyEvent::Fork->new->require ("Some::Module");
366    
367     for (1..10) {
368     $template->fork->run ("Some::Module::run", sub {
369     my ($fork_fh) = @_;
370     });
371     }
372    
373     # at this point, you can keep $template around to fork new processes
374     # later, or you can destroy it, which causes it to vanish.
375    
376     execute a new perl interpreter, load some code, run it
377     This is relatively slow, and doesn't allow you to share memory
378     between multiple processes.
379    
380     The only advantage is that you don't have to have a template process
381     hanging around all the time to fork off some new processes, which
382     might be an advantage when there are long time spans where no extra
383     processes are needed.
384    
385     Example:
386    
387     AnyEvent::Fork
388     ->new_exec
389     ->require ("Some::Module")
390     ->run ("Some::Module::run", sub {
391     my ($fork_fh) = @_;
392     });
393    
394 root 1.5 THE "AnyEvent::Fork" CLASS
395     This module exports nothing, and only implements a single class -
396     "AnyEvent::Fork".
397    
398     There are two class constructors that both create new processes - "new"
399     and "new_exec". The "fork" method creates a new process by forking an
400     existing one and could be considered a third constructor.
401    
402     Most of the remaining methods deal with preparing the new process, by
403     loading code, evaluating code and sending data to the new process. They
404     usually return the process object, so you can chain method calls.
405    
406     If a process object is destroyed before calling its "run" method, then
407     the process simply exits. After "run" is called, all responsibility is
408     passed to the specified function.
409    
410     As long as there is any outstanding work to be done, process objects
411     resist being destroyed, so there is no reason to store them unless you
412     need them later - configure and forget works just fine.
413    
414 root 1.6 my $proc = new AnyEvent::Fork
415 root 1.2 Creates a new "empty" perl interpreter process and returns its
416     process object for further manipulation.
417    
418     The new process is forked from a template process that is kept
419     around for this purpose. When it doesn't exist yet, it is created by
420 root 1.5 a call to "new_exec" first and then stays around for future calls.
421 root 1.2
422 root 1.6 $new_proc = $proc->fork
423 root 1.2 Forks $proc, creating a new process, and returns the process object
424     of the new process.
425    
426     If any of the "send_" functions have been called before fork, then
427     they will be cloned in the child. For example, in a pre-forked
428     server, you might "send_fh" the listening socket into the template
429     process, and then keep calling "fork" and "run".
430    
431 root 1.6 my $proc = new_exec AnyEvent::Fork
432 root 1.2 Creates a new "empty" perl interpreter process and returns its
433     process object for further manipulation.
434    
435     Unlike the "new" method, this method *always* spawns a new perl
436     process (except in some cases, see AnyEvent::Fork::Early for
437     details). This reduces the amount of memory sharing that is
438     possible, and is also slower.
439    
440     You should use "new" whenever possible, except when having a
441     template process around is unacceptable.
442    
443 root 1.4 The path to the perl interpreter is divined using various methods -
444 root 1.2 first $^X is investigated to see if the path ends with something
445     that sounds as if it were the perl interpreter. Failing this, the
446     module falls back to using $Config::Config{perlpath}.
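
The lookup order described above can be sketched roughly like this. The helper name and the regex are assumptions for illustration only - the module's real check may well differ in detail.

```perl
use strict;
use warnings;
use Config;

# hypothetical helper - the pattern below is an assumption, not the module's real check
sub guess_perl_path {
   my ($candidate) = @_;

   # accept names that "sound like" perl: perl, perl5.36.0, perl.exe, ...
   return $candidate
      if defined $candidate
         && $candidate =~ m{(?:^|[/\\])perl(?:[\d.]+)?(?:\.exe)?$}i;

   # otherwise fall back to the path recorded when perl was built
   return $Config{perlpath};
}

print guess_perl_path ($^X), "\n";
print guess_perl_path ("/opt/app/bin/myapp"), "\n";   # falls back to $Config{perlpath}
```

The second call models a packed or embedded program, where $^X names the application binary rather than a perl interpreter.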
447    
448 root 1.6 $pid = $proc->pid
449 root 1.4 Returns the process id of the process *iff it is a direct child of
450 root 1.5 the process running AnyEvent::Fork*, and "undef" otherwise.
451 root 1.4
452     Normally, only processes created via "AnyEvent::Fork->new_exec" and
453     AnyEvent::Fork::Template are direct children, and you are
454     responsible for cleaning up their zombies when they die.
455    
456     All other processes are not direct children, and will be cleaned up
457 root 1.5 by AnyEvent::Fork itself.
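
Cleaning up after a direct child means calling waitpid on it. A minimal core-perl sketch follows, with a plain fork standing in for a process whose id you would normally get from "$proc->pid" (the exit status 3 is arbitrary):

```perl
use strict;
use warnings;
use POSIX ();

# stand-in for a direct child - with this module you would use $proc->pid
my $pid = fork;
defined $pid or die "fork: $!";
POSIX::_exit (3) if $pid == 0;

# blocking reap; in an event-driven program a child watcher
# (for example AE::child from AnyEvent) avoids blocking here
my $reaped = waitpid $pid, 0;
my $status = $? >> 8;

print "reaped $reaped, exit status $status\n";
```

Remember that "pid" returns "undef" for processes that are not direct children, so check for that before waiting.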
458    
459 root 1.6 $proc = $proc->eval ($perlcode, @args)
460     Evaluates the given $perlcode as ... Perl code, while setting @_ to
461 root 1.5 the strings specified by @args, in the "main" package.
462 root 1.2
463     This call is meant to do any custom initialisation that might be
464     required (for example, the "require" method uses it). It's not
465     supposed to be used to completely take over the process, use "run"
466     for that.
467    
468     The code will usually be executed after this call returns, and there
469     is no way to pass anything back to the calling process. Any
470     evaluation errors will be reported to stderr and cause the process
471     to exit.
472    
473 root 1.5 If you want to execute some code (that isn't in a module) to take
474     over the process, you should compile a function via "eval" first,
475     and then call it via "run". This also gives you access to any
476     arguments passed via the "send_xxx" methods, such as file handles.
477     See the "use AnyEvent::Fork as a faster fork+exec" example to see it
478     in action.
479    
480 root 1.2 Returns the process object for easy chaining of method calls.
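
To illustrate the calling convention only (code compiled in package "main", with @_ set to the argument strings), here is a rough stand-in for what happens on the child side. The helper name is hypothetical, and unlike the real thing it dies on errors instead of reporting to stderr and exiting.

```perl
use strict;
use warnings;

# hypothetical stand-in for the child side of ->eval ($perlcode, @args)
sub eval_in_main {
   my $perlcode = shift;   # @_ now holds only the argument strings

   # a string eval sees the @_ of the enclosing sub
   my $result = eval "package main; $perlcode";
   die "eval failed: $@" if $@;

   $result
}

eval_in_main (q{
   our $WORKER_ID = $_[0];
   sub worker_id { $WORKER_ID }
   1
}, "worker-7");

# the function compiled above is now available, e.g. as a target for "run"
print "compiled in main: ", main::worker_id (), "\n";   # prints "compiled in main: worker-7"
```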
481    
482 root 1.6 $proc = $proc->require ($module, ...)
483 root 1.2 Tries to load the given module(s) into the process.
484    
485     Returns the process object for easy chaining of method calls.
486    
487 root 1.6 $proc = $proc->send_fh ($handle, ...)
488 root 1.2 Send one or more file handles (*not* file descriptors) to the
489     process, to prepare a call to "run".
490    
491 root 1.5 The process object keeps a reference to the handles until they have
492     been passed over to the process, so you must not explicitly close
493     the handles. This is most easily accomplished by simply not storing
494     the file handles anywhere after passing them to this method - when
495     AnyEvent::Fork is finished using them, perl will automatically close
496     them.
497 root 1.2
498     Returns the process object for easy chaining of method calls.
499    
500 root 1.4 Example: pass a file handle to a process, and release it without
501     closing. It will be closed automatically when it is no longer used.
502 root 1.2
503     $proc->send_fh ($my_fh);
504     undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
505    
506 root 1.6 $proc = $proc->send_arg ($string, ...)
507 root 1.2 Send one or more argument strings to the process, to prepare a call
508 root 1.5 to "run". The strings can be any octet strings.
509 root 1.2
510 root 1.4 The protocol is optimised to pass a moderate number of relatively
511     short strings - while you can pass up to 4GB of data in one go, this
512     is more meant to pass some ID information or other startup info, not
513     big chunks of data.
514    
515     Returns the process object for easy chaining of method calls.
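
"Octet strings" means every character must fit into a byte. My reading of this (an assumption, not something stated above) is that text containing wide characters should be encoded before it is passed, and decoded again in the worker. A sketch with the core Encode module:

```perl
use strict;
use warnings;
use Encode ();

# a character string containing code points above 255
my $name = "J\x{fc}rgen \x{263a}";

# encode to octets - this is what you would hand to send_arg
my $octets = Encode::encode_utf8 ($name);

# in the worker, reverse the step
my $decoded = Encode::decode_utf8 ($octets);

printf "%d characters, %d octets, round trip %s\n",
   length $name, length $octets, $decoded eq $name ? "ok" : "FAILED";
```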
516 root 1.2
517 root 1.6 $proc->run ($func, $cb->($fh))
518 root 1.5 Enter the function specified by the function name in $func in the
519     process. The function is called with the communication socket as
520 root 1.2 first argument, followed by all file handles and string arguments
521     sent earlier via "send_fh" and "send_arg" methods, in the order they
522     were called.
523    
524 root 1.5 The process object becomes unusable on return from this function -
525     any further method calls result in undefined behaviour.
526 root 1.2
527 root 1.5 The function name should be fully qualified, but if it isn't, it
528     will be looked up in the "main" package.
529 root 1.2
530 root 1.5 If the called function returns, doesn't exist, or any error occurs,
531     the process exits.
532    
533     Preparing the process is done in the background - when all commands
534     have been sent, the callback is invoked with the local
535     communications socket as argument. At this point you can start using
536     the socket in any way you like.
537 root 1.2
538     If the communication socket isn't used, it should be closed on both
539     sides, to save on kernel memory.
540    
541     The socket is non-blocking in the parent, and blocking in the newly
542 root 1.5 created process. The close-on-exec flag is set in both.
543    
544     Even if not used otherwise, the socket can be a good indicator for
545     the existence of the process - if the other process exits, you get a
546     readable event on it, because exiting the process closes the socket
547     (if it didn't create any children using fork).
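
The "readable event on exit" is simply end-of-file on the socket. A core-perl sketch of the idea, with a plain fork standing in for a process created by this module and a blocking select instead of an event loop:

```perl
use strict;
use warnings;
use Socket;
use IO::Select;
use POSIX ();

socketpair (my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
   or die "socketpair: $!";

my $pid = fork;
defined $pid or die "fork: $!";

if ($pid == 0) {
   close $parent_fh;
   POSIX::_exit (0);   # exiting closes the child's end of the socket
}

close $child_fh;       # the parent must not keep the child's end open

# wait (up to 10 seconds) for the socket to become readable
my @ready = IO::Select->new ($parent_fh)->can_read (10);

# reading zero octets means end-of-file: the other side is gone
my $n = sysread $parent_fh, my $buf, 1;
waitpid $pid, 0;

print "child is gone\n" if @ready && defined $n && $n == 0;
```

In an AnyEvent program the select call would be replaced by an I/O watcher on the handle passed to the "run" callback.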
548 root 1.2
549     Example: create a template for a process pool, pass a few strings,
550     some file handles, then fork, pass one more string, and run some
551     code.
552    
553     my $pool = AnyEvent::Fork
554     ->new
555     ->send_arg ("str1", "str2")
556     ->send_fh ($fh1, $fh2);
557    
558     for my $id (1..2) {
559     $pool
560     ->fork
561     ->send_arg ("str3")
562     ->run ("Some::function", sub {
563     my ($fh) = @_;
564    
565     # fh is nonblocking, but we trust that the OS can accept these
566 root 1.5 # few octets anyway.
567 root 1.2 syswrite $fh, "hi #$id\n";
568    
569     # $fh is being closed here, as we don't store it anywhere
570     });
571     }
572    
573     # Some::function might look like this - all parameters passed before fork
574     # and after will be passed, in order, after the communications socket.
575     sub Some::function {
576     my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
577    
578 root 1.5 print scalar <$fh>; # prints "hi #1\n" and "hi #2\n" in any order
579 root 1.2 }
580    
581 root 1.7 $proc->to_fh ($cb->($fh)) # EXPERIMENTAL, MIGHT BE REMOVED
582     Flushes all commands out to the process and then calls the callback
583     with the communications socket.
584    
585     The process object becomes unusable on return from this function -
586     any further method calls result in undefined behaviour.
587    
588     The point of this method is to give you a file handle that you can
589     pass to another process. In that other process, you can call
590     "new_from_fh AnyEvent::Fork" to create a new "AnyEvent::Fork" object
591     from it, thereby effectively passing a fork object to another
592     process.
593    
594     new_from_fh AnyEvent::Fork $fh # EXPERIMENTAL, MIGHT BE REMOVED
595     Takes a file handle originally received by the "to_fh" method and
596     creates a new "AnyEvent::Fork" object. The child process itself will
597     not change in any way, i.e. it will keep all the modifications done
598     to it before calling "to_fh".
599    
600     The new object is very much like the original object, except that
601     the "pid" method will return "undef" even if the process is a direct
602     child.
603    
604 root 1.4 PERFORMANCE
605     Now for some unscientific benchmark numbers (all done on an amd64
606     GNU/Linux box). These are intended to give you an idea of the relative
607     performance you can expect, they are not meant to be absolute
608     performance numbers.
609    
610     OK, so, I ran a simple benchmark that creates a socket pair, forks,
611     calls exit in the child and waits for the socket to close in the parent.
612     I did load AnyEvent, EV and AnyEvent::Fork, for a total process size of
613     5100kB.
614    
615     2079 new processes per second, using manual socketpair + fork
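
For reference, a benchmark of this kind might look roughly like the sketch below. It is not the script that produced the numbers here: the iteration count is arbitrary, and the child uses POSIX::_exit rather than exit so that nothing but the process creation is measured.

```perl
use strict;
use warnings;
use Socket;
use POSIX ();
use Time::HiRes qw(time);

my $count = 200;   # arbitrary - raise it for steadier numbers
my $start = time;

for (1 .. $count) {
   socketpair (my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
      or die "socketpair: $!";

   my $pid = fork;
   defined $pid or die "fork: $!";
   POSIX::_exit (0) if $pid == 0;   # the child exits immediately

   close $child_fh;
   sysread $parent_fh, my $buf, 1;  # returns 0 once the child has exited
   waitpid $pid, 0;
}

my $elapsed = time - $start;
printf "%.0f new processes per second\n", $count / $elapsed;
```

To see the effect of process size, allocate a large string or array before the loop and run it again.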
616    
617     Then I did the same thing, but instead of calling fork, I called
618     AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
619 root 1.7 socket from the child to close on exit. This does the same thing as
620 root 1.4 manual socket pair + fork, except that what is forked is the template
621     process (2440kB), and the socket needs to be passed to the server at the
622     other end of the socket first.
623    
624     2307 new processes per second, using AnyEvent::Fork->new
625    
626     And finally, using "new_exec" instead of "new", using vforks+execs to exec
627     a new perl interpreter and compile the small server each time, I get:
628    
629     479 vfork+execs per second, using AnyEvent::Fork->new_exec
630    
631     So how can "AnyEvent::Fork->new" be faster than a standard fork, even though
632     it uses the same operations, but adds a lot of overhead?
633    
634 root 1.5 The difference is simply the process size: forking the 5MB process takes
635     so much longer than forking the 2.5MB template process that the extra
636 root 1.6 overhead is canceled out.
637 root 1.4
638     If the benchmark process grows, the normal fork becomes even slower:
639    
640 root 1.5 1340 new processes, manual fork of a 20MB process
641     731 new processes, manual fork of a 200MB process
642     235 new processes, manual fork of a 2000MB process
643 root 1.4
644     What that means (to me) is that I can use this module without having a
645 root 1.5 bad conscience because of the extra overhead required to start new
646 root 1.4 processes.
647    
648 root 1.3 TYPICAL PROBLEMS
649     This section lists typical problems that remain. I hope by recognising
650     them, most can be avoided.
651    
652 root 1.5 leaked file descriptors for exec'ed processes
653 root 1.3 POSIX systems inherit file descriptors by default when exec'ing a
654     new process. While perl itself laudably sets the close-on-exec flags
655     on new file handles, most C libraries don't care, and even if all
656     cared, it's often not possible to set the flag in a race-free
657     manner.
658    
659     That means some file descriptors can leak through. And since it
660     isn't possible to know which file descriptors are "good" and
661 root 1.4 "necessary" (or even to know which file descriptors are open), there
662     is no good way to close the ones that might harm.
663 root 1.3
664     As an example of what "harm" can be done consider a web server that
665     accepts connections and afterwards some module uses AnyEvent::Fork
666     for the first time, causing it to fork and exec a new process, which
667     might inherit the network socket. When the server closes the socket,
668     it is still open in the child (which doesn't even know that) and the
669     client might conclude that the connection is still fine.
670    
671     For the main program, there are multiple remedies available -
672     AnyEvent::Fork::Early is one, creating a process early and not using
673     "new_exec" is another, as in both cases, the first process can be
674     exec'ed well before many random file descriptors are open.
675    
676     In general, the solution for these kinds of problems is to fix the
677     libraries or the code that leaks those file descriptors.
678    
679 root 1.4 Fortunately, most of these leaked descriptors do no harm, other than
680 root 1.3 sitting on some resources.
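
When the leaking handle is one you can get hold of from Perl, fixing it is a matter of setting the close-on-exec flag. A small sketch with the core Fcntl module follows - the helper name is made up, and the flag is cleared first only to simulate a descriptor that was opened without it:

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# hypothetical helper: mark a handle so that exec'ed programs do not inherit it
sub set_cloexec {
   my ($fh) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "F_GETFD: $!";

   fcntl $fh, F_SETFD, $flags | FD_CLOEXEC
      or die "F_SETFD: $!";
}

pipe (my $r, my $w) or die "pipe: $!";

# simulate a "leaky" descriptor by clearing the flag perl set for us
fcntl $r, F_SETFD, 0 or die "F_SETFD: $!";

set_cloexec ($r);

printf "close-on-exec is %s\n",
   (fcntl ($r, F_GETFD, 0) & FD_CLOEXEC) ? "set" : "clear";
```

This does nothing about the race mentioned above: another thread or a signal handler can still exec between creating the descriptor and setting the flag.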
681    
682 root 1.5 leaked file descriptors for fork'ed processes
683 root 1.3 Normally, AnyEvent::Fork does start new processes by exec'ing them,
684     which closes file descriptors not marked for being inherited.
685    
686     However, AnyEvent::Fork::Early and AnyEvent::Fork::Template offer a
687     way to create these processes by forking, and this leaks more file
688     descriptors than exec'ing them, as there is no way to mark
689     descriptors as "close on fork".
690    
691     An example would be modules like EV, IO::AIO or Gtk2. Both create
692     pipes for internal uses, and Gtk2 might open a connection to the X
693     server. EV and IO::AIO can deal with fork, but Gtk2 might have
694     trouble with a fork.
695    
696     The solution is to either not load these modules before use'ing
697     AnyEvent::Fork::Early or AnyEvent::Fork::Template, or to delay
698     initialising them, for example, by calling "init Gtk2" manually.
699    
700 root 1.5 exiting calls object destructors
701     This only applies to users of AnyEvent::Fork::Early and
702 root 1.6 AnyEvent::Fork::Template, or when initialising code creates objects
703 root 1.5 that reference external resources.
704 root 1.4
705     When a process created by AnyEvent::Fork exits, it might do so by
706     calling exit, or by simply letting perl reach the end of the program,
707     at which point Perl runs all destructors.
708    
709     Not all destructors are fork-safe - for example, an object that
710     represents the connection to an X display might tell the X server to
711     free resources, which is inconvenient when the "real" object in the
712     parent still needs to use them.
713    
714     This is obviously not a problem for AnyEvent::Fork::Early, as you
715     used it as the very first thing, right?
716    
717     It is a problem for AnyEvent::Fork::Template though - and the
718     solution is to not create objects with nontrivial destructors that
719     might have an effect outside of Perl.
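
If you cannot avoid such objects, one common workaround (a general Perl idiom, not something this module provides) is to make the destructor a no-op in any process other than the one that created the object. The class below is hypothetical, and a counter stands in for the external cleanup:

```perl
use strict;
use warnings;
use POSIX ();

package Guarded {
   our $released = 0;

   sub new { bless { owner => $$ }, shift }

   sub DESTROY {
      my ($self) = @_;
      return if $self->{owner} != $$;   # forked copy: leave the resource alone
      $released++;                      # stand-in for the real external cleanup
   }
}

my $obj = Guarded->new;

pipe (my $r, my $w) or die "pipe: $!";
my $pid = fork;
defined $pid or die "fork: $!";

if ($pid == 0) {
   undef $obj;                          # the destructor runs, but does nothing here
   syswrite $w, $Guarded::released;     # report the child's counter to the parent
   POSIX::_exit (0);
}

close $w;
sysread $r, my $child_count, 16;
waitpid $pid, 0;

undef $obj;                             # the owning process really releases it
print "child released $child_count, parent released $Guarded::released\n";
```

This only helps for classes you control - for third-party objects, not creating them before the template is forked remains the safer route.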
720    
721 root 1.2 PORTABILITY NOTES
722     Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a
723     nop, and ::Template is not going to work), and it cost a lot of blood
724     and sweat to make it so, mostly due to the bloody broken perl that
725     nobody seems to care about. The fork emulation is a bad joke - I have
726 root 1.4 yet to see something useful that you can do with it without running into
727 root 1.2 memory corruption issues or other braindamage. Hrrrr.
728    
729 root 1.7 Since fork is endlessly broken on win32 perls (it doesn't even remotely
730     work within its documented limits) and quite obviously it's not getting
731     improved any time soon, the best way to proceed on windows would be to
732     always use "new_exec" and thus never rely on perl's fork "emulation".
733    
734 root 1.5 Cygwin perl is not supported at the moment due to some hilarious
735 root 1.7 shortcomings of its API - see IO::FDPoll for more details. If you never
736     use "send_fh" and always use "new_exec" to create processes, it should
737     work though.
738 root 1.2
739 root 1.3 SEE ALSO
740 root 1.6 AnyEvent::Fork::Early, to avoid executing a perl interpreter at all
741     (part of this distribution).
742 root 1.3
743 root 1.6 AnyEvent::Fork::Template, to create a process by forking the main
744     program at a convenient time (part of this distribution).
745    
746     AnyEvent::Fork::RPC, for simple RPC to child processes (on CPAN).
747    
748 root 1.7 AnyEvent::Fork::Pool, for simple worker process pools (on CPAN).
749    
750 root 1.6 AUTHOR AND CONTACT INFORMATION
751 root 1.2 Marc Lehmann <schmorp@schmorp.de>
752 root 1.6 http://software.schmorp.de/pkg/AnyEvent-Fork
753 root 1.5