/cvs/AnyEvent-Fork/Fork.pm
Revision: 1.24
Committed: Sat Apr 6 08:32:23 2013 UTC (11 years, 2 months ago) by root
Branch: MAIN
Changes since 1.23: +44 -38 lines

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3 root 1.4 AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
4 root 1.1
5     =head1 SYNOPSIS
6    
7 root 1.4 use AnyEvent::Fork;
8 root 1.1
9 root 1.24 AnyEvent::Fork
10     ->new
11     ->require ("MyModule")
12     ->run ("MyModule::server", my $cv = AE::cv);
13    
14     my $fh = $cv->recv;
15    
16     =head1 DESCRIPTION
17    
18     This module allows you to create new processes, without actually forking
19     them from your current process (avoiding the problems of forking), but
20     preserving most of the advantages of fork.
21    
22     It can be used to create new worker processes or new independent
23     subprocesses for short- and long-running jobs, process pools (e.g. for use
24     in pre-forked servers) but also to spawn new external processes (such as
25     CGI scripts from a web server), which can be faster (and more well behaved)
26     than using fork+exec in big processes.
27    
28     Special care has been taken to make this module useful from other modules,
29     while still supporting specialised environments such as L<App::Staticperl>
30     or L<PAR::Packer>.
31    
32     =head1 WHAT THIS MODULE IS NOT
33    
34     This module only creates processes and lets you pass file handles and
35     strings to them, and run perl code. It does not implement any kind of RPC -
36     there is no back channel from the process back to you, and there is no RPC
37     or message passing going on.
38    
39     If you need some form of RPC, you can either implement it yourself
40     in whatever way you like, use some message-passing module such
41     as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use
42     L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
43     and so on.
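One way to implement message passing yourself is to frame each message with a length prefix and send it over the communication socket. The following is a minimal sketch, not part of AnyEvent::Fork: the helper names C<encode_msg> and C<read_msg> are hypothetical, the framing (a 4-byte big-endian length followed by JSON) is an arbitrary choice, and a local socketpair stands in for the socket that C<run> hands to both sides.

```perl
use strict;
use warnings;
use Socket;
use JSON::PP ();

my $json = JSON::PP->new->utf8->canonical;

# hypothetical helper: frame a data structure as a 4-byte big-endian
# length followed by that many octets of JSON.
sub encode_msg {
    my ($data) = @_;
    pack "N/a*", $json->encode ($data)
}

# hypothetical helper: read exactly one frame from a blocking handle.
# returns the decoded structure, or nothing on EOF or a short read.
sub read_msg {
    my ($fh) = @_;

    my $got = read $fh, my $head, 4;
    return unless defined $got && $got == 4;

    my $len = unpack "N", $head;
    $got = read $fh, my $body, $len;
    return unless defined $got && $got == $len;

    $json->decode ($body)
}

# a local socketpair stands in for the master/slave handles here
socketpair (my $master, my $slave, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

syswrite ($master, encode_msg ({ cmd => "add", args => [1, 2] }))
    or die "write: $!";

my $msg = read_msg ($slave);
print "$msg->{cmd}: @{ $msg->{args} }\n";
```

The length prefix lets the reader know where one message ends, which a stream socket does not tell you by itself. In the parent, where the socket is non-blocking, something like L<AnyEvent::Handle> would be the more natural reader.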
44    
45     =head1 EXAMPLES
46    
47     =head2 Create a single new process, tell it to run your worker function.
48 root 1.9
49     AnyEvent::Fork
50     ->new
51     ->require ("MyModule")
52     ->run ("MyModule::worker", sub {
53     my ($master_filehandle) = @_;
54    
55     # now $master_filehandle is connected to the
56     # $slave_filehandle in the new process.
57     });
58    
59     # MyModule::worker might look like this
60     sub MyModule::worker {
61     my ($slave_filehandle) = @_;
62    
63     # now $slave_filehandle is connected to the $master_filehandle
64     # in the original process. have fun!
65     }
66    
67 root 1.24 =head2 Create a pool of server processes all accepting on the same socket.
68 root 1.9
69     # create listener socket
70     my $listener = ...;
71    
72     # create a pool template, initialise it and give it the socket
73     my $pool = AnyEvent::Fork
74     ->new
75     ->require ("Some::Stuff", "My::Server")
76     ->send_fh ($listener);
77    
78     # now create 10 identical workers
79     for my $id (1..10) {
80     $pool
81     ->fork
82     ->send_arg ($id)
83     ->run ("My::Server::run");
84     }
85    
86     # now do other things - maybe use the filehandle provided by run
87     # to wait for the processes to die. or whatever.
88    
89     # My::Server::run might look like this
90     sub My::Server::run {
91     my ($slave, $listener, $id) = @_;
92    
93     close $slave; # we do not use the socket, so close it to save resources
94    
95     # we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
96     # or anything we usually couldn't do in a process forked normally.
97     while (my $socket = $listener->accept) {
98     # do sth. with new socket
99     }
100     }
101    
102 root 1.24 =head2 Use AnyEvent::Fork as a faster fork+exec
103 root 1.23
104 root 1.24 This runs /bin/echo hi, with stdout redirected to /tmp/log and stderr to
105     the communications socket. It is usually faster than fork+exec, but still
106     lets you prepare the environment.
107 root 1.23
108     open my $output, ">/tmp/log" or die "$!";
109    
110     AnyEvent::Fork
111     ->new
112     ->eval ('
113     sub run {
114     my ($fh, $output, @cmd) = @_;
115    
116     # perl will clear close-on-exec on STDOUT/STDERR
117     open STDOUT, ">&", $output or die;
118     open STDERR, ">&", $fh or die;
119    
120     exec @cmd;
121     }
122     ')
123     ->send_fh ($output)
124     ->send_arg ("/bin/echo", "hi")
125     ->run ("run", my $cv = AE::cv);
126    
127     my $stderr = $cv->recv;
128    
129 root 1.1 =head1 PROBLEM STATEMENT
130    
131     There are two ways to implement parallel processing on UNIX-like operating
132     systems - fork and process, and fork+exec and process. They have different
133     advantages and disadvantages that I describe below, together with how this
134     module tries to mitigate the disadvantages.
135    
136     =over 4
137    
138     =item Forking from a big process can be very slow (a 5GB process needs
139     0.05s to fork on my 3.6GHz amd64 GNU/Linux box for example). This overhead
140     is often shared with exec (because you have to fork first), but in some
141     circumstances (e.g. when vfork is used), fork+exec can be much faster.
142    
143     This module can help here by telling a small(er) helper process to fork,
144     or fork+exec instead.
145    
146     =item Forking usually creates a copy-on-write copy of the parent
147     process. Memory (for example, modules or data files that have been
148     loaded) will not take additional memory. When exec'ing a new process, modules
149 root 1.17 and data files might need to be loaded again, at extra CPU and memory
150 root 1.1 cost. Likewise when forking, all data structures are copied as well - if
151     the program frees them and replaces them by new data, the child processes
152     will retain the memory even if it isn't used.
153    
154     This module allows the main program to do a controlled fork, and allows
155     modules to exec processes safely at any time. When creating a custom
156     process pool you can take advantage of data sharing via fork without
157     risking sharing large dynamic data structures that will blow up child
158     memory usage.
159    
160     =item Exec'ing a new perl process might be difficult and slow. For
161     example, it is not easy to find the correct path to the perl interpreter,
162     and all modules have to be loaded from disk again. Long running processes
163     might run into problems when perl is upgraded for example.
164    
165     This module supports creating pre-initialised perl processes to be used
166     as template, and also tries hard to identify the correct path to the perl
167     interpreter. With a cooperative main program, exec'ing the interpreter
168     might not even be necessary.
169    
170     =item Forking might be impossible when a program is running. For example,
171 root 1.17 POSIX makes it almost impossible to fork from a multi-threaded program and
172 root 1.1 do anything useful in the child - strictly speaking, if your perl program
173     uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),
174     you cannot call fork on the perl level anymore, at all.
175    
176 root 1.17 This module can safely fork helper processes at any time, by calling
177 root 1.1 fork+exec in C, in a POSIX-compatible way.
178    
179     =item Parallel processing with fork might be inconvenient or difficult
180     to implement. For example, when a program uses an event loop and creates
181     watchers it becomes very hard to use the event loop from a child
182     program, as the watchers already exist but are only meaningful in the
183     parent. Worse, a module might want to use such a system, not knowing
184     whether another module or the main program also does, leading to problems.
185    
186     This module only lets the main program create pools by forking (because
187     only the main program can know when it is still safe to do so) - all other
188     pools are created by fork+exec, after which such modules can again be
189     loaded.
190    
191     =back
192    
193 root 1.3 =head1 CONCEPTS
194    
195     This module can create new processes either by executing a new perl
196     process, or by forking from an existing "template" process.
197    
198     Each such process comes with its own file handle that can be used to
199     communicate with it (it's actually a socket - one end in the new process,
200     one end in the main process), and among the things you can do in it are
201     load modules, fork new processes, send file handles to it, and execute
202     functions.
203    
204     There are multiple ways to create additional processes to execute some
205     jobs:
206    
207     =over 4
208    
209     =item fork a new process from the "default" template process, load code,
210     run it
211    
212     This module has a "default" template process which it executes when it is
213     needed the first time. Forking from this process shares the memory used
214     for the perl interpreter with the new process, but loading modules takes
215     time, and the memory is not shared with anything else.
216    
217     This is ideal for when you only need one extra process of a kind, with the
218 root 1.17 option of starting and stopping it on demand.
219 root 1.3
220 root 1.9 Example:
221    
222     AnyEvent::Fork
223     ->new
224     ->require ("Some::Module")
225     ->run ("Some::Module::run", sub {
226     my ($fork_fh) = @_;
227     });
228    
229 root 1.3 =item fork a new template process, load code, then fork processes off of
230     it and run the code
231    
232     When you need to have a bunch of processes that all execute the same (or
233     very similar) tasks, then a good way is to create a new template process
234     for them, loading all the modules you need, and then create your worker
235     processes from this new template process.
236    
237     This way, all code (and data structures) that can be shared (e.g. the
238     modules you loaded) is shared between the processes, and each new process
239     consumes relatively little memory of its own.
240    
241     The disadvantage of this approach is that you need to create a template
242     process for the sole purpose of forking new processes from it, but if you
243 root 1.17 only need a fixed number of processes you can create them, and then destroy
244 root 1.3 the template process.
245    
246 root 1.9 Example:
247    
248     my $template = AnyEvent::Fork->new->require ("Some::Module");
249    
250     for (1..10) {
251     $template->fork->run ("Some::Module::run", sub {
252     my ($fork_fh) = @_;
253     });
254     }
255    
256     # at this point, you can keep $template around to fork new processes
257     # later, or you can destroy it, which causes it to vanish.
258    
259 root 1.3 =item execute a new perl interpreter, load some code, run it
260    
261     This is relatively slow, and doesn't allow you to share memory between
262     multiple processes.
263    
264     The only advantage is that you don't have to have a template process
265     hanging around all the time to fork off some new processes, which might be
266     an advantage when there are long time spans where no extra processes are
267     needed.
268    
269 root 1.9 Example:
270    
271     AnyEvent::Fork
272     ->new_exec
273     ->require ("Some::Module")
274     ->run ("Some::Module::run", sub {
275     my ($fork_fh) = @_;
276     });
277    
278 root 1.3 =back
279    
280     =head1 FUNCTIONS
281    
282 root 1.1 =over 4
283    
284     =cut
285    
286 root 1.4 package AnyEvent::Fork;
287 root 1.1
288     use common::sense;
289    
290 root 1.18 use Errno ();
291 root 1.1
292     use AnyEvent;
293     use AnyEvent::Util ();
294    
295 root 1.15 use IO::FDPass;
296    
297 root 1.21 our $VERSION = 0.5;
298 root 1.12
299 root 1.4 our $PERL; # the path to the perl interpreter, deduced with various forms of magic
300 root 1.1
301 root 1.4 =item my $pool = new AnyEvent::Fork key => value...
302 root 1.1
303     Create a new process pool. The following named parameters are supported:
304    
305     =over 4
306    
307     =back
308    
309     =cut
310    
311 root 1.5 # the early fork template process
312     our $EARLY;
313    
314 root 1.4 # the empty template process
315     our $TEMPLATE;
316    
317     sub _cmd {
318     my $self = shift;
319    
320 root 1.18 # ideally, we would want to use "a (w/a)*" as format string, but perl
321     # versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack
322     # it.
323 root 1.19 push @{ $self->[2] }, pack "a L/a*", $_[0], $_[1];
324 root 1.4
325 root 1.19 $self->[3] ||= AE::io $self->[1], 1, sub {
326     do {
327     # send the next "thing" in the queue - either a reference to an fh,
328     # or a plain string.
329    
330     if (ref $self->[2][0]) {
331     # send fh
332     unless (IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }) {
333     return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
334     undef $self->[3];
335     die "AnyEvent::Fork: file descriptor send failure: $!";
336 root 1.18 }
337 root 1.4
338 root 1.19 shift @{ $self->[2] };
339 root 1.18
340 root 1.19 } else {
341     # send string
342     my $len = syswrite $self->[1], $self->[2][0];
343    
344     unless ($len) {
345     return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
346     undef $self->[3];
347     die "AnyEvent::Fork: command write failure: $!";
348     }
349 root 1.18
350 root 1.19 substr $self->[2][0], 0, $len, "";
351     shift @{ $self->[2] } unless length $self->[2][0];
352     }
353     } while @{ $self->[2] };
354    
355     # everything written
356     undef $self->[3];
357    
358     # invoke run callback, if any
359 root 1.20 $self->[4]->($self->[1]) if $self->[4];
360 root 1.19 };
361 root 1.14
362     () # make sure we don't leak the watcher
363 root 1.4 }
364 root 1.1
365 root 1.4 sub _new {
366 root 1.19 my ($self, $fh, $pid) = @_;
367 root 1.1
368 root 1.6 AnyEvent::Util::fh_nonblocking $fh, 1;
369    
370 root 1.4 $self = bless [
371 root 1.20 $pid,
372 root 1.1 $fh,
373 root 1.4 [], # write queue - strings or fd's
374     undef, # AE watcher
375     ], $self;
376    
377     $self
378 root 1.1 }
379    
380 root 1.6 # fork template from current process, used by AnyEvent::Fork::Early/Template
381     sub _new_fork {
382     my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
383 root 1.7 my $parent = $$;
384    
385 root 1.6 my $pid = fork;
386    
387     if ($pid eq 0) {
388     require AnyEvent::Fork::Serve;
389 root 1.7 $AnyEvent::Fork::Serve::OWNER = $parent;
390 root 1.6 close $fh;
391 root 1.7 $0 = "$_[1] of $parent";
392 root 1.16 $SIG{CHLD} = 'IGNORE';
393 root 1.6 AnyEvent::Fork::Serve::serve ($slave);
394 root 1.15 exit 0;
395 root 1.6 } elsif (!$pid) {
396     die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
397     }
398    
399 root 1.19 AnyEvent::Fork->_new ($fh, $pid)
400 root 1.6 }
401    
402 root 1.4 =item my $proc = new AnyEvent::Fork
403 root 1.1
404 root 1.4 Creates a new "empty" perl interpreter process and returns its process
405     object for further manipulation.
406 root 1.1
407 root 1.4 The new process is forked from a template process that is kept around
408     for this purpose. When it doesn't exist yet, it is created by a call to
409     C<new_exec> and kept around for future calls.
410    
411 root 1.9 When the process object is destroyed, it will release the file handle
412     that connects it with the new process. When the new process has not yet
413     called C<run>, then the process will exit. Otherwise, what happens depends
414     entirely on the code that is executed.
415    
416 root 1.4 =cut
417    
418     sub new {
419     my $class = shift;
420 root 1.1
421 root 1.4 $TEMPLATE ||= $class->new_exec;
422     $TEMPLATE->fork
423 root 1.1 }
424    
425 root 1.4 =item $new_proc = $proc->fork
426    
427     Forks C<$proc>, creating a new process, and returns the process object
428     of the new process.
429    
430     If any of the C<send_> functions have been called before fork, then they
431     will be cloned in the child. For example, in a pre-forked server, you
432     might C<send_fh> the listening socket into the template process, and then
433     keep calling C<fork> and C<run>.
434    
435     =cut
436    
437     sub fork {
438     my ($self) = @_;
439 root 1.1
440     my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
441 root 1.4
442     $self->send_fh ($slave);
443     $self->_cmd ("f");
444    
445     AnyEvent::Fork->_new ($fh)
446     }
447    
448     =item my $proc = new_exec AnyEvent::Fork
449    
450     Creates a new "empty" perl interpreter process and returns its process
451     object for further manipulation.
452    
453     Unlike the C<new> method, this method I<always> spawns a new perl process
454     (except in some cases, see L<AnyEvent::Fork::Early> for details). This
455     reduces the amount of memory sharing that is possible, and is also slower.
456    
457     You should use C<new> whenever possible, except when having a template
458     process around is unacceptable.
459    
460 root 1.17 The path to the perl interpreter is divined using various methods - first
461 root 1.4 C<$^X> is investigated to see if the path ends with something that sounds
462     as if it were the perl interpreter. Failing this, the module falls back to
463     using C<$Config::Config{perlpath}>.
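The lookup described above can be tried out in isolation. The sketch below mirrors the checks made in C<new_exec>, but the helper names C<looks_like_perl> and C<find_perl> are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use Config ();

# hypothetical helper: does the path look like an absolute path to perl?
# (on win32, $^X is always absolute, so only the name is checked there)
sub looks_like_perl {
    my ($path) = @_;

    ($^O eq "MSWin32" || $path =~ m%^/%)
        && $path =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
        ? 1 : 0
}

# hypothetical helper: try $^X first, then fall back to Config
sub find_perl {
    my $perl = $^X;

    unless (looks_like_perl ($perl)) {
        $perl = $Config::Config{perlpath};
        # make sure the executable suffix (e.g. ".exe") is present exactly once
        $perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;
    }

    $perl
}

print "perl interpreter: ", find_perl (), "\n";
```

Note that the pattern only accepts a version suffix with at least one dot in it, so a name such as C<perl5.16.3> passes while C<perl5> falls through to the Config lookup.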
464    
465     =cut
466    
467     sub new_exec {
468     my ($self) = @_;
469    
470 root 1.5 return $EARLY->fork
471     if $EARLY;
472    
473 root 1.4 # first find path of perl
474     my $perl = $^X;
475    
476     # first we try $^X, but the path must be absolute (always on win32), and end in sth.
477     # that looks like perl. this obviously only works for posix and win32
478     unless (
479 root 1.15 ($^O eq "MSWin32" || $perl =~ m%^/%)
480 root 1.4 && $perl =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
481     ) {
482     # if it doesn't look perlish enough, try Config
483     require Config;
484     $perl = $Config::Config{perlpath};
485     $perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;
486     }
487    
488     require Proc::FastSpawn;
489    
490     my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
491     Proc::FastSpawn::fd_inherit (fileno $slave);
492    
493 root 1.10 # new fh's should always be set cloexec (due to $^F),
494     # but hey, not on win32, so we always clear the inherit flag.
495     Proc::FastSpawn::fd_inherit (fileno $fh, 0);
496    
497 root 1.4 # quick. also doesn't work in win32. of course. what did you expect
498     #local $ENV{PERL5LIB} = join ":", grep !ref, @INC;
499 root 1.1 my %env = %ENV;
500 root 1.15 $env{PERL5LIB} = join +($^O eq "MSWin32" ? ";" : ":"), grep !ref, @INC;
501 root 1.1
502 root 1.19 my $pid = Proc::FastSpawn::spawn (
503 root 1.4 $perl,
504 root 1.7 ["perl", "-MAnyEvent::Fork::Serve", "-e", "AnyEvent::Fork::Serve::me", fileno $slave, $$],
505 root 1.4 [map "$_=$env{$_}", keys %env],
506     ) or die "unable to spawn AnyEvent::Fork server: $!";
507    
508 root 1.19 $self->_new ($fh, $pid)
509 root 1.4 }
510    
511 root 1.20 =item $pid = $proc->pid
512    
513     Returns the process id of the process I<iff it is a direct child of the
514     process> running AnyEvent::Fork, and C<undef> otherwise.
515    
516     Normally, only processes created via C<< AnyEvent::Fork->new_exec >> and
517     L<AnyEvent::Fork::Template> are direct children, and you are responsible
518     to clean up their zombies when they die.
519    
520     All other processes are not direct children, and will be cleaned up by
521     AnyEvent::Fork.
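Cleaning up after a direct child can be done with any of the usual mechanisms. The following is a minimal sketch using only core perl, in which a plain C<fork> stands in for a process whose id would otherwise come from C<< $proc->pid >>. Inside an AnyEvent program, a child watcher such as C<AE::child> would be the non-blocking counterpart to the blocking C<waitpid> shown here.

```perl
use strict;
use warnings;
use POSIX ();

# stand-in for a direct child; with AnyEvent::Fork the pid would come
# from $proc->pid instead (which is undef for indirect children).
my $pid = fork;
defined $pid or die "fork: $!";

if ($pid == 0) {
    POSIX::_exit (3);   # leave immediately, without running destructors
}

# blocking reap - this is what removes the zombie
my $reaped = waitpid $pid, 0;
my $status = $? >> 8;

print "reaped child $reaped, exit status $status\n";
```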
522    
523     =cut
524    
525     sub pid {
526     $_[0][0]
527     }
528    
529 root 1.9 =item $proc = $proc->eval ($perlcode, @args)
530    
531     Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to
532 root 1.23 the strings specified by C<@args>, in the "main" package.
533 root 1.9
534     This call is meant to do any custom initialisation that might be required
535     (for example, the C<require> method uses it). It's not supposed to be used
536     to completely take over the process, use C<run> for that.
537    
538     The code will usually be executed after this call returns, and there is no
539     way to pass anything back to the calling process. Any evaluation errors
540     will be reported to stderr and cause the process to exit.
541    
542 root 1.23 If you want to execute some code to take over the process (see the
543     "fork+exec" example in the EXAMPLES section), you should compile a function via
544     C<eval> first, and then call it via C<run>. This also gives you access to
545     any arguments passed via the C<send_xxx> methods, such as file handles.
546    
547 root 1.9 Returns the process object for easy chaining of method calls.
548    
549     =cut
550    
551     sub eval {
552     my ($self, $code, @args) = @_;
553    
554 root 1.19 $self->_cmd (e => pack "(w/a*)*", $code, @args);
555 root 1.9
556     $self
557     }
558    
559 root 1.4 =item $proc = $proc->require ($module, ...)
560 root 1.1
561 root 1.9 Tries to load the given module(s) into the process.
562 root 1.1
563 root 1.4 Returns the process object for easy chaining of method calls.
564 root 1.1
565 root 1.9 =cut
566    
567     sub require {
568     my ($self, @modules) = @_;
569    
570     s%::%/%g for @modules;
571     $self->eval ('require "$_.pm" for @_', @modules);
572    
573     $self
574     }
575    
576 root 1.4 =item $proc = $proc->send_fh ($handle, ...)
577 root 1.1
578 root 1.4 Send one or more file handles (I<not> file descriptors) to the process,
579     to prepare a call to C<run>.
580 root 1.1
581 root 1.4 The process object keeps a reference to the handles until this is done,
582     so you must not explicitly close the handles. This is most easily
583     accomplished by simply not storing the file handles anywhere after passing
584     them to this method.
585    
586     Returns the process object for easy chaining of method calls.
587    
588 root 1.17 Example: pass a file handle to a process, and release it without
589     closing. It will be closed automatically when it is no longer used.
590 root 1.9
591     $proc->send_fh ($my_fh);
592     undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
593    
594 root 1.4 =cut
595    
596     sub send_fh {
597     my ($self, @fh) = @_;
598    
599     for my $fh (@fh) {
600     $self->_cmd ("h");
601     push @{ $self->[2] }, \$fh;
602     }
603    
604     $self
605 root 1.1 }
606    
607 root 1.4 =item $proc = $proc->send_arg ($string, ...)
608    
609     Send one or more argument strings to the process, to prepare a call to
610     C<run>. The strings can be any octet string.
611    
612 root 1.18 The protocol is optimised to pass a moderate number of relatively short
613     strings - while you can pass up to 4GB of data in one go, this is more
614     meant to pass some ID information or other startup info, not big chunks of
615     data.
616    
617 root 1.17 Returns the process object for easy chaining of method calls.
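To make the wire format concrete: judging from the C<_cmd> and C<send_arg> sources in this file, each command is one type octet followed by a native 32-bit length and the payload, and the payload of an C<a> command is a sequence of strings, each prefixed with its BER-encoded length. The sketch below only illustrates that packing with plain pack and unpack. The helper name C<frame> and the decoding steps are assumptions made for illustration, not the code that AnyEvent::Fork::Serve actually runs.

```perl
use strict;
use warnings;

# hypothetical helper mirroring the pack call in _cmd:
# one type octet, a native 32-bit length, then the payload.
sub frame {
    my ($type, $payload) = @_;
    pack "a L/a*", $type, $payload
}

my @args  = ("worker-7", "binary\0data");
my $frame = frame (a => pack "(w/a*)*", @args);

# decode in two steps: fixed-size header first, then the payload
my ($type, $len) = unpack "a L", $frame;
my $payload      = substr $frame, 5, $len;   # 1 type octet + 4 length octets
my @decoded      = unpack "(w/a*)*", $payload;

printf "type=%s, %d payload octets, %d strings\n", $type, $len, scalar @decoded;
```

The 32-bit length field is where the 4GB limit per call comes from, and the per-string length prefix is why arbitrary octet strings, including ones with embedded NUL octets, survive the trip.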
618 root 1.4
619     =cut
620 root 1.1
621 root 1.4 sub send_arg {
622     my ($self, @arg) = @_;
623 root 1.1
624 root 1.19 $self->_cmd (a => pack "(w/a*)*", @arg);
625 root 1.1
626     $self
627     }
628    
629 root 1.4 =item $proc->run ($func, $cb->($fh))
630    
631 root 1.23 Enter the function specified by the function name in C<$func> in the
632     process. The function is called with the communication socket as first
633 root 1.4 argument, followed by all file handles and string arguments sent earlier
634     via C<send_fh> and C<send_arg> methods, in the order they were called.
635    
636 root 1.23 The function name should be fully qualified, but if it isn't, it will be
637     looked up in the main package.
638 root 1.4
639 root 1.23 If the called function returns, doesn't exist, or any error occurs, the
640     process exits.
641 root 1.4
642 root 1.23 Preparing the process is done in the background - when all commands have
643     been sent, the callback is invoked with the local communications socket
644     as argument. At this point you can start using the socket in any way you
645     like.
646    
647     The process object becomes unusable on return from this function - any
648     further method calls result in undefined behaviour.
649 root 1.4
650     If the communication socket isn't used, it should be closed on both sides,
651     to save on kernel memory.
652    
653     The socket is non-blocking in the parent, and blocking in the newly
654 root 1.23 created process. The close-on-exec flag is set in both.
655    
656     Even if not used otherwise, the socket can be a good indicator for the
657     existence of the process - if the other process exits, you get a readable
658     event on it, because exiting the process closes the socket (if it didn't
659     create any children using fork).
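The end-of-file behaviour described above can be observed with nothing but core perl. In this minimal sketch a manual socketpair and C<fork> stand in for the socket and process that C<run> would provide. In an AnyEvent program you would register a read watcher (for example with C<AE::io $fh, 0, $cb>) instead of blocking in C<sysread>.

```perl
use strict;
use warnings;
use Socket;
use POSIX ();

socketpair (my $master, my $slave, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my $pid = fork;
defined $pid or die "fork: $!";

if ($pid == 0) {
    close $master;
    POSIX::_exit (0);   # exiting closes the child's end of the socket
}

# the parent must drop its own copy of the child's end,
# otherwise the socket never reports end-of-file.
close $slave;

# blocks until the child is gone, then returns 0 (end-of-file)
my $n = sysread $master, my $buf, 1;
waitpid $pid, 0;

print "child has exited\n" if defined $n && $n == 0;
```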
660 root 1.4
661 root 1.9 Example: create a template for a process pool, pass a few strings, some
662     file handles, then fork, pass one more string, and run some code.
663    
664     my $pool = AnyEvent::Fork
665     ->new
666     ->send_arg ("str1", "str2")
667     ->send_fh ($fh1, $fh2);
668    
669     for (1..2) {
670     $pool
671     ->fork
672     ->send_arg ("str3")
673     ->run ("Some::function", sub {
674     my ($fh) = @_;
675    
676     # fh is nonblocking, but we trust that the OS can accept these
677 root 1.22 # few octets anyway.
678 root 1.9 syswrite $fh, "hi #$_\n";
679    
680     # $fh is being closed here, as we don't store it anywhere
681     });
682     }
683    
684     # Some::function might look like this - all parameters passed before fork
685     # and after will be passed, in order, after the communications socket.
686     sub Some::function {
687     my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
688    
689 root 1.22 print scalar <$fh>; # prints "hi #1\n" and "hi #2\n" in any order
690 root 1.9 }
691    
692 root 1.4 =cut
693    
694     sub run {
695     my ($self, $func, $cb) = @_;
696    
697 root 1.20 $self->[4] = $cb;
698 root 1.9 $self->_cmd (r => $func);
699 root 1.4 }
700    
701 root 1.1 =back
702    
703 root 1.16 =head1 PERFORMANCE
704    
705     Now for some unscientific benchmark numbers (all done on an amd64
706     GNU/Linux box). These are intended to give you an idea of the relative
707 root 1.18 performance you can expect, they are not meant to be absolute performance
708     numbers.
709 root 1.16
710 root 1.17 OK, so, I ran a simple benchmark that creates a socket pair, forks, calls
711 root 1.16 exit in the child and waits for the socket to close in the parent. I did
712 root 1.18 load AnyEvent, EV and AnyEvent::Fork, for a total process size of 5100kB.
713 root 1.16
714 root 1.18 2079 new processes per second, using manual socketpair + fork
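A benchmark of this shape might look like the sketch below. It is a reconstruction under assumptions, not the script that produced the numbers in this section: the helper names C<spawn_once> and C<benchmark> and the iteration count are made up, it does not load AnyEvent or EV, and the child leaves via C<POSIX::_exit> rather than C<exit>, so the figures it prints are not comparable to the ones quoted here.

```perl
use strict;
use warnings;
use Socket;
use POSIX ();
use Time::HiRes ();

# hypothetical helper: create a socket pair, fork, let the child exit
# at once, and wait in the parent until the socket reports end-of-file.
sub spawn_once {
    socketpair (my $fh, my $slave, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";

    my $pid = fork;
    defined $pid or die "fork: $!";

    POSIX::_exit (0) unless $pid;   # child: leave immediately

    close $slave;                   # parent: drop the child's end
    sysread $fh, my $buf, 1;        # returns at end-of-file
    waitpid $pid, 0;                # avoid piling up zombies

    1
}

# hypothetical helper: returns new processes per second
sub benchmark {
    my ($count) = @_;

    my $start = Time::HiRes::time ();
    spawn_once () for 1 .. $count;
    my $elapsed = Time::HiRes::time () - $start;

    $count / ($elapsed || 1e-9)
}

printf "%.0f new processes per second\n", benchmark (100);
```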
715 root 1.16
716     Then I did the same thing, but instead of calling fork, I called
717     AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
718     socket from the child to close on exit. This does the same thing as manual
719 root 1.17 socket pair + fork, except that what is forked is the template process
720 root 1.16 (2440kB), and the socket needs to be passed to the server at the other end
721     of the socket first.
722    
723     2307 new processes per second, using AnyEvent::Fork->new
724    
725     And finally, using C<new_exec> instead of C<new>, using vforks+execs to exec
726     a new perl interpreter and compile the small server each time, I get:
727    
728     479 vfork+execs per second, using AnyEvent::Fork->new_exec
729    
730 root 1.17 So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even
731     though it uses the same operations, but adds a lot of overhead?
732 root 1.16
733     The difference is simply the process size: forking the 6MB process takes
734     so much longer than forking the 2.5MB template process that the overhead
735     introduced is canceled out.
736    
737     If the benchmark process grows, the normal fork becomes even slower:
738    
739     1340 new processes, manual fork in a 20MB process
740     731 new processes, manual fork in a 200MB process
741     235 new processes, manual fork in a 2000MB process
742    
743 root 1.17 What that means (to me) is that I can use this module without having a
744     very bad conscience because of the extra overhead required to start new
745 root 1.16 processes.
746    
747 root 1.15 =head1 TYPICAL PROBLEMS
748    
749     This section lists typical problems that remain. I hope by recognising
750     them, most can be avoided.
751    
752     =over 4
753    
754     =item "leaked" file descriptors for exec'ed processes
755    
756     POSIX systems inherit file descriptors by default when exec'ing a new
757     process. While perl itself laudably sets the close-on-exec flags on new
758     file handles, most C libraries don't care, and even if all cared, it's
759     often not possible to set the flag in a race-free manner.
760    
761     That means some file descriptors can leak through. And since it isn't
762 root 1.17 possible to know which file descriptors are "good" and "necessary" (or
763     even to know which file descriptors are open), there is no good way to
764 root 1.15 close the ones that might harm.
765    
766     As an example of what "harm" can be done consider a web server that
767     accepts connections and afterwards some module uses AnyEvent::Fork for the
768     first time, causing it to fork and exec a new process, which might inherit
769     the network socket. When the server closes the socket, it is still open
770     in the child (which doesn't even know that) and the client might conclude
771     that the connection is still fine.
772    
773     For the main program, there are multiple remedies available -
774     L<AnyEvent::Fork::Early> is one, creating a process early and not using
775     C<new_exec> is another, as in both cases, the first process can be exec'ed
776     well before many random file descriptors are open.
777    
778     In general, the solution for this kind of problem is to fix the
779     libraries or the code that leaks those file descriptors.
780    
781 root 1.17 Fortunately, most of these leaked descriptors do no harm, other than
782 root 1.15 sitting on some resources.
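When the leaking code is your own, the close-on-exec flag can be inspected and changed from perl with C<fcntl>. The following is a minimal sketch. The helper names C<is_cloexec> and C<set_cloexec> are hypothetical, and setting the flag after the descriptor has been created is exactly the kind of operation that is not race-free in a multi-threaded process, as noted above.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# hypothetical helper: returns 1 if the handle is closed on exec, else 0
sub is_cloexec {
    my ($fh) = @_;

    my $flags = fcntl $fh, F_GETFD, 0;
    defined $flags or die "fcntl F_GETFD: $!";

    ($flags + 0) & FD_CLOEXEC ? 1 : 0
}

# hypothetical helper: set or clear the close-on-exec flag
sub set_cloexec {
    my ($fh, $on) = @_;

    my $flags = fcntl $fh, F_GETFD, 0;
    defined $flags or die "fcntl F_GETFD: $!";

    $flags += 0;   # fcntl may return the string "0 but true"
    $flags = $on ? $flags | FD_CLOEXEC : $flags & ~FD_CLOEXEC;

    fcntl ($fh, F_SETFD, $flags)
        or die "fcntl F_SETFD: $!";
}

pipe (my $r, my $w) or die "pipe: $!";

set_cloexec ($r, 0);
print "inheritable across exec: ", (is_cloexec ($r) ? "no" : "yes"), "\n";

set_cloexec ($r, 1);
print "inheritable across exec: ", (is_cloexec ($r) ? "no" : "yes"), "\n";
```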
783    
784     =item "leaked" file descriptors for fork'ed processes
785    
786     Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
787     which closes file descriptors not marked for being inherited.
788    
789     However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
790     a way to create these processes by forking, and this leaks more file
791     descriptors than exec'ing them, as there is no way to mark descriptors as
792     "close on fork".
793    
794     An example would be modules like L<EV>, L<IO::AIO> or L<Gtk2>. Both create
795     pipes for internal uses, and L<Gtk2> might open a connection to the X
796     server. L<EV> and L<IO::AIO> can deal with fork, but Gtk2 might have
797     trouble with a fork.
798    
799     The solution is to either not load these modules before use'ing
800     L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
801     initialising them, for example, by calling C<init Gtk2> manually.
802    
803 root 1.19 =item exit runs destructors
804    
805     This only applies to users of L<AnyEvent::Fork::Early> and
806     L<AnyEvent::Fork::Template>.
807    
808     When a process created by AnyEvent::Fork exits, it might do so by calling
809     exit, or by simply letting perl reach the end of the program, at which point
810     Perl runs all destructors.
811    
812     Not all destructors are fork-safe - for example, an object that represents
813     the connection to an X display might tell the X server to free resources,
814     which is inconvenient when the "real" object in the parent still needs to
815     use them.
816    
817     This is obviously not a problem for L<AnyEvent::Fork::Early>, as you used
818     it as the very first thing, right?
819    
820     It is a problem for L<AnyEvent::Fork::Template> though - and the solution
821     is to not create objects with nontrivial destructors that might have an
822     effect outside of Perl.
823    
824 root 1.15 =back
825    
826 root 1.8 =head1 PORTABILITY NOTES
827    
828 root 1.10 Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a nop,
829     and ::Template is not going to work), and it cost a lot of blood and sweat
830     to make it so, mostly due to the bloody broken perl that nobody seems to
831     care about. The fork emulation is a bad joke - I have yet to see something
832 root 1.17 useful that you can do with it without running into memory corruption
833 root 1.10 issues or other braindamage. Hrrrr.
834    
835     Cygwin perl is not supported at the moment, as it should implement fd
836     passing, but doesn't, and rolling my own is hard, as cygwin doesn't
837     support enough functionality to do it.
838 root 1.8
839 root 1.13 =head1 SEE ALSO
840    
841     L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),
842     L<AnyEvent::Fork::Template> (to create a process by forking the main
843     program at a convenient time).
844    
845 root 1.1 =head1 AUTHOR
846    
847     Marc Lehmann <schmorp@schmorp.de>
848     http://home.schmorp.de/
849    
850     =cut
851    
852     1
853