/cvs/AnyEvent-Fork/Fork.pm
Revision: 1.18
Committed: Sat Apr 6 01:33:56 2013 UTC (11 years, 2 months ago) by root
Branch: MAIN
Changes since 1.17: +51 -28 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3 root 1.4 AnyEvent::Fork - everything you wanted to use fork() for, but couldn't
4 root 1.1
5     =head1 SYNOPSIS
6    
7 root 1.4 use AnyEvent::Fork;
8 root 1.1
9 root 1.9 ##################################################################
10     # create a single new process, tell it to run your worker function
11    
12     AnyEvent::Fork
13     ->new
14     ->require ("MyModule")
15     ->run ("MyModule::worker", sub {
16     my ($master_filehandle) = @_;
17    
18     # now $master_filehandle is connected to the
19     # $slave_filehandle in the new process.
20     });
21    
22     # MyModule::worker might look like this
23     sub MyModule::worker {
24     my ($slave_filehandle) = @_;
25    
26     # now $slave_filehandle is connected to the $master_filehandle
27     # in the original process. have fun!
28     }
29    
30     ##################################################################
31     # create a pool of server processes all accepting on the same socket
32    
33     # create listener socket
34     my $listener = ...;
35    
36     # create a pool template, initialise it and give it the socket
37     my $pool = AnyEvent::Fork
38     ->new
39     ->require ("Some::Stuff", "My::Server")
40     ->send_fh ($listener);
41    
42     # now create 10 identical workers
43     for my $id (1..10) {
44     $pool
45     ->fork
46     ->send_arg ($id)
47     ->run ("My::Server::run");
48     }
49    
50     # now do other things - maybe use the filehandle provided by run
51     # to wait for the processes to die. or whatever.
52    
53     # My::Server::run might look like this
54     sub My::Server::run {
55     my ($slave, $listener, $id) = @_;
56    
57     close $slave; # we do not use the socket, so close it to save resources
58    
59     # we could go ballistic and use e.g. AnyEvent here, or IO::AIO,
60     # or anything we usually couldn't do in a process forked normally.
61     while (my $socket = $listener->accept) {
62     # do sth. with new socket
63     }
64     }
65    
66 root 1.1 =head1 DESCRIPTION
67    
68 root 1.4 This module allows you to create new processes, without actually forking
69     them from your current process (avoiding the problems of forking), but
70     preserving most of the advantages of fork.
71    
72     It can be used to create new worker processes or new independent
73     subprocesses for short- and long-running jobs, process pools (e.g. for use
74     in pre-forked servers) but also to spawn new external processes (such as
75 root 1.17 CGI scripts from a web server), which can be faster (and more well behaved)
76 root 1.4 than using fork+exec in big processes.
77 root 1.1
78 root 1.5 Special care has been taken to make this module useful from other modules,
79     while still supporting specialised environments such as L<App::Staticperl>
80     or L<PAR::Packer>.
81    
82 root 1.16 =head1 WHAT THIS MODULE IS NOT
83    
84     This module only creates processes and lets you pass file handles and
85     strings to it, and run perl code. It does not implement any kind of RPC -
86     there is no back channel from the process back to you, and there is no RPC
87     or message passing going on.
88    
89     If you need some form of RPC, you can either implement it yourself
90     in whatever way you like, use some message-passing module such
91     as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use
92     L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
93     and so on.
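
As a rough illustration of the "implement it yourself" option, here is a
minimal sketch that frames JSON messages with a 32-bit length prefix. The
helpers C<write_msg>, C<read_exact> and C<read_msg> are hypothetical and
not part of this module, the I/O is blocking for brevity, and JSON::PP is
used only because it ships with perl. Inside an event loop you would more
likely let L<AnyEvent::Handle> do the buffering.

```perl
use strict;
use warnings;
use JSON::PP ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

my $json = JSON::PP->new->utf8->canonical;

# hypothetical helper: send one message as 32-bit length + JSON octets
sub write_msg {
   my ($fh, $data) = @_;

   my $frame = pack "N/a*", $json->encode ($data);
   my $off   = 0;

   while ($off < length ($frame)) {
      my $len = syswrite $fh, $frame, length ($frame) - $off, $off;
      defined $len or die "write_msg: $!";
      $off += $len;
   }
}

# hypothetical helper: read exactly $want octets, return undef on EOF
sub read_exact {
   my ($fh, $want) = @_;

   my $buf = "";

   while (length ($buf) < $want) {
      my $len = sysread $fh, $buf, $want - length ($buf), length ($buf);
      defined $len or die "read_exact: $!";
      return undef unless $len;
   }

   $buf
}

# hypothetical helper: read one framed message, return undef on EOF
sub read_msg {
   my ($fh) = @_;

   my $head = read_exact ($fh, 4);
   defined $head or return undef;

   my $body = read_exact ($fh, unpack "N", $head);
   defined $body or die "read_msg: truncated message";

   $json->decode ($body)
}

# stand-in for the socket that connects the two processes
socketpair my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC
   or die "socketpair: $!";

write_msg ($left, { id => 1, op => "ping" });
my $msg = read_msg ($right);
```

In real use, one end would be the file handle passed to the C<run>
callback and the other the one passed to the worker function.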
94    
95 root 1.1 =head1 PROBLEM STATEMENT
96    
97     There are two ways to implement parallel processing on UNIX-like operating
98     systems - fork and process, and fork+exec and process. They have different
99     advantages and disadvantages that I describe below, together with how this
100     module tries to mitigate the disadvantages.
101    
102     =over 4
103    
104     =item Forking from a big process can be very slow (a 5GB process needs
105     0.05s to fork on my 3.6GHz amd64 GNU/Linux box for example). This overhead
106     is often shared with exec (because you have to fork first), but in some
107     circumstances (e.g. when vfork is used), fork+exec can be much faster.
108    
109     This module can help here by telling a small(er) helper process to fork,
110     or fork+exec instead.
111    
112     =item Forking usually creates a copy-on-write copy of the parent
113     process. Memory already in use (for example, by modules or data files that
114     have been loaded) will not take additional memory. When exec'ing a new process, modules
115 root 1.17 and data files might need to be loaded again, at extra CPU and memory
116 root 1.1 cost. Likewise when forking, all data structures are copied as well - if
117     the program frees them and replaces them by new data, the child processes
118     will retain the memory even if it isn't used.
119    
120     This module allows the main program to do a controlled fork, and allows
121     modules to exec processes safely at any time. When creating a custom
122     process pool you can take advantage of data sharing via fork without
123     risking to share large dynamic data structures that will blow up child
124     memory usage.
125    
126     =item Exec'ing a new perl process might be difficult and slow. For
127     example, it is not easy to find the correct path to the perl interpreter,
128     and all modules have to be loaded from disk again. Long running processes
129     might run into problems when perl is upgraded for example.
130    
131     This module supports creating pre-initialised perl processes to be used
132     as template, and also tries hard to identify the correct path to the perl
133     interpreter. With a cooperative main program, exec'ing the interpreter
134     might not even be necessary.
135    
136     =item Forking might be impossible when a program is running. For example,
137 root 1.17 POSIX makes it almost impossible to fork from a multi-threaded program and
138 root 1.1 do anything useful in the child - strictly speaking, if your perl program
139     uses posix threads (even indirectly via e.g. L<IO::AIO> or L<threads>),
140     you cannot call fork on the perl level anymore, at all.
141    
142 root 1.17 This module can safely fork helper processes at any time, by calling
143 root 1.1 fork+exec in C, in a POSIX-compatible way.
144    
145     =item Parallel processing with fork might be inconvenient or difficult
146     to implement. For example, when a program uses an event loop and creates
147     watchers it becomes very hard to use the event loop from a child
148     program, as the watchers already exist but are only meaningful in the
149     parent. Worse, a module might want to use such a system, not knowing
150     whether another module or the main program also does, leading to problems.
151    
152     This module only lets the main program create pools by forking (because
153     only the main program can know when it is still safe to do so) - all other
154     pools are created by fork+exec, after which such modules can again be
155     loaded.
156    
157     =back
158    
159 root 1.3 =head1 CONCEPTS
160    
161     This module can create new processes either by executing a new perl
162     process, or by forking from an existing "template" process.
163    
164     Each such process comes with its own file handle that can be used to
165     communicate with it (it's actually a socket - one end in the new process,
166     one end in the main process), and among the things you can do in it are
167     load modules, fork new processes, send file handles to it, and execute
168     functions.
169    
170     There are multiple ways to create additional processes to execute some
171     jobs:
172    
173     =over 4
174    
175     =item fork a new process from the "default" template process, load code,
176     run it
177    
178     This module has a "default" template process which it executes when it is
179     needed the first time. Forking from this process shares the memory used
180     for the perl interpreter with the new process, but loading modules takes
181     time, and the memory is not shared with anything else.
182    
183     This is ideal for when you only need one extra process of a kind, with the
184 root 1.17 option of starting and stopping it on demand.
185 root 1.3
186 root 1.9 Example:
187    
188     AnyEvent::Fork
189     ->new
190     ->require ("Some::Module")
191     ->run ("Some::Module::run", sub {
192     my ($fork_fh) = @_;
193     });
194    
195 root 1.3 =item fork a new template process, load code, then fork processes off of
196     it and run the code
197    
198     When you need to have a bunch of processes that all execute the same (or
199     very similar) tasks, then a good way is to create a new template process
200     for them, loading all the modules you need, and then create your worker
201     processes from this new template process.
202    
203     This way, all code (and data structures) that can be shared (e.g. the
204     modules you loaded) is shared between the processes, and each new process
205     consumes relatively little memory of its own.
206    
207     The disadvantage of this approach is that you need to create a template
208     process for the sole purpose of forking new processes from it, but if you
209 root 1.17 only need a fixed number of processes you can create them, and then destroy
210 root 1.3 the template process.
211    
212 root 1.9 Example:
213    
214     my $template = AnyEvent::Fork->new->require ("Some::Module");
215    
216     for (1..10) {
217     $template->fork->run ("Some::Module::run", sub {
218     my ($fork_fh) = @_;
219     });
220     }
221    
222     # at this point, you can keep $template around to fork new processes
223     # later, or you can destroy it, which causes it to vanish.
224    
225 root 1.3 =item execute a new perl interpreter, load some code, run it
226    
227     This is relatively slow, and doesn't allow you to share memory between
228     multiple processes.
229    
230     The only advantage is that you don't have to have a template process
231     hanging around all the time to fork off some new processes, which might be
232     an advantage when there are long time spans where no extra processes are
233     needed.
234    
235 root 1.9 Example:
236    
237     AnyEvent::Fork
238     ->new_exec
239     ->require ("Some::Module")
240     ->run ("Some::Module::run", sub {
241     my ($fork_fh) = @_;
242     });
243    
244 root 1.3 =back
245    
246     =head1 FUNCTIONS
247    
248 root 1.1 =over 4
249    
250     =cut
251    
252 root 1.4 package AnyEvent::Fork;
253 root 1.1
254     use common::sense;
255    
256 root 1.18 use Errno ();
257 root 1.1
258     use AnyEvent;
259     use AnyEvent::Util ();
260    
261 root 1.15 use IO::FDPass;
262    
263     our $VERSION = 0.2;
264 root 1.12
265 root 1.4 our $PERL; # the path to the perl interpreter, deduced with various forms of magic
266 root 1.1
267 root 1.4 =item my $pool = new AnyEvent::Fork key => value...
268 root 1.1
269     Create a new process pool. The following named parameters are supported:
270    
271     =over 4
272    
273     =back
274    
275     =cut
276    
277 root 1.5 # the early fork template process
278     our $EARLY;
279    
280 root 1.4 # the empty template process
281     our $TEMPLATE;
282    
283     sub _cmd {
284     my $self = shift;
285    
286 root 1.18 # ideally, we would want to use "a (w/a)*" as format string, but perl
287     # versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack
288     # it.
289 root 1.16 push @{ $self->[2] }, pack "L/a*", pack "(w/a*)*", @_;
290 root 1.4
291 root 1.18 unless ($self->[3]) {
292     my $wcb = sub {
293     do {
294     # send the next "thing" in the queue - either a reference to an fh,
295     # or a plain string.
296    
297     if (ref $self->[2][0]) {
298     # send fh
299     unless (IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }) {
300     return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
301     undef $self->[3];
302     die "AnyEvent::Fork: file descriptor send failure: $!";
303     }
304    
305     shift @{ $self->[2] };
306    
307     } else {
308     # send string
309     my $len = syswrite $self->[1], $self->[2][0];
310    
311     unless ($len) {
312     return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
313     undef $self->[3];
314     die "AnyEvent::Fork: command write failure: $!";
315     }
316    
317     substr $self->[2][0], 0, $len, "";
318     shift @{ $self->[2] } unless length $self->[2][0];
319     }
320     } while @{ $self->[2] };
321 root 1.4
322 root 1.18 # everything written
323 root 1.4 undef $self->[3];
324 root 1.9 # invoke run callback
325 root 1.4 $self->[0]->($self->[1]) if $self->[0];
326 root 1.18 };
327    
328     $wcb->();
329    
330     $self->[3] ||= AE::io $self->[1], 1, $wcb
331     if @{ $self->[2] };
332     }
333 root 1.14
334     () # make sure we don't leak the watcher
335 root 1.4 }
336 root 1.1
337 root 1.4 sub _new {
338     my ($self, $fh) = @_;
339 root 1.1
340 root 1.6 AnyEvent::Util::fh_nonblocking $fh, 1;
341    
342 root 1.4 $self = bless [
343     undef, # run callback
344 root 1.1 $fh,
345 root 1.4 [], # write queue - strings or fd's
346     undef, # AE watcher
347     ], $self;
348    
349     $self
350 root 1.1 }
351    
352 root 1.6 # fork template from current process, used by AnyEvent::Fork::Early/Template
353     sub _new_fork {
354     my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
355 root 1.7 my $parent = $$;
356    
357 root 1.6 my $pid = fork;
358    
359     if ($pid eq 0) {
360     require AnyEvent::Fork::Serve;
361 root 1.7 $AnyEvent::Fork::Serve::OWNER = $parent;
362 root 1.6 close $fh;
363 root 1.7 $0 = "$_[1] of $parent";
364 root 1.16 $SIG{CHLD} = 'IGNORE';
365 root 1.6 AnyEvent::Fork::Serve::serve ($slave);
366 root 1.15 exit 0;
367 root 1.6 } elsif (!$pid) {
368     die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
369     }
370    
371     AnyEvent::Fork->_new ($fh)
372     }
373    
374 root 1.4 =item my $proc = new AnyEvent::Fork
375 root 1.1
376 root 1.4 Creates a new "empty" perl interpreter process and returns its process
377     object for further manipulation.
378 root 1.1
379 root 1.4 The new process is forked from a template process that is kept around
380     for this purpose. When it doesn't exist yet, it is created by a call to
381     C<new_exec> and kept around for future calls.
382    
383 root 1.9 When the process object is destroyed, it will release the file handle
384     that connects it with the new process. When the new process has not yet
385     called C<run>, then the process will exit. Otherwise, what happens depends
386     entirely on the code that is executed.
387    
388 root 1.4 =cut
389    
390     sub new {
391     my $class = shift;
392 root 1.1
393 root 1.4 $TEMPLATE ||= $class->new_exec;
394     $TEMPLATE->fork
395 root 1.1 }
396    
397 root 1.4 =item $new_proc = $proc->fork
398    
399     Forks C<$proc>, creating a new process, and returns the process object
400     of the new process.
401    
402     If any of the C<send_> functions have been called before fork, then they
403     will be cloned in the child. For example, in a pre-forked server, you
404     might C<send_fh> the listening socket into the template process, and then
405     keep calling C<fork> and C<run>.
406    
407     =cut
408    
409     sub fork {
410     my ($self) = @_;
411 root 1.1
412     my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
413 root 1.4
414     $self->send_fh ($slave);
415     $self->_cmd ("f");
416    
417     AnyEvent::Fork->_new ($fh)
418     }
419    
420     =item my $proc = new_exec AnyEvent::Fork
421    
422     Creates a new "empty" perl interpreter process and returns its process
423     object for further manipulation.
424    
425     Unlike the C<new> method, this method I<always> spawns a new perl process
426     (except in some cases, see L<AnyEvent::Fork::Early> for details). This
427     reduces the amount of memory sharing that is possible, and is also slower.
428    
429     You should use C<new> whenever possible, except when having a template
430     process around is unacceptable.
431    
432 root 1.17 The path to the perl interpreter is divined using various methods - first
433 root 1.4 C<$^X> is investigated to see if the path ends with something that sounds
434     as if it were the perl interpreter. Failing this, the module falls back to
435     using C<$Config::Config{perlpath}>.
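
The C<$^X> check can be tried in isolation. The following is only a
sketch: C<looks_like_perl> is a hypothetical helper that copies the two
conditions used by C<new_exec>, and the operating system name is passed in
explicitly so that the win32 case can be exercised anywhere.

```perl
use strict;
use warnings;

# hypothetical helper mirroring the test in new_exec: the path must be
# absolute (assumed to always hold on win32) and must end in something
# that looks like a perl binary, optionally with a version and ".exe"
sub looks_like_perl {
   my ($path, $os) = @_;

   ($os eq "MSWin32" || $path =~ m%^/%)
      && $path =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
}

for my $path ("/usr/bin/perl", "/opt/perl/bin/perl5.16.3", "perl", "/usr/bin/myapp") {
   printf "%-26s %s\n", $path,
      looks_like_perl ($path, "linux") ? "use \$^X" : "fall back to Config";
}
```

A relative C<$^X> such as C<perl>, or an embedding binary with a different
name, fails the test, which is when the C<%Config> fallback is used.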
436    
437     =cut
438    
439     sub new_exec {
440     my ($self) = @_;
441    
442 root 1.5 return $EARLY->fork
443     if $EARLY;
444    
445 root 1.4 # first find path of perl
446     my $perl = $^X;
447    
448     # first we try $^X, but the path must be absolute (always on win32), and end in sth.
449     # that looks like perl. this obviously only works for posix and win32
450     unless (
451 root 1.15 ($^O eq "MSWin32" || $perl =~ m%^/%)
452 root 1.4 && $perl =~ m%[/\\]perl(?:[0-9]+(\.[0-9]+)+)?(\.exe)?$%i
453     ) {
454     # if it doesn't look perlish enough, try Config
455     require Config;
456     $perl = $Config::Config{perlpath};
457     $perl =~ s/(?:\Q$Config::Config{_exe}\E)?$/$Config::Config{_exe}/;
458     }
459    
460     require Proc::FastSpawn;
461    
462     my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
463     Proc::FastSpawn::fd_inherit (fileno $slave);
464    
465 root 1.10 # new fh's should always be set cloexec (due to $^F),
466     # but hey, not on win32, so we always clear the inherit flag.
467     Proc::FastSpawn::fd_inherit (fileno $fh, 0);
468    
469 root 1.4 # quick. also doesn't work in win32. of course. what did you expect
470     #local $ENV{PERL5LIB} = join ":", grep !ref, @INC;
471 root 1.1 my %env = %ENV;
472 root 1.15 $env{PERL5LIB} = join +($^O eq "MSWin32" ? ";" : ":"), grep !ref, @INC;
473 root 1.1
474 root 1.4 Proc::FastSpawn::spawn (
475     $perl,
476 root 1.7 ["perl", "-MAnyEvent::Fork::Serve", "-e", "AnyEvent::Fork::Serve::me", fileno $slave, $$],
477 root 1.4 [map "$_=$env{$_}", keys %env],
478     ) or die "unable to spawn AnyEvent::Fork server: $!";
479    
480     $self->_new ($fh)
481     }
482    
483 root 1.9 =item $proc = $proc->eval ($perlcode, @args)
484    
485     Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to
486     the strings specified by C<@args>.
487    
488     This call is meant to do any custom initialisation that might be required
489     (for example, the C<require> method uses it). It's not supposed to be used
490     to completely take over the process, use C<run> for that.
491    
492     The code will usually be executed after this call returns, and there is no
493     way to pass anything back to the calling process. Any evaluation errors
494     will be reported to stderr and cause the process to exit.
495    
496     Returns the process object for easy chaining of method calls.
497    
498     =cut
499    
500     sub eval {
501     my ($self, $code, @args) = @_;
502    
503     $self->_cmd (e => $code, @args);
504    
505     $self
506     }
507    
508 root 1.4 =item $proc = $proc->require ($module, ...)
509 root 1.1
510 root 1.9 Tries to load the given module(s) into the process.
511 root 1.1
512 root 1.4 Returns the process object for easy chaining of method calls.
513 root 1.1
514 root 1.9 =cut
515    
516     sub require {
517     my ($self, @modules) = @_;
518    
519     s%::%/%g for @modules;
520     $self->eval ('require "$_.pm" for @_', @modules);
521    
522     $self
523     }
524    
525 root 1.4 =item $proc = $proc->send_fh ($handle, ...)
526 root 1.1
527 root 1.4 Send one or more file handles (I<not> file descriptors) to the process,
528     to prepare a call to C<run>.
529 root 1.1
530 root 1.4 The process object keeps a reference to the handles until this is done,
531     so you must not explicitly close the handles. This is most easily
532     accomplished by simply not storing the file handles anywhere after passing
533     them to this method.
534    
535     Returns the process object for easy chaining of method calls.
536    
537 root 1.17 Example: pass a file handle to a process, and release it without
538     closing. It will be closed automatically when it is no longer used.
539 root 1.9
540     $proc->send_fh ($my_fh);
541     undef $my_fh; # free the reference if you want, but DO NOT CLOSE IT
542    
543 root 1.4 =cut
544    
545     sub send_fh {
546     my ($self, @fh) = @_;
547    
548     for my $fh (@fh) {
549     $self->_cmd ("h");
550     push @{ $self->[2] }, \$fh;
551     }
552    
553     $self
554 root 1.1 }
555    
556 root 1.4 =item $proc = $proc->send_arg ($string, ...)
557    
558     Send one or more argument strings to the process, to prepare a call to
559     C<run>. The strings can be any octet string.
560    
561 root 1.18 The protocol is optimised to pass a moderate number of relatively short
562     strings - while you can pass up to 4GB of data in one go, this is more
563     meant to pass some ID information or other startup info, not big chunks of
564     data.
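
To make the size limit a bit more concrete, this sketch reproduces the
framing that C<_cmd> in this file applies to each command: every string
gets a BER-compressed length prefix, and the whole command gets a native
32-bit length prefix, which is where the 4GB ceiling comes from. The names
C<encode_cmd> and C<decode_cmd> are hypothetical, and the decoder only
shows how such a frame can be taken apart again. It is not a copy of what
the server process does.

```perl
use strict;
use warnings;

# same pack expression as in _cmd: "(w/a*)*" prefixes each string with its
# BER-compressed length, "L/a*" prefixes the result with a 32-bit length
sub encode_cmd {
   pack "L/a*", pack "(w/a*)*", @_
}

# hypothetical decoder working on an in-memory buffer
sub decode_cmd {
   my ($buf) = @_;

   my $len = unpack "L", $buf;

   unpack "(w/a*)*", substr $buf, 4, $len
}

# "a" is the command letter that send_arg passes to _cmd
my $frame = encode_cmd (a => "worker-7", "startup info");
my ($cmd, @args) = decode_cmd ($frame);
```

Short strings cost one extra octet each, so a handful of IDs or
configuration values adds very little overhead.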
565    
566 root 1.17 Returns the process object for easy chaining of method calls.
567 root 1.4
568     =cut
569 root 1.1
570 root 1.4 sub send_arg {
571     my ($self, @arg) = @_;
572 root 1.1
573 root 1.4 $self->_cmd (a => @arg);
574 root 1.1
575     $self
576     }
577    
578 root 1.4 =item $proc->run ($func, $cb->($fh))
579    
580     Enter the function specified by the fully qualified name in C<$func> in
581     the process. The function is called with the communication socket as first
582     argument, followed by all file handles and string arguments sent earlier
583     via C<send_fh> and C<send_arg> methods, in the order they were called.
584    
585     If the called function returns, the process exits.
586    
587     Preparing the process can take time - when the process is ready, the
588     callback is invoked with the local communications socket as argument.
589    
590     The process object becomes unusable on return from this function.
591    
592     If the communication socket isn't used, it should be closed on both sides,
593     to save on kernel memory.
594    
595     The socket is non-blocking in the parent, and blocking in the newly
596     created process. The close-on-exec flag is set on both. Even if not used
597 root 1.17 otherwise, the socket can be a good indicator for the existence of the
598 root 1.8 process - if the other process exits, you get a readable event on it,
599 root 1.4 because exiting the process closes the socket (if it didn't create any
600     children using fork).
601    
602 root 1.9 Example: create a template for a process pool, pass a few strings, some
603     file handles, then fork, pass one more string, and run some code.
604    
605     my $pool = AnyEvent::Fork
606     ->new
607     ->send_arg ("str1", "str2")
608     ->send_fh ($fh1, $fh2);
609    
610     for my $id (1..2) {
611     $pool
612     ->fork
613     ->send_arg ("str3")
614     ->run ("Some::function", sub {
615     my ($fh) = @_;
616    
617     # fh is nonblocking, but we trust that the OS can accept these
618     # few octets anyway.
619     syswrite $fh, "hi #$id\n";
620    
621     # $fh is being closed here, as we don't store it anywhere
622     });
623     }
624    
625     # Some::function might look like this - all parameters passed before fork
626     # and after will be passed, in order, after the communications socket.
627     sub Some::function {
628     my ($fh, $str1, $str2, $fh1, $fh2, $str3) = @_;
629    
630     print scalar <$fh>; # prints "hi #1\n" in one process and "hi #2\n" in the other
631     }
632    
633 root 1.4 =cut
634    
635     sub run {
636     my ($self, $func, $cb) = @_;
637    
638     $self->[0] = $cb;
639 root 1.9 $self->_cmd (r => $func);
640 root 1.4 }
641    
642 root 1.1 =back
643    
644 root 1.16 =head1 PERFORMANCE
645    
646     Now for some unscientific benchmark numbers (all done on an amd64
647     GNU/Linux box). These are intended to give you an idea of the relative
648 root 1.18 performance you can expect, they are not meant to be absolute performance
649     numbers.
650 root 1.16
651 root 1.17 OK, so, I ran a simple benchmark that creates a socket pair, forks, calls
652 root 1.16 exit in the child and waits for the socket to close in the parent. I did
653 root 1.18 load AnyEvent, EV and AnyEvent::Fork, for a total process size of 5100kB.
654 root 1.16
655 root 1.18 2079 new processes per second, using manual socketpair + fork
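
For reference, a manual socketpair + fork benchmark of this kind could
look roughly like the sketch below. It is not the script used for the
numbers in this section: the helper name C<spawn_and_wait> and the
iteration count are made up, it does not load AnyEvent or EV, and the
child uses C<POSIX::_exit> instead of a plain exit to keep destructors out
of the picture.

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use Time::HiRes qw(time);

# fork a child that exits immediately, then wait for EOF on our end of
# the socket pair, which arrives once the exiting child closes its end
sub spawn_and_wait {
   socketpair my $parent_fh, my $child_fh, AF_UNIX, SOCK_STREAM, PF_UNSPEC
      or die "socketpair: $!";

   my $pid = fork;
   defined $pid or die "fork: $!";

   unless ($pid) {
      close $parent_fh;
      POSIX::_exit (0); # child: leave without running END blocks
   }

   close $child_fh;

   my $len = sysread $parent_fh, my $buf, 1;
   defined $len or die "sysread: $!";

   waitpid $pid, 0; # reap the child so zombies do not pile up

   $len # 0 means we saw EOF
}

my $count = 200; # arbitrary, just enough for a rough number
my $start = time;
spawn_and_wait () for 1 .. $count;
my $elapsed = time - $start;

printf "%.0f new processes per second\n", $count / $elapsed;
```

The absolute number depends heavily on the machine and on the size of the
process that runs the loop.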
656 root 1.16
657     Then I did the same thing, but instead of calling fork, I called
658     AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
659     socket from the child to close on exit. This does the same thing as manual
660 root 1.17 socket pair + fork, except that what is forked is the template process
661 root 1.16 (2440kB), and the socket needs to be passed to the server at the other end
662     of the socket first.
663    
664     2307 new processes per second, using AnyEvent::Fork->new
665    
666     And finally, using C<new_exec> instead of C<new>, using vforks+execs to exec
667     a new perl interpreter and compile the small server each time, I get:
668    
669     479 vfork+execs per second, using AnyEvent::Fork->new_exec
670    
671 root 1.17 So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even
672     though it uses the same operations, but adds a lot of overhead?
673 root 1.16
674     The difference is simply the process size: forking the 5MB process takes
675     so much longer than forking the 2.5MB template process that the overhead
676     introduced is canceled out.
677    
678     If the benchmark process grows, the normal fork becomes even slower:
679    
680     1340 new processes per second, manual fork in a 20MB process
681     731 new processes per second, manual fork in a 200MB process
682     235 new processes per second, manual fork in a 2000MB process
683    
684 root 1.17 What that means (to me) is that I can use this module without having a
685     very bad conscience because of the extra overhead required to start new
686 root 1.16 processes.
687    
688 root 1.15 =head1 TYPICAL PROBLEMS
689    
690     This section lists typical problems that remain. I hope by recognising
691     them, most can be avoided.
692    
693     =over 4
694    
695 root 1.17 =item exit runs destructors
696    
697 root 1.15 =item "leaked" file descriptors for exec'ed processes
698    
699     POSIX systems inherit file descriptors by default when exec'ing a new
700     process. While perl itself laudably sets the close-on-exec flags on new
701     file handles, most C libraries don't care, and even if all cared, it's
702     often not possible to set the flag in a race-free manner.
703    
704     That means some file descriptors can leak through. And since it isn't
705 root 1.17 possible to know which file descriptors are "good" and "necessary" (or
706     even to know which file descriptors are open), there is no good way to
707 root 1.15 close the ones that might harm.
708    
709     As an example of what "harm" can be done consider a web server that
710     accepts connections and afterwards some module uses AnyEvent::Fork for the
711     first time, causing it to fork and exec a new process, which might inherit
712     the network socket. When the server closes the socket, it is still open
713     in the child (which doesn't even know that) and the client might conclude
714     that the connection is still fine.
715    
716     For the main program, there are multiple remedies available -
717     L<AnyEvent::Fork::Early> is one, creating a process early and not using
718     C<new_exec> is another, as in both cases, the first process can be exec'ed
719     well before many random file descriptors are open.
720    
721     In general, the solution for this kind of problem is to fix the
722     libraries or the code that leaks those file descriptors.
723    
724 root 1.17 Fortunately, most of these leaked descriptors do no harm, other than
725 root 1.15 sitting on some resources.
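
To check whether a particular perl file handle would survive an exec, the
close-on-exec flag mentioned above can be inspected and changed with
C<fcntl>. This is a sketch for POSIX systems only, and C<is_cloexec> and
C<set_cloexec> are hypothetical helper names, not functions of this
module. It cannot help with descriptors opened behind perl's back by C
libraries, which is the real problem described here.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# hypothetical helper: true if the handle would be closed on exec
sub is_cloexec {
   my ($fh) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "fcntl F_GETFD: $!";

   $flags & FD_CLOEXEC
}

# hypothetical helper: set or clear the close-on-exec flag
sub set_cloexec {
   my ($fh, $on) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "fcntl F_GETFD: $!";

   $flags = $on ? $flags | FD_CLOEXEC : $flags & ~FD_CLOEXEC;

   fcntl $fh, F_SETFD, $flags
      or die "fcntl F_SETFD: $!";
}

pipe my $r, my $w or die "pipe: $!";

printf "new pipe: close-on-exec is %s\n", is_cloexec ($w) ? "set" : "clear";

set_cloexec ($w, 0); # this handle would now leak into exec'ed processes
printf "after clearing: close-on-exec is %s\n", is_cloexec ($w) ? "set" : "clear";
```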
726    
727     =item "leaked" file descriptors for fork'ed processes
728    
729     Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
730     which closes file descriptors not marked for being inherited.
731    
732     However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
733     a way to create these processes by forking, and this leaks more file
734     descriptors than exec'ing them, as there is no way to mark descriptors as
735     "close on fork".
736    
737     An example would be modules like L<EV>, L<IO::AIO> or L<Gtk2>. The first
738     two create pipes for internal uses, and L<Gtk2> might open a connection to
739     the X server. L<EV> and L<IO::AIO> can deal with fork, but Gtk2 might have
740     trouble with a fork.
741    
742     The solution is to either not load these modules before use'ing
743     L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
744     initialising them, for example, by calling C<init Gtk2> manually.
745    
746     =back
747    
748 root 1.8 =head1 PORTABILITY NOTES
749    
750 root 1.10 Native win32 perls are somewhat supported (AnyEvent::Fork::Early is a nop,
751     and ::Template is not going to work), and it cost a lot of blood and sweat
752     to make it so, mostly due to the bloody broken perl that nobody seems to
753     care about. The fork emulation is a bad joke - I have yet to see something
754 root 1.17 useful that you can do with it without running into memory corruption
755 root 1.10 issues or other braindamage. Hrrrr.
756    
757     Cygwin perl is not supported at the moment, as it should implement fd
758     passing, but doesn't, and rolling my own is hard, as cygwin doesn't
759     support enough functionality to do it.
760 root 1.8
761 root 1.13 =head1 SEE ALSO
762    
763     L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),
764     L<AnyEvent::Fork::Template> (to create a process by forking the main
765     program at a convenient time).
766    
767 root 1.1 =head1 AUTHOR
768    
769     Marc Lehmann <schmorp@schmorp.de>
770     http://home.schmorp.de/
771    
772     =cut
773    
774     1
775