ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/AnyEvent-Fork/Fork.pm
(Generate patch)

Comparing AnyEvent-Fork/Fork.pm (file contents):
Revision 1.31 by root, Sat Apr 6 09:29:26 2013 UTC vs.
Revision 1.44 by root, Thu Apr 18 10:49:59 2013 UTC

27 27
28Special care has been taken to make this module useful from other modules, 28Special care has been taken to make this module useful from other modules,
29while still supporting specialised environments such as L<App::Staticperl> 29while still supporting specialised environments such as L<App::Staticperl>
30or L<PAR::Packer>. 30or L<PAR::Packer>.
31 31
32=head1 WHAT THIS MODULE IS NOT 32=head2 WHAT THIS MODULE IS NOT
33 33
34This module only creates processes and lets you pass file handles and 34This module only creates processes and lets you pass file handles and
35strings to it, and run perl code. It does not implement any kind of RPC - 35strings to it, and run perl code. It does not implement any kind of RPC -
36there is no back channel from the process back to you, and there is no RPC 36there is no back channel from the process back to you, and there is no RPC
37or message passing going on. 37or message passing going on.
38 38
39If you need some form of RPC, you can either implement it yourself 39If you need some form of RPC, you could use the L<AnyEvent::Fork::RPC>
40in whatever way you like, use some message-passing module such 40companion module, which adds simple RPC/job queueing to a process created
41as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use 41by this module.
42L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
43and so on.
44 42
43Or you can implement it yourself in whatever way you like, use some
44message-passing module such as L<AnyEvent::MP>, some pipe such as
45L<AnyEvent::ZeroMQ>, use L<AnyEvent::Handle> on both sides to send
46e.g. JSON or Storable messages, and so on.
47
48=head2 COMPARISON TO OTHER MODULES
49
50There is an abundance of modules on CPAN that do "something fork", such as
51L<Parallel::ForkManager>, L<AnyEvent::ForkManager>, L<AnyEvent::Worker>
52or L<AnyEvent::Subprocess>. There are modules that implement their own
53process management, such as L<AnyEvent::DBI>.
54
55The problems that all these modules try to solve are real, however, none
56of them (from what I have seen) tackle the very real problems of unwanted
57memory sharing, efficiency, not being able to use event processing or
58similar modules in the processes they create.
59
60This module doesn't try to replace any of them - instead it tries to solve
61the problem of creating processes with a minimum of fuss and overhead (and
62also luxury). Ideally, most of these would use AnyEvent::Fork internally,
63except they were written before AnyEvent:Fork was available, so obviously
64had to roll their own.
65
45=head1 PROBLEM STATEMENT 66=head2 PROBLEM STATEMENT
46 67
47There are two traditional ways to implement parallel processing on UNIX 68There are two traditional ways to implement parallel processing on UNIX
48like operating systems - fork and process, and fork+exec and process. They 69like operating systems - fork and process, and fork+exec and process. They
49have different advantages and disadvantages that I describe below, 70have different advantages and disadvantages that I describe below,
50together with how this module tries to mitigate the disadvantages. 71together with how this module tries to mitigate the disadvantages.
152 173
153 # now $master_filehandle is connected to the 174 # now $master_filehandle is connected to the
154 # $slave_filehandle in the new process. 175 # $slave_filehandle in the new process.
155 }); 176 });
156 177
157MyModule might look like this: 178C<MyModule> might look like this:
158 179
159 package MyModule; 180 package MyModule;
160 181
161 sub worker { 182 sub worker {
162 my ($slave_filehandle) = @_; 183 my ($slave_filehandle) = @_;
185 } 206 }
186 207
187 # now do other things - maybe use the filehandle provided by run 208 # now do other things - maybe use the filehandle provided by run
188 # to wait for the processes to die. or whatever. 209 # to wait for the processes to die. or whatever.
189 210
190My::Server might look like this: 211C<My::Server> might look like this:
191 212
192 package My::Server; 213 package My::Server;
193 214
194 sub run { 215 sub run {
195 my ($slave, $listener, $id) = @_; 216 my ($slave, $listener, $id) = @_;
203 } 224 }
204 } 225 }
205 226
206=head2 use AnyEvent::Fork as a faster fork+exec 227=head2 use AnyEvent::Fork as a faster fork+exec
207 228
208This runs /bin/echo hi, with stdout redirected to /tmp/log and stderr to 229This runs C</bin/echo hi>, with standard output redirected to F</tmp/log>
209the communications socket. It is usually faster than fork+exec, but still 230and standard error redirected to the communications socket. It is usually
210let's you prepare the environment. 231faster than fork+exec, but still lets you prepare the environment.
211 232
212 open my $output, ">/tmp/log" or die "$!"; 233 open my $output, ">/tmp/log" or die "$!";
213 234
214 AnyEvent::Fork 235 AnyEvent::Fork
215 ->new 236 ->new
216 ->eval (' 237 ->eval ('
238 # compile a helper function for later use
217 sub run { 239 sub run {
218 my ($fh, $output, @cmd) = @_; 240 my ($fh, $output, @cmd) = @_;
219 241
220 # perl will clear close-on-exec on STDOUT/STDERR 242 # perl will clear close-on-exec on STDOUT/STDERR
221 open STDOUT, ">&", $output or die; 243 open STDOUT, ">&", $output or die;
351use AnyEvent; 373use AnyEvent;
352use AnyEvent::Util (); 374use AnyEvent::Util ();
353 375
354use IO::FDPass; 376use IO::FDPass;
355 377
356our $VERSION = 0.5; 378our $VERSION = 0.6;
357
358our $PERL; # the path to the perl interpreter, deduces with various forms of magic
359
360=over 4
361
362=back
363
364=cut
365 379
366# the early fork template process 380# the early fork template process
367our $EARLY; 381our $EARLY;
368 382
369# the empty template process 383# the empty template process
370our $TEMPLATE; 384our $TEMPLATE;
385
386sub QUEUE() { 0 }
387sub FH() { 1 }
388sub WW() { 2 }
389sub PID() { 3 }
390sub CB() { 4 }
391
392sub _new {
393 my ($self, $fh, $pid) = @_;
394
395 AnyEvent::Util::fh_nonblocking $fh, 1;
396
397 $self = bless [
398 [], # write queue - strings or fd's
399 $fh,
400 undef, # AE watcher
401 $pid,
402 ], $self;
403
404 $self
405}
371 406
372sub _cmd { 407sub _cmd {
373 my $self = shift; 408 my $self = shift;
374 409
375 # ideally, we would want to use "a (w/a)*" as format string, but perl 410 # ideally, we would want to use "a (w/a)*" as format string, but perl
376 # versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack 411 # versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack
377 # it. 412 # it.
378 push @{ $self->[2] }, pack "a L/a*", $_[0], $_[1]; 413 push @{ $self->[QUEUE] }, pack "a L/a*", $_[0], $_[1];
379 414
380 $self->[3] ||= AE::io $self->[1], 1, sub { 415 $self->[WW] ||= AE::io $self->[FH], 1, sub {
381 do { 416 do {
382 # send the next "thing" in the queue - either a reference to an fh, 417 # send the next "thing" in the queue - either a reference to an fh,
383 # or a plain string. 418 # or a plain string.
384 419
385 if (ref $self->[2][0]) { 420 if (ref $self->[QUEUE][0]) {
386 # send fh 421 # send fh
387 unless (IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }) { 422 unless (IO::FDPass::send fileno $self->[FH], fileno ${ $self->[QUEUE][0] }) {
388 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK; 423 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
389 undef $self->[3]; 424 undef $self->[WW];
390 die "AnyEvent::Fork: file descriptor send failure: $!"; 425 die "AnyEvent::Fork: file descriptor send failure: $!";
391 } 426 }
392 427
393 shift @{ $self->[2] }; 428 shift @{ $self->[QUEUE] };
394 429
395 } else { 430 } else {
396 # send string 431 # send string
397 my $len = syswrite $self->[1], $self->[2][0]; 432 my $len = syswrite $self->[FH], $self->[QUEUE][0];
398 433
399 unless ($len) { 434 unless ($len) {
400 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK; 435 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
401 undef $self->[3]; 436 undef $self->[3];
402 die "AnyEvent::Fork: command write failure: $!"; 437 die "AnyEvent::Fork: command write failure: $!";
403 } 438 }
404 439
405 substr $self->[2][0], 0, $len, ""; 440 substr $self->[QUEUE][0], 0, $len, "";
406 shift @{ $self->[2] } unless length $self->[2][0]; 441 shift @{ $self->[QUEUE] } unless length $self->[QUEUE][0];
407 } 442 }
408 } while @{ $self->[2] }; 443 } while @{ $self->[QUEUE] };
409 444
410 # everything written 445 # everything written
411 undef $self->[3]; 446 undef $self->[WW];
412 447
413 # invoke run callback, if any 448 # invoke run callback, if any
414 $self->[4]->($self->[1]) if $self->[4]; 449 $self->[CB]->($self->[FH]) if $self->[CB];
415 }; 450 };
416 451
417 () # make sure we don't leak the watcher 452 () # make sure we don't leak the watcher
418}
419
420sub _new {
421 my ($self, $fh, $pid) = @_;
422
423 AnyEvent::Util::fh_nonblocking $fh, 1;
424
425 $self = bless [
426 $pid,
427 $fh,
428 [], # write queue - strings or fd's
429 undef, # AE watcher
430 ], $self;
431
432 $self
433} 453}
434 454
435# fork template from current process, used by AnyEvent::Fork::Early/Template 455# fork template from current process, used by AnyEvent::Fork::Early/Template
436sub _new_fork { 456sub _new_fork {
437 my ($fh, $slave) = AnyEvent::Util::portable_socketpair; 457 my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
442 if ($pid eq 0) { 462 if ($pid eq 0) {
443 require AnyEvent::Fork::Serve; 463 require AnyEvent::Fork::Serve;
444 $AnyEvent::Fork::Serve::OWNER = $parent; 464 $AnyEvent::Fork::Serve::OWNER = $parent;
445 close $fh; 465 close $fh;
446 $0 = "$_[1] of $parent"; 466 $0 = "$_[1] of $parent";
447 $SIG{CHLD} = 'IGNORE';
448 AnyEvent::Fork::Serve::serve ($slave); 467 AnyEvent::Fork::Serve::serve ($slave);
449 exit 0; 468 exit 0;
450 } elsif (!$pid) { 469 } elsif (!$pid) {
451 die "AnyEvent::Fork::Early/Template: unable to fork template process: $!"; 470 die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
452 } 471 }
559} 578}
560 579
561=item $pid = $proc->pid 580=item $pid = $proc->pid
562 581
563Returns the process id of the process I<iff it is a direct child of the 582Returns the process id of the process I<iff it is a direct child of the
564process> running AnyEvent::Fork, and C<undef> otherwise. 583process running AnyEvent::Fork>, and C<undef> otherwise.
565 584
566Normally, only processes created via C<< AnyEvent::Fork->new_exec >> and 585Normally, only processes created via C<< AnyEvent::Fork->new_exec >> and
567L<AnyEvent::Fork::Template> are direct children, and you are responsible 586L<AnyEvent::Fork::Template> are direct children, and you are responsible
568to clean up their zombies when they die. 587to clean up their zombies when they die.
569 588
571AnyEvent::Fork itself. 590AnyEvent::Fork itself.
572 591
573=cut 592=cut
574 593
575sub pid { 594sub pid {
576 $_[0][0] 595 $_[0][PID]
577} 596}
578 597
579=item $proc = $proc->eval ($perlcode, @args) 598=item $proc = $proc->eval ($perlcode, @args)
580 599
581Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to 600Evaluates the given C<$perlcode> as ... Perl code, while setting C<@_> to
582the strings specified by C<@args>, in the "main" package. 601the strings specified by C<@args>, in the "main" package.
583 602
584This call is meant to do any custom initialisation that might be required 603This call is meant to do any custom initialisation that might be required
585(for example, the C<require> method uses it). It's not supposed to be used 604(for example, the C<require> method uses it). It's not supposed to be used
586to completely take over the process, use C<run> for that. 605to completely take over the process, use C<run> for that.
587 606
588The code will usually be executed after this call returns, and there is no 607The code will usually be executed after this call returns, and there is no
589way to pass anything back to the calling process. Any evaluation errors 608way to pass anything back to the calling process. Any evaluation errors
590will be reported to stderr and cause the process to exit. 609will be reported to stderr and cause the process to exit.
591 610
592If you want to execute some code to take over the process (see the 611If you want to execute some code (that isn't in a module) to take over the
593"fork+exec" example in the SYNOPSIS), you should compile a function via 612process, you should compile a function via C<eval> first, and then call
594C<eval> first, and then call it via C<run>. This also gives you access to 613it via C<run>. This also gives you access to any arguments passed via the
595any arguments passed via the C<send_xxx> methods, such as file handles. 614C<send_xxx> methods, such as file handles. See the L<use AnyEvent::Fork as
615a faster fork+exec> example to see it in action.
596 616
597Returns the process object for easy chaining of method calls. 617Returns the process object for easy chaining of method calls.
598 618
599=cut 619=cut
600 620
626=item $proc = $proc->send_fh ($handle, ...) 646=item $proc = $proc->send_fh ($handle, ...)
627 647
628Send one or more file handles (I<not> file descriptors) to the process, 648Send one or more file handles (I<not> file descriptors) to the process,
629to prepare a call to C<run>. 649to prepare a call to C<run>.
630 650
631The process object keeps a reference to the handles until this is done, 651The process object keeps a reference to the handles until they have
632so you must not explicitly close the handles. This is most easily 652been passed over to the process, so you must not explicitly close the
633accomplished by simply not storing the file handles anywhere after passing 653handles. This is most easily accomplished by simply not storing the file
634them to this method. 654handles anywhere after passing them to this method - when AnyEvent::Fork
655is finished using them, perl will automatically close them.
635 656
636Returns the process object for easy chaining of method calls. 657Returns the process object for easy chaining of method calls.
637 658
638Example: pass a file handle to a process, and release it without 659Example: pass a file handle to a process, and release it without
639closing. It will be closed automatically when it is no longer used. 660closing. It will be closed automatically when it is no longer used.
646sub send_fh { 667sub send_fh {
647 my ($self, @fh) = @_; 668 my ($self, @fh) = @_;
648 669
649 for my $fh (@fh) { 670 for my $fh (@fh) {
650 $self->_cmd ("h"); 671 $self->_cmd ("h");
651 push @{ $self->[2] }, \$fh; 672 push @{ $self->[QUEUE] }, \$fh;
652 } 673 }
653 674
654 $self 675 $self
655} 676}
656 677
657=item $proc = $proc->send_arg ($string, ...) 678=item $proc = $proc->send_arg ($string, ...)
658 679
659Send one or more argument strings to the process, to prepare a call to 680Send one or more argument strings to the process, to prepare a call to
660C<run>. The strings can be any octet string. 681C<run>. The strings can be any octet strings.
661 682
662The protocol is optimised to pass a moderate number of relatively short 683The protocol is optimised to pass a moderate number of relatively short
663strings - while you can pass up to 4GB of data in one go, this is more 684strings - while you can pass up to 4GB of data in one go, this is more
664meant to pass some ID information or other startup info, not big chunks of 685meant to pass some ID information or other startup info, not big chunks of
665data. 686data.
681Enter the function specified by the function name in C<$func> in the 702Enter the function specified by the function name in C<$func> in the
682process. The function is called with the communication socket as first 703process. The function is called with the communication socket as first
683argument, followed by all file handles and string arguments sent earlier 704argument, followed by all file handles and string arguments sent earlier
684via C<send_fh> and C<send_arg> methods, in the order they were called. 705via C<send_fh> and C<send_arg> methods, in the order they were called.
685 706
707The process object becomes unusable on return from this function - any
708further method calls result in undefined behaviour.
709
686The function name should be fully qualified, but if it isn't, it will be 710The function name should be fully qualified, but if it isn't, it will be
687looked up in the main package. 711looked up in the C<main> package.
688 712
689If the called function returns, doesn't exist, or any error occurs, the 713If the called function returns, doesn't exist, or any error occurs, the
690process exits. 714process exits.
691 715
692Preparing the process is done in the background - when all commands have 716Preparing the process is done in the background - when all commands have
693been sent, the callback is invoked with the local communications socket 717been sent, the callback is invoked with the local communications socket
694as argument. At this point you can start using the socket in any way you 718as argument. At this point you can start using the socket in any way you
695like. 719like.
696
697The process object becomes unusable on return from this function - any
698further method calls result in undefined behaviour.
699 720
700If the communication socket isn't used, it should be closed on both sides, 721If the communication socket isn't used, it should be closed on both sides,
701to save on kernel memory. 722to save on kernel memory.
702 723
703The socket is non-blocking in the parent, and blocking in the newly 724The socket is non-blocking in the parent, and blocking in the newly
742=cut 763=cut
743 764
744sub run { 765sub run {
745 my ($self, $func, $cb) = @_; 766 my ($self, $func, $cb) = @_;
746 767
747 $self->[4] = $cb; 768 $self->[CB] = $cb;
748 $self->_cmd (r => $func); 769 $self->_cmd (r => $func);
749} 770}
750 771
751=back 772=back
752 773
778 479 vfork+execs per second, using AnyEvent::Fork->new_exec 799 479 vfork+execs per second, using AnyEvent::Fork->new_exec
779 800
780So how can C<< AnyEvent->new >> be faster than a standard fork, even 801So how can C<< AnyEvent->new >> be faster than a standard fork, even
781though it uses the same operations, but adds a lot of overhead? 802though it uses the same operations, but adds a lot of overhead?
782 803
783The difference is simply the process size: forking the 6MB process takes 804The difference is simply the process size: forking the 5MB process takes
784so much longer than forking the 2.5MB template process that the overhead 805so much longer than forking the 2.5MB template process that the extra
785introduced is canceled out. 806overhead is canceled out.
786 807
787If the benchmark process grows, the normal fork becomes even slower: 808If the benchmark process grows, the normal fork becomes even slower:
788 809
789 1340 new processes, manual fork in a 20MB process 810 1340 new processes, manual fork of a 20MB process
790 731 new processes, manual fork in a 200MB process 811 731 new processes, manual fork of a 200MB process
791 235 new processes, manual fork in a 2000MB process 812 235 new processes, manual fork of a 2000MB process
792 813
793What that means (to me) is that I can use this module without having a 814What that means (to me) is that I can use this module without having a bad
794very bad conscience because of the extra overhead required to start new 815conscience because of the extra overhead required to start new processes.
795processes.
796 816
797=head1 TYPICAL PROBLEMS 817=head1 TYPICAL PROBLEMS
798 818
799This section lists typical problems that remain. I hope by recognising 819This section lists typical problems that remain. I hope by recognising
800them, most can be avoided. 820them, most can be avoided.
801 821
802=over 4 822=over 4
803 823
804=item "leaked" file descriptors for exec'ed processes 824=item leaked file descriptors for exec'ed processes
805 825
806POSIX systems inherit file descriptors by default when exec'ing a new 826POSIX systems inherit file descriptors by default when exec'ing a new
807process. While perl itself laudably sets the close-on-exec flags on new 827process. While perl itself laudably sets the close-on-exec flags on new
808file handles, most C libraries don't care, and even if all cared, it's 828file handles, most C libraries don't care, and even if all cared, it's
809often not possible to set the flag in a race-free manner. 829often not possible to set the flag in a race-free manner.
829libraries or the code that leaks those file descriptors. 849libraries or the code that leaks those file descriptors.
830 850
831Fortunately, most of these leaked descriptors do no harm, other than 851Fortunately, most of these leaked descriptors do no harm, other than
832sitting on some resources. 852sitting on some resources.
833 853
834=item "leaked" file descriptors for fork'ed processes 854=item leaked file descriptors for fork'ed processes
835 855
836Normally, L<AnyEvent::Fork> does start new processes by exec'ing them, 856Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
837which closes file descriptors not marked for being inherited. 857which closes file descriptors not marked for being inherited.
838 858
839However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer 859However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
848 868
849The solution is to either not load these modules before use'ing 869The solution is to either not load these modules before use'ing
850L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay 870L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
851initialising them, for example, by calling C<init Gtk2> manually. 871initialising them, for example, by calling C<init Gtk2> manually.
852 872
853=item exit runs destructors 873=item exiting calls object destructors
854 874
855This only applies to users of Lc<AnyEvent::Fork:Early> and 875This only applies to users of L<AnyEvent::Fork:Early> and
856L<AnyEvent::Fork::Template>. 876L<AnyEvent::Fork::Template>, or when initialising code creates objects
877that reference external resources.
857 878
858When a process created by AnyEvent::Fork exits, it might do so by calling 879When a process created by AnyEvent::Fork exits, it might do so by calling
859exit, or simply letting perl reach the end of the program. At which point 880exit, or simply letting perl reach the end of the program. At which point
860Perl runs all destructors. 881Perl runs all destructors.
861 882
880to make it so, mostly due to the bloody broken perl that nobody seems to 901to make it so, mostly due to the bloody broken perl that nobody seems to
881care about. The fork emulation is a bad joke - I have yet to see something 902care about. The fork emulation is a bad joke - I have yet to see something
882useful that you can do with it without running into memory corruption 903useful that you can do with it without running into memory corruption
883issues or other braindamage. Hrrrr. 904issues or other braindamage. Hrrrr.
884 905
885Cygwin perl is not supported at the moment, as it should implement fd 906Cygwin perl is not supported at the moment due to some hilarious
886passing, but doesn't, and rolling my own is hard, as cygwin doesn't 907shortcomings of its API - see L<IO::FDPoll> for more details.
887support enough functionality to do it.
888 908
889=head1 SEE ALSO 909=head1 SEE ALSO
890 910
891L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter), 911L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter),
892L<AnyEvent::Fork::Template> (to create a process by forking the main 912L<AnyEvent::Fork::Template> (to create a process by forking the main
893program at a convenient time). 913program at a convenient time), L<AnyEvent::Fork::RPC> (for simple RPC to
914child processes).
894 915
895=head1 AUTHOR 916=head1 AUTHOR AND CONTACT INFORMATION
896 917
897 Marc Lehmann <schmorp@schmorp.de> 918 Marc Lehmann <schmorp@schmorp.de>
898 http://home.schmorp.de/ 919 http://software.schmorp.de/pkg/AnyEvent-Fork
899 920
900=cut 921=cut
901 922
9021 9231
903 924

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines