Comparing AnyEvent-Fork/Fork.pm (file contents):
Revision 1.35 by root, Sat Apr 6 09:39:12 2013 UTC vs.
Revision 1.49 by root, Fri Apr 19 12:56:53 2013 UTC

27 27
28Special care has been taken to make this module useful from other modules, 28Special care has been taken to make this module useful from other modules,
29while still supporting specialised environments such as L<App::Staticperl> 29while still supporting specialised environments such as L<App::Staticperl>
30or L<PAR::Packer>. 30or L<PAR::Packer>.
31 31
32=head1 WHAT THIS MODULE IS NOT 32=head2 WHAT THIS MODULE IS NOT
33 33
34This module only creates processes and lets you pass file handles and 34This module only creates processes and lets you pass file handles and
35strings to it, and run perl code. It does not implement any kind of RPC - 35strings to it, and run perl code. It does not implement any kind of RPC -
36there is no back channel from the process back to you, and there is no RPC 36there is no back channel from the process back to you, and there is no RPC
37or message passing going on. 37or message passing going on.
38 38
39If you need some form of RPC, you can either implement it yourself 39If you need some form of RPC, you could use the L<AnyEvent::Fork::RPC>
40in whatever way you like, use some message-passing module such 40companion module, which adds simple RPC/job queueing to a process created
41as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use 41by this module.
42L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages,
43and so on.
44 42
43Or you can implement it yourself in whatever way you like, use some
44message-passing module such as L<AnyEvent::MP>, some pipe such as
45L<AnyEvent::ZeroMQ>, use L<AnyEvent::Handle> on both sides to send
46e.g. JSON or Storable messages, and so on.
47
48=head2 COMPARISON TO OTHER MODULES
49
50There is an abundance of modules on CPAN that do "something fork", such as
51L<Parallel::ForkManager>, L<AnyEvent::ForkManager>, L<AnyEvent::Worker>
52or L<AnyEvent::Subprocess>. There are modules that implement their own
53process management, such as L<AnyEvent::DBI>.
54
55The problems that all these modules try to solve are real, however, none
56of them (from what I have seen) tackle the very real problems of unwanted
57memory sharing, efficiency, not being able to use event processing or
58similar modules in the processes they create.
59
60This module doesn't try to replace any of them - instead it tries to solve
61the problem of creating processes with a minimum of fuss and overhead (and
62also luxury). Ideally, most of these would use AnyEvent::Fork internally,
63except they were written before AnyEvent::Fork was available, so obviously
64had to roll their own.
65
45=head1 PROBLEM STATEMENT 66=head2 PROBLEM STATEMENT
46 67
47There are two traditional ways to implement parallel processing on UNIX 68There are two traditional ways to implement parallel processing on UNIX
48like operating systems - fork and process, and fork+exec and process. They 69like operating systems - fork and process, and fork+exec and process. They
49have different advantages and disadvantages that I describe below, 70have different advantages and disadvantages that I describe below,
50together with how this module tries to mitigate the disadvantages. 71together with how this module tries to mitigate the disadvantages.
203 } 224 }
204 } 225 }
205 226
206=head2 use AnyEvent::Fork as a faster fork+exec 227=head2 use AnyEvent::Fork as a faster fork+exec
207 228
208This runs C</bin/echo hi>, with stdandard output redirected to /tmp/log 229This runs C</bin/echo hi>, with standard output redirected to F</tmp/log>
209and standard error redirected to the communications socket. It is usually 230and standard error redirected to the communications socket. It is usually
210faster than fork+exec, but still lets you prepare the environment. 231faster than fork+exec, but still lets you prepare the environment.
211 232
212 open my $output, ">/tmp/log" or die "$!"; 233 open my $output, ">/tmp/log" or die "$!";
213 234
214 AnyEvent::Fork 235 AnyEvent::Fork
215 ->new 236 ->new
216 ->eval (' 237 ->eval ('
238 # compile a helper function for later use
217 sub run { 239 sub run {
218 my ($fh, $output, @cmd) = @_; 240 my ($fh, $output, @cmd) = @_;
219 241
220 # perl will clear close-on-exec on STDOUT/STDERR 242 # perl will clear close-on-exec on STDOUT/STDERR
221 open STDOUT, ">&", $output or die; 243 open STDOUT, ">&", $output or die;
232 254
233=head1 CONCEPTS 255=head1 CONCEPTS
234 256
235This module can create new processes either by executing a new perl 257This module can create new processes either by executing a new perl
236process, or by forking from an existing "template" process. 258process, or by forking from an existing "template" process.
259
260All these processes are called "child processes" (whether they are direct
261children or not), while the process that manages them is called the
262"parent process".
237 263
238Each such process comes with its own file handle that can be used to 264Each such process comes with its own file handle that can be used to
239communicate with it (it's actually a socket - one end in the new process, 265communicate with it (it's actually a socket - one end in the new process,
240one end in the main process), and among the things you can do in it are 266one end in the main process), and among the things you can do in it are
241load modules, fork new processes, send file handles to it, and execute 267load modules, fork new processes, send file handles to it, and execute
351use AnyEvent; 377use AnyEvent;
352use AnyEvent::Util (); 378use AnyEvent::Util ();
353 379
354use IO::FDPass; 380use IO::FDPass;
355 381
356our $VERSION = 0.5; 382our $VERSION = 0.7;
357
358our $PERL; # the path to the perl interpreter, deduces with various forms of magic
359
360=over 4
361
362=back
363
364=cut
365 383
366# the early fork template process 384# the early fork template process
367our $EARLY; 385our $EARLY;
368 386
369# the empty template process 387# the empty template process
370our $TEMPLATE; 388our $TEMPLATE;
389
390sub QUEUE() { 0 }
391sub FH() { 1 }
392sub WW() { 2 }
393sub PID() { 3 }
394sub CB() { 4 }
395
396sub _new {
397 my ($self, $fh, $pid) = @_;
398
399 AnyEvent::Util::fh_nonblocking $fh, 1;
400
401 $self = bless [
402 [], # write queue - strings or fd's
403 $fh,
404 undef, # AE watcher
405 $pid,
406 ], $self;
407
408 $self
409}
371 410
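The hunk above replaces literal array indices with constant subs (C<QUEUE>, C<FH>, C<WW>, C<PID>, C<CB>), so the slot order can change without touching every access. A minimal sketch of the same pattern, using a hypothetical C<Proc> package that is not part of AnyEvent::Fork, might look like this:

```perl
use strict;
use warnings;

package Proc; # hypothetical stand-in, not the real AnyEvent::Fork class

# slot names for the array-based object, same order as in the new _new
sub QUEUE() { 0 }
sub FH()    { 1 }
sub WW()    { 2 }
sub PID()   { 3 }
sub CB()    { 4 }

sub new {
   my ($class, $fh, $pid) = @_;

   bless [
      [],    # QUEUE - pending writes
      $fh,   # FH    - communications handle
      undef, # WW    - write watcher, created on demand
      $pid,  # PID
   ], $class
}

sub pid     { $_[0][PID] }
sub pending { scalar @{ $_[0][QUEUE] } }

sub enqueue {
   my $self = shift;
   push @{ $self->[QUEUE] }, @_;
   $self
}

package main;

my $proc = Proc->new (\*STDOUT, 1234);
$proc->enqueue ("first", "second");

print "pid ", $proc->pid, ", ", $proc->pending, " queued\n";
```

The empty prototypes let perl inline the constants, so the named form should cost nothing at run time. The hazard it removes is a leftover literal index that silently addresses a different slot once the layout is reordered.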
372sub _cmd { 411sub _cmd {
373 my $self = shift; 412 my $self = shift;
374 413
375 # ideally, we would want to use "a (w/a)*" as format string, but perl 414 # ideally, we would want to use "a (w/a)*" as format string, but perl
376 # versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack 415 # versions from at least 5.8.9 to 5.16.3 are all buggy and can't unpack
377 # it. 416 # it.
378 push @{ $self->[2] }, pack "a L/a*", $_[0], $_[1]; 417 push @{ $self->[QUEUE] }, pack "a L/a*", $_[0], $_[1];
379 418
380 $self->[3] ||= AE::io $self->[1], 1, sub { 419 $self->[WW] ||= AE::io $self->[FH], 1, sub {
381 do { 420 do {
382 # send the next "thing" in the queue - either a reference to an fh, 421 # send the next "thing" in the queue - either a reference to an fh,
383 # or a plain string. 422 # or a plain string.
384 423
385 if (ref $self->[2][0]) { 424 if (ref $self->[QUEUE][0]) {
386 # send fh 425 # send fh
387 unless (IO::FDPass::send fileno $self->[1], fileno ${ $self->[2][0] }) { 426 unless (IO::FDPass::send fileno $self->[FH], fileno ${ $self->[QUEUE][0] }) {
388 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK; 427 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
389 undef $self->[3]; 428 undef $self->[WW];
390 die "AnyEvent::Fork: file descriptor send failure: $!"; 429 die "AnyEvent::Fork: file descriptor send failure: $!";
391 } 430 }
392 431
393 shift @{ $self->[2] }; 432 shift @{ $self->[QUEUE] };
394 433
395 } else { 434 } else {
396 # send string 435 # send string
397 my $len = syswrite $self->[1], $self->[2][0]; 436 my $len = syswrite $self->[FH], $self->[QUEUE][0];
398 437
399 unless ($len) { 438 unless ($len) {
400 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK; 439 return if $! == Errno::EAGAIN || $! == Errno::EWOULDBLOCK;
401 undef $self->[3]; 440 undef $self->[WW];
402 die "AnyEvent::Fork: command write failure: $!"; 441 die "AnyEvent::Fork: command write failure: $!";
403 } 442 }
404 443
405 substr $self->[2][0], 0, $len, ""; 444 substr $self->[QUEUE][0], 0, $len, "";
406 shift @{ $self->[2] } unless length $self->[2][0]; 445 shift @{ $self->[QUEUE] } unless length $self->[QUEUE][0];
407 } 446 }
408 } while @{ $self->[2] }; 447 } while @{ $self->[QUEUE] };
409 448
410 # everything written 449 # everything written
411 undef $self->[3]; 450 undef $self->[WW];
412 451
413 # invoke run callback, if any 452 # invoke run callback, if any
414 $self->[4]->($self->[1]) if $self->[4]; 453 if ($self->[CB]) {
454 $self->[CB]->($self->[FH]);
455 @$self = ();
456 }
415 }; 457 };
416 458
417 () # make sure we don't leak the watcher 459 () # make sure we don't leak the watcher
418}
419
420sub _new {
421 my ($self, $fh, $pid) = @_;
422
423 AnyEvent::Util::fh_nonblocking $fh, 1;
424
425 $self = bless [
426 $pid,
427 $fh,
428 [], # write queue - strings or fd's
429 undef, # AE watcher
430 ], $self;
431
432 $self
433} 460}
434 461
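C<_cmd> frames every command as one type character followed by a 32-bit length-prefixed string (C<pack "a L/a*">). The sketch below shows that framing and one way a receiver might split a byte stream back into commands. The helper names C<encode_cmd> and C<decode_cmds> are made up for illustration, and the real decoder in AnyEvent::Fork::Serve is not shown in this diff, so treat the decoding half as an assumption:

```perl
use strict;
use warnings;

# one type byte plus the native 32-bit length written by "L"
use constant HEADER => 1 + length (pack "L", 0);

# frame a command the same way _cmd does
sub encode_cmd {
   my ($type, $data) = @_;

   pack "a L/a*", $type, defined $data ? $data : ""
}

# split a buffer into [type, payload] pairs, stopping at an incomplete frame
sub decode_cmds {
   my ($buf) = @_;
   my @cmds;

   while (length ($buf) >= HEADER) {
      my ($type, $len) = unpack "a L", $buf;

      last if length ($buf) < HEADER + $len;

      push @cmds, [$type, substr $buf, HEADER, $len];
      substr $buf, 0, HEADER + $len, "";
   }

   @cmds
}

# "h" and "r" are the type letters used by send_fh and run
my $stream = encode_cmd (h => "") . encode_cmd (r => "main::run");

printf "%s: %d byte payload\n", $_->[0], length $_->[1]
   for decode_cmds ($stream);
```

Because C<L> uses native byte order, this framing is only meaningful between processes on the same machine, which is the only case a socketpair covers anyway.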
435# fork template from current process, used by AnyEvent::Fork::Early/Template 462# fork template from current process, used by AnyEvent::Fork::Early/Template
436sub _new_fork { 463sub _new_fork {
437 my ($fh, $slave) = AnyEvent::Util::portable_socketpair; 464 my ($fh, $slave) = AnyEvent::Util::portable_socketpair;
442 if ($pid eq 0) { 469 if ($pid eq 0) {
443 require AnyEvent::Fork::Serve; 470 require AnyEvent::Fork::Serve;
444 $AnyEvent::Fork::Serve::OWNER = $parent; 471 $AnyEvent::Fork::Serve::OWNER = $parent;
445 close $fh; 472 close $fh;
446 $0 = "$_[1] of $parent"; 473 $0 = "$_[1] of $parent";
447 $SIG{CHLD} = 'IGNORE';
448 AnyEvent::Fork::Serve::serve ($slave); 474 AnyEvent::Fork::Serve::serve ($slave);
449 exit 0; 475 exit 0;
450 } elsif (!$pid) { 476 } elsif (!$pid) {
451 die "AnyEvent::Fork::Early/Template: unable to fork template process: $!"; 477 die "AnyEvent::Fork::Early/Template: unable to fork template process: $!";
452 } 478 }
571AnyEvent::Fork itself. 597AnyEvent::Fork itself.
572 598
573=cut 599=cut
574 600
575sub pid { 601sub pid {
576 $_[0][0] 602 $_[0][PID]
577} 603}
578 604
579=item $proc = $proc->eval ($perlcode, @args) 605=item $proc = $proc->eval ($perlcode, @args)
580 606
581Evaluates the given C<$perlcode> as ... perl code, while setting C<@_> to 607Evaluates the given C<$perlcode> as ... Perl code, while setting C<@_> to
582the strings specified by C<@args>, in the "main" package. 608the strings specified by C<@args>, in the "main" package.
583 609
584This call is meant to do any custom initialisation that might be required 610This call is meant to do any custom initialisation that might be required
585(for example, the C<require> method uses it). It's not supposed to be used 611(for example, the C<require> method uses it). It's not supposed to be used
586to completely take over the process, use C<run> for that. 612to completely take over the process, use C<run> for that.
648sub send_fh { 674sub send_fh {
649 my ($self, @fh) = @_; 675 my ($self, @fh) = @_;
650 676
651 for my $fh (@fh) { 677 for my $fh (@fh) {
652 $self->_cmd ("h"); 678 $self->_cmd ("h");
653 push @{ $self->[2] }, \$fh; 679 push @{ $self->[QUEUE] }, \$fh;
654 } 680 }
655 681
656 $self 682 $self
657} 683}
658 684
744=cut 770=cut
745 771
746sub run { 772sub run {
747 my ($self, $func, $cb) = @_; 773 my ($self, $func, $cb) = @_;
748 774
749 $self->[4] = $cb; 775 $self->[CB] = $cb;
750 $self->_cmd (r => $func); 776 $self->_cmd (r => $func);
777}
778
779=item $proc->to_fh ($cb->($fh))
780
781Flushes all commands out to the process and then calls the callback with
782the communications socket.
783
784The process object becomes unusable on return from this function - any
785further method calls result in undefined behaviour.
786
787The point of this method is to give you a file handle that you can pass
788to another process. In that other process, you can call C<new_from_fh
789AnyEvent::Fork $fh> to create a new C<AnyEvent::Fork> object from it,
790thereby effectively passing a fork object to another process.
791
792=cut
793
794sub to_fh {
795 my ($self, $cb) = @_;
796
797 $self->[CB] = $cb;
798
799 unless ($self->[WW]) {
800 $self->[CB]->($self->[FH]);
801 @$self = ();
802 }
803}
804
805=item new_from_fh AnyEvent::Fork $fh
806
807Takes a file handle originally received by the C<to_fh> method and creates
808a new C<AnyEvent::Fork> object. The child process itself will not change in
809any way, i.e. it will keep all the modifications done to it before calling
810C<to_fh>.
811
812The new object is very much like the original object, except that the
813C<pid> method will return C<undef> even if the process is a direct child.
814
815=cut
816
817sub new_from_fh {
818 my ($class, $fh) = @_;
819
820 $class->_new ($fh)
751} 821}
752 822
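To illustrate how C<to_fh> and C<new_from_fh> fit together, here is a rough sketch that detaches the socket from one object and re-wraps it in a second one. It assumes AnyEvent and AnyEvent::Fork (0.7 or later) are installed, does the hand-off inside a single process only to stay short, and uses the made-up names C<hand_off_demo> and C<main::worker>. In real use the handle would first travel to another process.

```perl
use strict;
use warnings;

# wrapped in a sub so the modules are only loaded when the demo is called
sub hand_off_demo {
   require AnyEvent;
   require AnyEvent::Fork;

   my $done = AnyEvent->condvar;

   AnyEvent::Fork
      ->new
      # prepare the child: define the function that run will call later
      ->eval ('sub worker { my ($fh) = @_; syswrite $fh, "hello from child\n" }')
      ->to_fh (sub {
         my ($fh) = @_;

         # the original object is unusable from here on, $fh is all we keep.
         # normally $fh would be passed to another process at this point.
         my $proc = AnyEvent::Fork->new_from_fh ($fh);

         $proc->run ("main::worker", sub {
            my ($fh) = @_;

            # blocking read, acceptable only because this is a demo
            AnyEvent::Util::fh_nonblocking ($fh, 0);
            $done->send (scalar <$fh>);
         });
      });

   $done->recv
}
```

Calling C<print hand_off_demo ()> should print the line written by the child. Note that C<< $proc->pid >> on the re-wrapped object returns C<undef>, as documented for C<new_from_fh>.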
753=back 823=back
754 824
755=head1 PERFORMANCE 825=head1 PERFORMANCE
765 835
766 2079 new processes per second, using manual socketpair + fork 836 2079 new processes per second, using manual socketpair + fork
767 837
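The "manual socketpair + fork" measurement can be approximated with core modules only. The following is a sketch under assumptions, not the author's benchmark script: the function name C<fork_rate> is made up, the child leaves via C<POSIX::_exit>, and the absolute numbers will differ from the ones quoted here depending on machine and process size.

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use Time::HiRes ();

# fork $count children one after another and return forks per second
sub fork_rate {
   my ($count) = @_;
   my $start = Time::HiRes::time ();

   for (1 .. $count) {
      socketpair (my $mine, my $theirs, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
         or die "socketpair: $!";

      my $pid = fork;
      defined $pid or die "fork: $!";

      if ($pid == 0) {
         # child: exit immediately, which closes its end of the socket
         close $mine;
         POSIX::_exit (0);
      }

      close $theirs;
      sysread $mine, my $buf, 1; # blocks until the child side is closed
      waitpid $pid, 0;
   }

   my $elapsed = Time::HiRes::time () - $start;

   $elapsed > 0 ? $count / $elapsed : 0
}

printf "%.0f new processes per second, using manual socketpair + fork\n",
   fork_rate (100);
```

Allocating a large string before calling C<fork_rate> is a simple way to see the effect of process size on the rate.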
768Then I did the same thing, but instead of calling fork, I called 838Then I did the same thing, but instead of calling fork, I called
769AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the 839AnyEvent::Fork->new->run ("CORE::exit") and then again waited for the
770socket form the child to close on exit. This does the same thing as manual 840socket from the child to close on exit. This does the same thing as manual
771socket pair + fork, except that what is forked is the template process 841socket pair + fork, except that what is forked is the template process
772(2440kB), and the socket needs to be passed to the server at the other end 842(2440kB), and the socket needs to be passed to the server at the other end
773of the socket first. 843of the socket first.
774 844
775 2307 new processes per second, using AnyEvent::Fork->new 845 2307 new processes per second, using AnyEvent::Fork->new
780 479 vfork+execs per second, using AnyEvent::Fork->new_exec 850 479 vfork+execs per second, using AnyEvent::Fork->new_exec
781 851
782So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even 852So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even
783though it uses the same operations, but adds a lot of overhead? 853though it uses the same operations, but adds a lot of overhead?
784 854
785The difference is simply the process size: forking the 6MB process takes 855The difference is simply the process size: forking the 5MB process takes
786so much longer than forking the 2.5MB template process that the overhead 856so much longer than forking the 2.5MB template process that the extra
787introduced is canceled out. 857overhead is canceled out.
788 858
789If the benchmark process grows, the normal fork becomes even slower: 859If the benchmark process grows, the normal fork becomes even slower:
790 860
791 1340 new processes, manual fork in a 20MB process 861 1340 new processes, manual fork of a 20MB process
792 731 new processes, manual fork in a 200MB process 862 731 new processes, manual fork of a 200MB process
793 235 new processes, manual fork in a 2000MB process 863 235 new processes, manual fork of a 2000MB process
794 864
795What that means (to me) is that I can use this module without having a 865What that means (to me) is that I can use this module without having a bad
796very bad conscience because of the extra overhead required to start new 866conscience because of the extra overhead required to start new processes.
797processes.
798 867
799=head1 TYPICAL PROBLEMS 868=head1 TYPICAL PROBLEMS
800 869
801This section lists typical problems that remain. I hope by recognising 870This section lists typical problems that remain. I hope by recognising
802them, most can be avoided. 871them, most can be avoided.
803 872
804=over 4 873=over 4
805 874
806=item "leaked" file descriptors for exec'ed processes 875=item leaked file descriptors for exec'ed processes
807 876
808POSIX systems inherit file descriptors by default when exec'ing a new 877POSIX systems inherit file descriptors by default when exec'ing a new
809process. While perl itself laudably sets the close-on-exec flags on new 878process. While perl itself laudably sets the close-on-exec flags on new
810file handles, most C libraries don't care, and even if all cared, it's 879file handles, most C libraries don't care, and even if all cared, it's
811often not possible to set the flag in a race-free manner. 880often not possible to set the flag in a race-free manner.
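For handles that are visible from Perl, the close-on-exec flag can be inspected and changed with C<fcntl>. The sketch below uses only the core Fcntl module, and the helper names C<is_cloexec> and C<set_cloexec> are made up for illustration. It cannot help with descriptors opened inside C libraries, which is the case this item is about, and setting the flag after the fact is exactly the non-atomic step that makes a race-free solution hard.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# return 1 if the handle will be closed on exec, 0 otherwise
sub is_cloexec {
   my ($fh) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "F_GETFD: $!";

   ($flags + 0) & FD_CLOEXEC ? 1 : 0
}

# switch the close-on-exec flag of the handle on or off
sub set_cloexec {
   my ($fh, $on) = @_;

   my $flags = fcntl $fh, F_GETFD, 0;
   defined $flags or die "F_GETFD: $!";

   $flags += 0; # fcntl may return the string "0 but true"
   $flags = $on ? $flags | FD_CLOEXEC : $flags & ~FD_CLOEXEC;

   fcntl $fh, F_SETFD, $flags
      or die "F_SETFD: $!";
}

pipe (my $r, my $w) or die "pipe: $!";

print "fresh pipe: ", is_cloexec ($r), "\n";
set_cloexec ($r, 0); # this descriptor would now survive an exec
print "after clearing: ", is_cloexec ($r), "\n";
```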
831libraries or the code that leaks those file descriptors. 900libraries or the code that leaks those file descriptors.
832 901
833Fortunately, most of these leaked descriptors do no harm, other than 902Fortunately, most of these leaked descriptors do no harm, other than
834sitting on some resources. 903sitting on some resources.
835 904
836=item "leaked" file descriptors for fork'ed processes 905=item leaked file descriptors for fork'ed processes
837 906
838Normally, L<AnyEvent::Fork> does start new processes by exec'ing them, 907Normally, L<AnyEvent::Fork> does start new processes by exec'ing them,
839which closes file descriptors not marked for being inherited. 908which closes file descriptors not marked for being inherited.
840 909
841However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer 910However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer
850 919
851The solution is to either not load these modules before use'ing 920The solution is to either not load these modules before use'ing
852L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay 921L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay
853initialising them, for example, by calling C<init Gtk2> manually. 922initialising them, for example, by calling C<init Gtk2> manually.
854 923
855=item exit runs destructors 924=item exiting calls object destructors
856 925
857This only applies to users of Lc<AnyEvent::Fork:Early> and 926This only applies to users of L<AnyEvent::Fork::Early> and
858L<AnyEvent::Fork::Template>. 927L<AnyEvent::Fork::Template>, or when initialising code creates objects
928that reference external resources.
859 929
860When a process created by AnyEvent::Fork exits, it might do so by calling 930When a process created by AnyEvent::Fork exits, it might do so by calling
861exit, or simply letting perl reach the end of the program. At which point 931exit, or simply letting perl reach the end of the program. At which point
862Perl runs all destructors. 932Perl runs all destructors.
863 933
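The effect is easy to reproduce with a plain C<fork>, used here as a stand-in for a process forked from a template. In this sketch the C<Resource> class and the C<child_output> helper are hypothetical: an object created before the fork is destroyed a second time when the child leaves via C<exit>, while C<POSIX::_exit> skips destructors altogether.

```perl
use strict;
use warnings;
use POSIX ();

$| = 1; # avoid duplicated buffered output in forked children

package Resource; # hypothetical object owning some external resource

our $LOG; # handle the destructor reports to, if any

sub new     { bless {}, shift }
sub DESTROY { syswrite $LOG, "destructor ran\n" if $LOG }

package main;

# fork a child that inherits an object, let it terminate via the given
# code reference and return whatever its destructors wrote
sub child_output {
   my ($terminate) = @_;

   pipe (my $r, my $w) or die "pipe: $!";
   local $Resource::LOG = $w;
   my $obj = Resource->new; # created before the fork, so the child has a copy

   my $pid = fork;
   defined $pid or die "fork: $!";

   if ($pid == 0) {
      close $r;
      $terminate->();
   }

   # parent: stop logging and close our write end so we can see EOF
   undef $Resource::LOG;
   close $w;

   my $out = do { local $/; <$r> };
   waitpid $pid, 0;

   defined $out ? $out : ""
}

my $with_exit  = child_output (sub { exit 0 });
my $with__exit = child_output (sub { POSIX::_exit (0) });

print "exit:  ", ($with_exit  || "nothing\n");
print "_exit: ", ($with__exit || "nothing\n");
```

Whether skipping destructors is the right fix depends on the objects involved. C<POSIX::_exit> also skips C<END> blocks and does not flush perl's output buffers.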
882to make it so, mostly due to the bloody broken perl that nobody seems to 952to make it so, mostly due to the bloody broken perl that nobody seems to
883care about. The fork emulation is a bad joke - I have yet to see something 953care about. The fork emulation is a bad joke - I have yet to see something
884useful that you can do with it without running into memory corruption 954useful that you can do with it without running into memory corruption
885issues or other braindamage. Hrrrr. 955issues or other braindamage. Hrrrr.
886 956
887Cygwin perl is not supported at the moment, as it should implement fd 957Since fork is endlessly broken on win32 perls (it doesn't even remotely
888passing, but doesn't, and rolling my own is hard, as cygwin doesn't 958work within its documented limits) and quite obviously it's not getting
889support enough functionality to do it. 959improved any time soon, the best way to proceed on windows would be to
960always use C<new_exec> and thus never rely on perl's fork "emulation".
961
962Cygwin perl is not supported at the moment due to some hilarious
963shortcomings of its API - see L<IO::FDPass> for more details. If you never
964use C<send_fh> and always use C<new_exec> to create processes, it should
965work though.
890 966
891=head1 SEE ALSO 967=head1 SEE ALSO
892 968
893L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter), 969L<AnyEvent::Fork::Early>, to avoid executing a perl interpreter at all
970(part of this distribution).
971
894L<AnyEvent::Fork::Template> (to create a process by forking the main 972L<AnyEvent::Fork::Template>, to create a process by forking the main
895program at a convenient time). 973program at a convenient time (part of this distribution).
896 974
897=head1 AUTHOR 975L<AnyEvent::Fork::RPC>, for simple RPC to child processes (on CPAN).
976
977=head1 AUTHOR AND CONTACT INFORMATION
898 978
899 Marc Lehmann <schmorp@schmorp.de> 979 Marc Lehmann <schmorp@schmorp.de>
900 http://home.schmorp.de/ 980 http://software.schmorp.de/pkg/AnyEvent-Fork
901 981
902=cut 982=cut
903 983
9041 9841
905 985
