… | |
… | |
27 | |
27 | |
28 | Special care has been taken to make this module useful from other modules, |
28 | Special care has been taken to make this module useful from other modules, |
29 | while still supporting specialised environments such as L<App::Staticperl> |
29 | while still supporting specialised environments such as L<App::Staticperl> |
30 | or L<PAR::Packer>. |
30 | or L<PAR::Packer>. |
31 | |
31 | |
32 | =head1 WHAT THIS MODULE IS NOT |
32 | =head2 WHAT THIS MODULE IS NOT |
33 | |
33 | |
34 | This module only creates processes and lets you pass file handles and |
34 | This module only creates processes and lets you pass file handles and |
35 | strings to it, and run perl code. It does not implement any kind of RPC - |
35 | strings to it, and run perl code. It does not implement any kind of RPC - |
36 | there is no back channel from the process back to you, and there is no RPC |
36 | there is no back channel from the process back to you, and there is no RPC |
37 | or message passing going on. |
37 | or message passing going on. |
… | |
… | |
40 | in whatever way you like, use some message-passing module such |
40 | in whatever way you like, use some message-passing module such |
41 | as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use |
41 | as L<AnyEvent::MP>, some pipe such as L<AnyEvent::ZeroMQ>, use |
42 | L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages, |
42 | L<AnyEvent::Handle> on both sides to send e.g. JSON or Storable messages, |
43 | and so on. |
43 | and so on. |
44 | |
44 | |
|
|
45 | =head2 COMPARISON TO OTHER MODULES |
|
|
46 | |
|
|
47 | There is an abundance of modules on CPAN that do "something fork", such as |
|
|
48 | L<Parallel::ForkManager>, L<AnyEvent::ForkManager>, L<AnyEvent::Worker> |
|
|
49 | or L<AnyEvent::Subprocess>. There are modules that implement their own |
|
|
50 | process management, such as L<AnyEvent::DBI>. |
|
|
51 | |
|
|
52 | The problems that all these modules try to solve are real, however, none |
|
|
53 | of them (from what I have seen) tackle the very real problems of unwanted |
|
|
54 | memory sharing, efficiency, not being able to use event processing or |
|
|
55 | similar modules in the processes they create. |
|
|
56 | |
|
|
57 | This module doesn't try to replace any of them - instead it tries to solve |
|
|
58 | the problem of creating processes with a minimum of fuss and overhead (and |
|
|
59 | also luxury). Ideally, most of these would use AnyEvent::Fork internally, |
|
|
60 | except they were written before AnyEvent::Fork was available, so obviously |
|
|
61 | had to roll their own. |
|
|
62 | |
45 | =head1 PROBLEM STATEMENT |
63 | =head2 PROBLEM STATEMENT |
46 | |
64 | |
47 | There are two traditional ways to implement parallel processing on UNIX |
65 | There are two traditional ways to implement parallel processing on UNIX |
48 | like operating systems - fork and process, and fork+exec and process. They |
66 | like operating systems - fork and process, and fork+exec and process. They |
49 | have different advantages and disadvantages that I describe below, |
67 | have different advantages and disadvantages that I describe below, |
50 | together with how this module tries to mitigate the disadvantages. |
68 | together with how this module tries to mitigate the disadvantages. |
… | |
… | |
591 | |
609 | |
592 | If you want to execute some code (that isn't in a module) to take over the |
610 | If you want to execute some code (that isn't in a module) to take over the |
593 | process, you should compile a function via C<eval> first, and then call |
611 | process, you should compile a function via C<eval> first, and then call |
594 | it via C<run>. This also gives you access to any arguments passed via the |
612 | it via C<run>. This also gives you access to any arguments passed via the |
595 | C<send_xxx> methods, such as file handles. See the L<use AnyEvent::Fork as |
613 | C<send_xxx> methods, such as file handles. See the L<use AnyEvent::Fork as |
596 | a faster fork+exec> example. |
614 | a faster fork+exec> example to see it in action. |
597 | |
615 | |
598 | Returns the process object for easy chaining of method calls. |
616 | Returns the process object for easy chaining of method calls. |
599 | |
617 | |
600 | =cut |
618 | =cut |
601 | |
619 | |
… | |
… | |
627 | =item $proc = $proc->send_fh ($handle, ...) |
645 | =item $proc = $proc->send_fh ($handle, ...) |
628 | |
646 | |
629 | Send one or more file handles (I<not> file descriptors) to the process, |
647 | Send one or more file handles (I<not> file descriptors) to the process, |
630 | to prepare a call to C<run>. |
648 | to prepare a call to C<run>. |
631 | |
649 | |
632 | The process object keeps a reference to the handles until this is done, |
650 | The process object keeps a reference to the handles until they have |
633 | so you must not explicitly close the handles. This is most easily |
651 | been passed over to the process, so you must not explicitly close the |
634 | accomplished by simply not storing the file handles anywhere after passing |
652 | handles. This is most easily accomplished by simply not storing the file |
635 | them to this method. |
653 | handles anywhere after passing them to this method - when AnyEvent::Fork |
|
|
654 | is finished using them, perl will automatically close them. |
636 | |
655 | |
637 | Returns the process object for easy chaining of method calls. |
656 | Returns the process object for easy chaining of method calls. |
638 | |
657 | |
639 | Example: pass a file handle to a process, and release it without |
658 | Example: pass a file handle to a process, and release it without |
640 | closing. It will be closed automatically when it is no longer used. |
659 | closing. It will be closed automatically when it is no longer used. |
… | |
… | |
656 | } |
675 | } |
657 | |
676 | |
658 | =item $proc = $proc->send_arg ($string, ...) |
677 | =item $proc = $proc->send_arg ($string, ...) |
659 | |
678 | |
660 | Send one or more argument strings to the process, to prepare a call to |
679 | Send one or more argument strings to the process, to prepare a call to |
661 | C<run>. The strings can be any octet string. |
680 | C<run>. The strings can be any octet strings. |
662 | |
681 | |
663 | The protocol is optimised to pass a moderate number of relatively short |
682 | The protocol is optimised to pass a moderate number of relatively short |
664 | strings - while you can pass up to 4GB of data in one go, this is more |
683 | strings - while you can pass up to 4GB of data in one go, this is more |
665 | meant to pass some ID information or other startup info, not big chunks of |
684 | meant to pass some ID information or other startup info, not big chunks of |
666 | data. |
685 | data. |
… | |
… | |
682 | Enter the function specified by the function name in C<$func> in the |
701 | Enter the function specified by the function name in C<$func> in the |
683 | process. The function is called with the communication socket as first |
702 | process. The function is called with the communication socket as first |
684 | argument, followed by all file handles and string arguments sent earlier |
703 | argument, followed by all file handles and string arguments sent earlier |
685 | via C<send_fh> and C<send_arg> methods, in the order they were called. |
704 | via C<send_fh> and C<send_arg> methods, in the order they were called. |
686 | |
705 | |
|
|
706 | The process object becomes unusable on return from this function - any |
|
|
707 | further method calls result in undefined behaviour. |
|
|
708 | |
687 | The function name should be fully qualified, but if it isn't, it will be |
709 | The function name should be fully qualified, but if it isn't, it will be |
688 | looked up in the main package. |
710 | looked up in the C<main> package. |
689 | |
711 | |
690 | If the called function returns, doesn't exist, or any error occurs, the |
712 | If the called function returns, doesn't exist, or any error occurs, the |
691 | process exits. |
713 | process exits. |
692 | |
714 | |
693 | Preparing the process is done in the background - when all commands have |
715 | Preparing the process is done in the background - when all commands have |
694 | been sent, the callback is invoked with the local communications socket |
716 | been sent, the callback is invoked with the local communications socket |
695 | as argument. At this point you can start using the socket in any way you |
717 | as argument. At this point you can start using the socket in any way you |
696 | like. |
718 | like. |
697 | |
|
|
698 | The process object becomes unusable on return from this function - any |
|
|
699 | further method calls result in undefined behaviour. |
|
|
700 | |
719 | |
701 | If the communication socket isn't used, it should be closed on both sides, |
720 | If the communication socket isn't used, it should be closed on both sides, |
702 | to save on kernel memory. |
721 | to save on kernel memory. |
703 | |
722 | |
704 | The socket is non-blocking in the parent, and blocking in the newly |
723 | The socket is non-blocking in the parent, and blocking in the newly |
… | |
… | |
779 | 479 vfork+execs per second, using AnyEvent::Fork->new_exec |
798 | 479 vfork+execs per second, using AnyEvent::Fork->new_exec |
780 | |
799 | |
781 | So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even |
800 | So how can C<< AnyEvent::Fork->new >> be faster than a standard fork, even |
782 | though it uses the same operations, but adds a lot of overhead? |
801 | though it uses the same operations, but adds a lot of overhead? |
783 | |
802 | |
784 | The difference is simply the process size: forking the 6MB process takes |
803 | The difference is simply the process size: forking the 5MB process takes |
785 | so much longer than forking the 2.5MB template process that the overhead |
804 | so much longer than forking the 2.5MB template process that the extra |
786 | introduced is canceled out. |
805 | overhead introduced is canceled out. |
787 | |
806 | |
788 | If the benchmark process grows, the normal fork becomes even slower: |
807 | If the benchmark process grows, the normal fork becomes even slower: |
789 | |
808 | |
790 | 1340 new processes, manual fork in a 20MB process |
809 | 1340 new processes, manual fork of a 20MB process |
791 | 731 new processes, manual fork in a 200MB process |
810 | 731 new processes, manual fork of a 200MB process |
792 | 235 new processes, manual fork in a 2000MB process |
811 | 235 new processes, manual fork of a 2000MB process |
793 | |
812 | |
794 | What that means (to me) is that I can use this module without having a |
813 | What that means (to me) is that I can use this module without having a bad |
795 | very bad conscience because of the extra overhead required to start new |
814 | conscience because of the extra overhead required to start new processes. |
796 | processes. |
|
|
797 | |
815 | |
798 | =head1 TYPICAL PROBLEMS |
816 | =head1 TYPICAL PROBLEMS |
799 | |
817 | |
800 | This section lists typical problems that remain. I hope by recognising |
818 | This section lists typical problems that remain. I hope by recognising |
801 | them, most can be avoided. |
819 | them, most can be avoided. |
802 | |
820 | |
803 | =over 4 |
821 | =over 4 |
804 | |
822 | |
805 | =item "leaked" file descriptors for exec'ed processes |
823 | =item leaked file descriptors for exec'ed processes |
806 | |
824 | |
807 | POSIX systems inherit file descriptors by default when exec'ing a new |
825 | POSIX systems inherit file descriptors by default when exec'ing a new |
808 | process. While perl itself laudably sets the close-on-exec flags on new |
826 | process. While perl itself laudably sets the close-on-exec flags on new |
809 | file handles, most C libraries don't care, and even if all cared, it's |
827 | file handles, most C libraries don't care, and even if all cared, it's |
810 | often not possible to set the flag in a race-free manner. |
828 | often not possible to set the flag in a race-free manner. |
… | |
… | |
830 | libraries or the code that leaks those file descriptors. |
848 | libraries or the code that leaks those file descriptors. |
831 | |
849 | |
832 | Fortunately, most of these leaked descriptors do no harm, other than |
850 | Fortunately, most of these leaked descriptors do no harm, other than |
833 | sitting on some resources. |
851 | sitting on some resources. |
834 | |
852 | |
835 | =item "leaked" file descriptors for fork'ed processes |
853 | =item leaked file descriptors for fork'ed processes |
836 | |
854 | |
837 | Normally, L<AnyEvent::Fork> does start new processes by exec'ing them, |
855 | Normally, L<AnyEvent::Fork> does start new processes by exec'ing them, |
838 | which closes file descriptors not marked for being inherited. |
856 | which closes file descriptors not marked for being inherited. |
839 | |
857 | |
840 | However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer |
858 | However, L<AnyEvent::Fork::Early> and L<AnyEvent::Fork::Template> offer |
… | |
… | |
849 | |
867 | |
850 | The solution is to either not load these modules before use'ing |
868 | The solution is to either not load these modules before use'ing |
851 | L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay |
869 | L<AnyEvent::Fork::Early> or L<AnyEvent::Fork::Template>, or to delay |
852 | initialising them, for example, by calling C<init Gtk2> manually. |
870 | initialising them, for example, by calling C<init Gtk2> manually. |
853 | |
871 | |
854 | =item exit runs destructors |
872 | =item exiting calls object destructors |
855 | |
873 | |
856 | This only applies to users of Lc<AnyEvent::Fork:Early> and |
874 | This only applies to users of L<AnyEvent::Fork::Early> and |
857 | L<AnyEvent::Fork::Template>. |
875 | L<AnyEvent::Fork::Template>, or when initialising code creates objects |
|
|
876 | that reference external resources. |
858 | |
877 | |
859 | When a process created by AnyEvent::Fork exits, it might do so by calling |
878 | When a process created by AnyEvent::Fork exits, it might do so by calling |
860 | exit, or simply letting perl reach the end of the program. At which point |
879 | exit, or simply letting perl reach the end of the program. At which point |
861 | Perl runs all destructors. |
880 | Perl runs all destructors. |
862 | |
881 | |
… | |
… | |
881 | to make it so, mostly due to the bloody broken perl that nobody seems to |
900 | to make it so, mostly due to the bloody broken perl that nobody seems to |
882 | care about. The fork emulation is a bad joke - I have yet to see something |
901 | care about. The fork emulation is a bad joke - I have yet to see something |
883 | useful that you can do with it without running into memory corruption |
902 | useful that you can do with it without running into memory corruption |
884 | issues or other braindamage. Hrrrr. |
903 | issues or other braindamage. Hrrrr. |
885 | |
904 | |
886 | Cygwin perl is not supported at the moment, as it should implement fd |
905 | Cygwin perl is not supported at the moment due to some hilarious |
887 | passing, but doesn't, and rolling my own is hard, as cygwin doesn't |
906 | shortcomings of its API - see L<IO::FDPoll> for more details. |
888 | support enough functionality to do it. |
|
|
889 | |
907 | |
890 | =head1 SEE ALSO |
908 | =head1 SEE ALSO |
891 | |
909 | |
892 | L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter), |
910 | L<AnyEvent::Fork::Early> (to avoid executing a perl interpreter), |
893 | L<AnyEvent::Fork::Template> (to create a process by forking the main |
911 | L<AnyEvent::Fork::Template> (to create a process by forking the main |