/cvs/AnyEvent/lib/AnyEvent.pm

Comparing AnyEvent/lib/AnyEvent.pm (file contents):
Revision 1.67 by root, Fri Apr 25 06:58:38 2008 UTC vs.
Revision 1.75 by root, Fri Apr 25 07:49:39 2008 UTC

864 864
865 865
866=head1 BENCHMARK 866=head1 BENCHMARK
867 867
868To give you an idea of the performance and overheads that AnyEvent adds 868To give you an idea of the performance and overheads that AnyEvent adds
869over the backends directly, here is a benchmark of various supported event 869over the event loops themselves (and to give you an impression of the
870speed of various event loops), here is a benchmark of various supported
870models natively and with anyevent. The benchmark creates a lot of timers 871event models natively and with anyevent. The benchmark creates a lot of
871(with a zero timeout) and io watchers (watching STDOUT, a pty, to become 872timers (with a zero timeout) and io watchers (watching STDOUT, a pty, to
872writable, which it is), lets them fire exactly once and destroys them 873become writable, which it is), lets them fire exactly once and destroys
873again. 874them again.
874 875
875Explanation of the fields: 876=head2 Explanation of the columns
876 877
877I<watcher> is the number of event watchers created/destroyed. Sicne 878I<watcher> is the number of event watchers created/destroyed. Since
878different event models have vastly different performance each backend was 879different event models feature vastly different performances, each event
879handed a number of watchers so that overall runtime is acceptable and 880loop was given a number of watchers so that overall runtime is acceptable
880similar to all backends (and keep them from crashing). 881and similar between tested event loops (and keep them from crashing): Glib
882would probably take thousands of years if asked to process the same number
883of watchers as EV in this benchmark.
881 884
882I<bytes> is the number of bytes (as measured by resident set size) used by 885I<bytes> is the number of bytes (as measured by the resident set size,
883each watcher. 886RSS) consumed by each watcher. This method of measuring captures both C
887and Perl-based overheads.
884 888
885I<create> is the time, in microseconds, to create a single watcher. 889I<create> is the time, in microseconds (millionths of seconds), that it
890takes to create a single watcher. The callback is a closure shared between
891all watchers, to avoid adding memory overhead. That means closure creation
892and memory usage is not included in the figures.
886 893
887I<invoke> is the time, in microseconds, used to invoke a simple callback 894I<invoke> is the time, in microseconds, used to invoke a simple
888that simply counts down. 895callback. The callback simply counts down a Perl variable and after it was
896invoked "watcher" times, it would C<< ->broadcast >> a condvar once to
897signal the end of this phase.
889 898
890I<destroy> is the time, in microseconds, to destroy a single watcher. 899I<destroy> is the time, in microseconds, that it takes to destroy a single
900watcher.
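
The three timed phases described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark script actually used for the table: the function name C<bench_watchers>, the watcher count and the even split between timers and io watchers are assumptions. The text at this revision names C<< ->broadcast >>; the sketch uses C<< ->send >> and C<< ->recv >>, the names newer AnyEvent releases give the same condvar operations.

```perl
use strict;
use warnings;
use AnyEvent;
use Time::HiRes qw(time);

# Hypothetical helper: returns (create, invoke, destroy) in microseconds
# per watcher for $n watchers ($n is assumed to be even).
sub bench_watchers {
    my ($n) = @_;

    my $cv   = AnyEvent->condvar;
    my $left = $n;

    # One closure shared by all watchers, so closure creation is not timed.
    # It counts invocations (not distinct watchers) and signals the condvar
    # exactly once, when the count reaches zero.
    my $cb = sub { $cv->send unless --$left };

    my $t0 = time;
    my @w;
    for (1 .. $n / 2) {
        push @w, AnyEvent->timer(after => 0, cb => $cb);
        push @w, AnyEvent->io(fh => \*STDOUT, poll => 'w', cb => $cb);
    }
    my $t1 = time;    # end of the create phase

    $cv->recv;        # run the event loop until the callback signals
    my $t2 = time;    # end of the invoke phase

    @w = ();          # dropping the last references destroys the watchers
    my $t3 = time;    # end of the destroy phase

    return map { 1e6 * $_ / $n } $t1 - $t0, $t2 - $t1, $t3 - $t2;
}

my ($create, $invoke, $destroy) = bench_watchers(1000);
printf "create %.2f  invoke %.2f  destroy %.2f (us per watcher)\n",
    $create, $invoke, $destroy;
```

The sketch does not measure the I<bytes> column; that would need the process's resident set size sampled before and after watcher creation, which is platform-specific.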
891 901
902=head2 Results
903
892 name watcher bytes create invoke destroy comment 904 name watchers bytes create invoke destroy comment
893 EV/EV 400000 244 0.56 0.46 0.31 EV native interface 905 EV/EV 400000 244 0.56 0.46 0.31 EV native interface
894 EV/Any 100000 610 3.52 0.91 0.75 906 EV/Any 100000 610 3.52 0.91 0.75 EV + AnyEvent watchers
895 CoroEV/Any 100000 610 3.49 0.92 0.75 coroutines + Coro::Signal 907 CoroEV/Any 100000 610 3.49 0.92 0.75 coroutines + Coro::Signal
896 Perl/Any 10000 654 4.64 1.22 0.77 pure perl implementation 908 Perl/Any 16000 654 4.64 1.22 0.77 pure perl implementation
897 Event/Event 10000 523 28.05 21.38 5.22 Event native interface 909 Event/Event 16000 523 28.05 21.38 0.86 Event native interface
898 Event/Any 10000 943 34.43 20.48 1.39 910 Event/Any 16000 943 34.43 20.48 1.39 Event + AnyEvent watchers
899 Glib/Any 16000 1357 96.99 12.55 55.51 quadratic behaviour 911 Glib/Any 16000 1357 96.99 12.55 55.51 quadratic behaviour
900 Tk/Any 2000 1855 27.01 66.61 14.03 SEGV with >> 2000 watchers 912 Tk/Any 2000 1855 27.01 66.61 14.03 SEGV with >> 2000 watchers
913 POE/Event 2000 6644 108.15 768.19 14.33 via POE::Loop::Event
901 POE/Select 2000 6343 94.69 807.65 562.69 POE::Loop::Select 914 POE/Select 2000 6343 94.69 807.65 562.69 via POE::Loop::Select
902 POE/Event 2000 6644 108.15 768.19 14.33 POE::Loop::Event
903 915
904Discussion: The benchmark does I<not> bench scalability of the 916=head2 Discussion
917
918The benchmark does I<not> measure scalability of the event loop very
905backend. For example a select-based backend (such as the pureperl one) can 919well. For example, a select-based event loop (such as the pure perl one)
906never compete with a backend using epoll. In this benchmark, only a single 920can never compete with an event loop that uses epoll when the number of
907filehandle is used. 921file descriptors grows high. In this benchmark, only a single filehandle
922is used (although some of the AnyEvent adaptors dup() its file descriptor
923to work around bugs).
908 924
909EV is the sole leader regarding speed and memory use, which are both 925C<EV> is the sole leader regarding speed and memory use, which are both
910maximal/minimal. Even when going through AnyEvent, there is only one event 926maximal/minimal, respectively. Even when going through AnyEvent, there is
911loop that uses less memory (the Event module natively), and no faster 927only one event loop that uses less memory (the C<Event> module natively), and
912event model. 928no faster event model, not even C<Event> natively.
913 929
914The pure perl implementation is hit in a few sweet spots (both the 930The pure perl implementation is hit in a few sweet spots (both the
915zero timeout and the use of a single fd hit optimisations in the perl 931zero timeout and the use of a single fd hit optimisations in the perl
916interpreter and the backend itself), but it shows that it adds very little 932interpreter and the backend itself). Nevertheless this shows that it
917overhead in itself. Like any select-based backend it's performance becomes 933adds very little overhead in itself. Like any select-based backend its
918really bad with lots of file descriptors. 934performance becomes really bad with lots of file descriptors, of course,
935but this was not the subject of this benchmark.
919 936
920The Event module has a relatively high setup and callback invocation cost, 937The C<Event> module has a relatively high setup and callback invocation cost,
921but overall scores on the third place. 938but overall scores on the third place.
922 939
923Glib has a little higher memory cost, a bit fster callback invocation and 940C<Glib>'s memory usage is quite a bit higher, but it features a
924has a similar speed as Event. 941faster callback invocation and overall ends up in the same class as
942C<Event>. However, Glib scales extremely badly, doubling the number of
943watchers increases the processing time by more than a factor of four,
944making it completely unusable when using larger numbers of watchers
945(note that only a single file descriptor was used in the benchmark, so
946inefficiencies of C<poll> do not account for this).
925 947
926The Tk backend works relatively well, the fact that it crashes with 948The C<Tk> adaptor works relatively well. The fact that it crashes with
927more than 2000 watchers is a big setback, however, as correctness takes 949more than 2000 watchers is a big setback, however, as correctness takes
928precedence over speed. 950precedence over speed. Nevertheless, its performance is surprising, as the
951file descriptor is dup()ed for each watcher. This shows that the dup()
952employed by some adaptors is not a big performance issue (it does incur a
953hidden memory cost inside the kernel, though, that is not reflected in the
954figures above).
929 955
930POE, regardless of backend (wether it's pure perl select backend or the 956C<POE>, regardless of underlying event loop (whether using its pure perl
931Event backend) shows abysmal performance and memory usage: Watchers use 957select-based backend or the Event module) shows abysmal performance and
932almost 30 times as much memory as EV watchers, and 10 times as much memory 958memory usage: Watchers use almost 30 times as much memory as EV watchers,
933as both Event or EV via AnyEvent. 959and 10 times as much memory as both Event or EV via AnyEvent. Watcher
960invocation is almost 700 times slower than with AnyEvent's pure perl
961implementation. The design of the POE adaptor class in AnyEvent can not
962really account for this, as session creation overhead is small compared
963to execution of the state machine, which is coded pretty optimally within
964L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow.
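
The ratios quoted in this paragraph can be checked against the results table. The snippet below is only a worked-arithmetic sketch: the figures are copied from the table, while the hash names and the C<ratio> helper are made up for illustration.

```perl
use strict;
use warnings;

# Per-watcher memory use (bytes) and invoke time (microseconds),
# copied from the results table.
my %bytes = (
    'EV/EV'      => 244,
    'EV/Any'     => 610,
    'Event/Any'  => 943,
    'POE/Event'  => 6644,
    'POE/Select' => 6343,
);
my %invoke = (
    'Perl/Any'   => 1.22,
    'POE/Event'  => 768.19,
    'POE/Select' => 807.65,
);

# Hypothetical helper: how many times larger $x is than $y.
sub ratio {
    my ($x, $y) = @_;
    return $x / $y;
}

printf "POE/Event vs EV/EV memory:       %.1fx\n",
    ratio($bytes{'POE/Event'}, $bytes{'EV/EV'});
printf "POE/Event vs EV/Any memory:      %.1fx\n",
    ratio($bytes{'POE/Event'}, $bytes{'EV/Any'});
printf "POE/Event vs Event/Any memory:   %.1fx\n",
    ratio($bytes{'POE/Event'}, $bytes{'Event/Any'});
printf "POE/Select vs Perl/Any invoke:   %.0fx\n",
    ratio($invoke{'POE/Select'}, $invoke{'Perl/Any'});
printf "POE/Event vs Perl/Any invoke:    %.0fx\n",
    ratio($invoke{'POE/Event'}, $invoke{'Perl/Any'});
```

With these figures, POE/Event's memory ratio to native EV comes out at roughly 27, and POE/Select's invoke-time ratio to the pure perl implementation is in the 600s. This matches the "almost 30 times" and "almost 700 times" in the text. The "10 times" figure fits EV via AnyEvent (about 11) better than Event via AnyEvent (about 7).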
934 965
966=head2 Summary
967
935Summary: using EV through AnyEvent is faster than any other event 968Using EV through AnyEvent is faster than any other event loop, but most
936loop. The overhead AnyEvent adds can be very small, and you should avoid 969event loops have acceptable performance with or without AnyEvent.
937POE like the plague if you want performance or reasonable memory usage. 970
971The overhead AnyEvent adds is usually much smaller than the overhead of
972the actual event loop; only with extremely fast event loops such as EV
973does AnyEvent add significant overhead.
974
975And you should simply avoid POE like the plague if you want performance or
976reasonable memory usage.
938 977
939 978
940=head1 FORK 979=head1 FORK
941 980
942Most event libraries are not fork-safe. The ones who are usually are 981Most event libraries are not fork-safe. The ones who are usually are
