… | |
… | |
864 | |
864 | |
865 | |
865 | |
866 | =head1 BENCHMARK |
866 | =head1 BENCHMARK |
867 | |
867 | |
868 | To give you an idea of the performance and overheads that AnyEvent adds |
868 | To give you an idea of the performance and overheads that AnyEvent adds |
|
|
869 | over the event loops themselves (and to give you an impression of the |
869 | over the event loops directly, here is a benchmark of various supported |
870 | speed of various event loops), here is a benchmark of various supported |
870 | event models natively and with AnyEvent. The benchmark creates a lot of |
871 | event models natively and with AnyEvent. The benchmark creates a lot of |
871 | timers (with a zero timeout) and io watchers (watching STDOUT, a pty, to |
872 | timers (with a zero timeout) and io watchers (watching STDOUT, a pty, to |
872 | become writable, which it is), lets them fire exactly once and destroys |
873 | become writable, which it is), lets them fire exactly once and destroys |
873 | them again. |
874 | them again. |
874 | |
875 | |
875 | =head2 Explanation of the fields |
876 | =head2 Explanation of the columns |
876 | |
877 | |
877 | I<watcher> is the number of event watchers created/destroyed. Since |
878 | I<watcher> is the number of event watchers created/destroyed. Since |
878 | different event models feature vastly different performances, each event |
879 | different event models feature vastly different performances, each event |
879 | loop was given a number of watchers so that overall runtime is acceptable |
880 | loop was given a number of watchers so that overall runtime is acceptable |
880 | and similar between tested event loops (and keep them from crashing): Glib |
881 | and similar between tested event loops (and keep them from crashing): Glib |
… | |
… | |
890 | all watchers, to avoid adding memory overhead. That means closure creation |
891 | all watchers, to avoid adding memory overhead. That means closure creation |
891 | and memory usage is not included in the figures. |
892 | and memory usage is not included in the figures. |
892 | |
893 | |
893 | I<invoke> is the time, in microseconds, used to invoke a simple |
894 | I<invoke> is the time, in microseconds, used to invoke a simple |
894 | callback. The callback simply counts down a Perl variable and after it was |
895 | callback. The callback simply counts down a Perl variable and after it was |
895 | invoked "watcher" times, it would C<< ->broadcast >> a condvar once. |
896 | invoked "watcher" times, it would C<< ->broadcast >> a condvar once to |
|
|
897 | signal the end of this phase. |
896 | |
898 | |
897 | I<destroy> is the time, in microseconds, that it takes destroy a single |
899 | I<destroy> is the time, in microseconds, that it takes to destroy a single |
898 | watcher. |
900 | watcher. |
899 | |
901 | |
900 | =head2 Results |
902 | =head2 Results |
901 | |
903 | |
902 | name watcher bytes create invoke destroy comment |
904 | name watcher bytes create invoke destroy comment |
903 | EV/EV 400000 244 0.56 0.46 0.31 EV native interface |
905 | EV/EV 400000 244 0.56 0.46 0.31 EV native interface |
904 | EV/Any 100000 610 3.52 0.91 0.75 |
906 | EV/Any 100000 610 3.52 0.91 0.75 EV + AnyEvent watchers |
905 | CoroEV/Any 100000 610 3.49 0.92 0.75 coroutines + Coro::Signal |
907 | CoroEV/Any 100000 610 3.49 0.92 0.75 coroutines + Coro::Signal |
906 | Perl/Any 10000 654 4.64 1.22 0.77 pure perl implementation |
908 | Perl/Any 16000 654 4.64 1.22 0.77 pure perl implementation |
907 | Event/Event 10000 523 28.05 21.38 5.22 Event native interface |
909 | Event/Event 16000 523 28.05 21.38 0.86 Event native interface |
908 | Event/Any 10000 943 34.43 20.48 1.39 |
910 | Event/Any 16000 943 34.43 20.48 1.39 Event + AnyEvent watchers |
909 | Glib/Any 16000 1357 96.99 12.55 55.51 quadratic behaviour |
911 | Glib/Any 16000 1357 96.99 12.55 55.51 quadratic behaviour |
910 | Tk/Any 2000 1855 27.01 66.61 14.03 SEGV with >> 2000 watchers |
912 | Tk/Any 2000 1855 27.01 66.61 14.03 SEGV with >> 2000 watchers |
|
|
913 | POE/Event 2000 6644 108.15 768.19 14.33 via POE::Loop::Event |
911 | POE/Select 2000 6343 94.69 807.65 562.69 POE::Loop::Select |
914 | POE/Select 2000 6343 94.69 807.65 562.69 via POE::Loop::Select |
912 | POE/Event 2000 6644 108.15 768.19 14.33 POE::Loop::Event |
|
|
913 | |
915 | |
914 | =head2 Discussion |
916 | =head2 Discussion |
915 | |
917 | |
916 | The benchmark does I<not> measure scalability of the event loop very |
918 | The benchmark does I<not> measure scalability of the event loop very |
917 | well. For example, a select-based event loop (such as the pure perl one) |
919 | well. For example, a select-based event loop (such as the pure perl one) |
… | |
… | |
943 | precedence over speed. Nevertheless, its performance is surprising, as the |
945 | precedence over speed. Nevertheless, its performance is surprising, as the |
944 | file descriptor is dup()ed for each watcher. This shows that the dup() |
946 | file descriptor is dup()ed for each watcher. This shows that the dup() |
945 | employed by some adaptors is not a big performance issue (it does incur a |
947 | employed by some adaptors is not a big performance issue (it does incur a |
946 | hidden memory cost inside the kernel, though). |
948 | hidden memory cost inside the kernel, though). |
947 | |
949 | |
948 | C<POE>, regardless of backend (whether using its pure perl select-based |
950 | C<POE>, regardless of underlying event loop (whether using its pure perl |
949 | backend or the Event backend) shows abysmal performance and memory |
951 | select-based backend or the Event module) shows abysmal performance and |
950 | usage: Watchers use almost 30 times as much memory as EV watchers, and 10 |
952 | memory usage: Watchers use almost 30 times as much memory as EV watchers, |
951 | times as much memory as either Event or EV via AnyEvent. Watcher invocation |
953 | and 10 times as much memory as either Event or EV via AnyEvent. Watcher |
952 | is almost 700 times slower as with AnyEvent's pure perl implementation. |
954 | invocation is almost 700 times slower than with AnyEvent's pure perl |
|
|
955 | implementation. The design of the POE adaptor class in AnyEvent cannot |
|
|
956 | really account for this, as session creation overhead is small compared |
|
|
957 | to execution of the state machine, which is coded pretty optimally within |
|
|
958 | L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow. |
953 | |
959 | |
|
|
960 | =head2 Summary |
|
|
961 | |
954 | Summary: using EV through AnyEvent is faster than any other event |
962 | Using EV through AnyEvent is faster than any other event loop, but most |
955 | loop. The overhead AnyEvent adds can be very small, and you should avoid |
963 | event loops have acceptable performance with or without AnyEvent. |
956 | POE like the plague if you want performance or reasonable memory usage. |
964 | |
|
|
965 | The overhead AnyEvent adds is usually much smaller than the overhead of |
|
|
966 | the actual event loop; only with extremely fast event loops such as EV |


967 | does AnyEvent add significant overhead. |
|
|
968 | |
|
|
969 | And you should simply avoid POE like the plague if you want performance or |
|
|
970 | reasonable memory usage. |
957 | |
971 | |
958 | |
972 | |
959 | =head1 FORK |
973 | =head1 FORK |
960 | |
974 | |
961 | Most event libraries are not fork-safe. The ones who are usually are |
975 | Most event libraries are not fork-safe. The ones who are usually are |