--- AnyEvent/lib/AnyEvent.pm	2008/04/25 02:03:18	1.62
+++ AnyEvent/lib/AnyEvent.pm	2008/04/25 07:43:25	1.72
@@ -436,7 +436,7 @@
 
 use Carp;
 
-our $VERSION = '3.2';
+our $VERSION = '3.3';
 
 our $MODEL;
 our $AUTOLOAD;
@@ -862,6 +862,114 @@
 
    $quit->wait;
 
+
+=head1 BENCHMARK
+
+To give you an idea of the performance and overheads that AnyEvent adds
+over the event loops themselves (and to give you an impression of the
+speed of various event loops), here is a benchmark of various supported
+event models, natively and with AnyEvent. The benchmark creates a lot of
+timers (with a zero timeout) and I/O watchers (watching STDOUT, a pty, to
+become writable, which it is), lets them fire exactly once and destroys
+them again.
+
+=head2 Explanation of the columns
+
+I<watcher> is the number of event watchers created/destroyed. Since
+different event models feature vastly different performance, each event
+loop was given a number of watchers so that overall runtime is acceptable
+and similar between the tested event loops (and to keep them from
+crashing): Glib would probably take thousands of years if asked to
+process the same number of watchers as EV in this benchmark.
+
+I<bytes> is the number of bytes (as measured by the resident set size,
+RSS) consumed by each watcher. This method of measuring captures both C-
+and Perl-based overheads.
+
+I<create> is the time, in microseconds (millionths of seconds), that it
+takes to create a single watcher. The callback is a closure shared between
+all watchers, to avoid adding memory overhead. That means closure creation
+and memory usage are not included in the figures.
+
+I<invoke> is the time, in microseconds, used to invoke a simple
+callback. The callback simply counts down a Perl variable and, after it
+has been invoked "watcher" times, C<< ->broadcast >>s a condvar once to
+signal the end of this phase.
+
+I<destroy> is the time, in microseconds, that it takes to destroy a single
+watcher.
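+
+To give a rough idea of what is being measured, here is a minimal sketch
+of such a benchmark loop using the portable AnyEvent API. This is not the
+actual benchmark program; the watcher count and the even split between
+timers and I/O watchers are illustrative assumptions only.
+
+   use AnyEvent;
+
+   my $count = 10000;                # watchers to create (illustrative)
+   my $left  = $count;               # invocations still expected
+   my $done  = AnyEvent->condvar;
+
+   # a single callback closure is shared between all watchers
+   my $cb = sub {
+      $done->broadcast unless --$left;
+   };
+
+   # create phase: zero-timeout timers and write watchers on STDOUT
+   my @watchers;
+   push @watchers, AnyEvent->timer (after => 0, cb => $cb)
+      for 1 .. $count / 2;
+   push @watchers, AnyEvent->io (fh => \*STDOUT, poll => "w", cb => $cb)
+      for 1 .. $count / 2;
+
+   # invoke phase: run the event loop until the countdown reaches zero
+   $done->wait;
+
+   # destroy phase: dropping the references destroys the watchers
+   @watchers = ();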
+
+=head2 Results
+
+          name watcher  bytes create invoke destroy comment
+         EV/EV  400000    244   0.56   0.46    0.31 EV native interface
+        EV/Any  100000    610   3.52   0.91    0.75 EV + AnyEvent watchers
+    CoroEV/Any  100000    610   3.49   0.92    0.75 coroutines + Coro::Signal
+      Perl/Any   16000    654   4.64   1.22    0.77 pure perl implementation
+   Event/Event   16000    523  28.05  21.38    0.86 Event native interface
+     Event/Any   16000    943  34.43  20.48    1.39 Event + AnyEvent watchers
+      Glib/Any   16000   1357  96.99  12.55   55.51 quadratic behaviour
+        Tk/Any    2000   1855  27.01  66.61   14.03 SEGV with >> 2000 watchers
+     POE/Event    2000   6644 108.15 768.19   14.33 via POE::Loop::Event
+    POE/Select    2000   6343  94.69 807.65  562.69 via POE::Loop::Select
+
+=head2 Discussion
+
+The benchmark does I<not> measure scalability of the event loop very
+well. For example, a select-based event loop (such as the pure perl one)
+can never compete with an event loop that uses epoll when the number of
+file descriptors grows high. In this benchmark, only a single filehandle
+is used (although some of the AnyEvent adaptors dup() its file descriptor
+to work around bugs).
+
+C<EV> is the sole leader regarding speed and memory use, which are both
+maximal/minimal, respectively. Even when going through AnyEvent, there is
+only one event loop that uses less memory (the C<Event> module natively),
+and no faster event model, not even C<Event> natively.
+
+The pure perl implementation is hit in a few sweet spots (both the
+zero timeout and the use of a single fd hit optimisations in the perl
+interpreter and the backend itself). Nevertheless this shows that it
+adds very little overhead in itself. Like any select-based backend its
+performance becomes really bad with lots of file descriptors, of course,
+but this was not the subject of this benchmark.
+
+The C<Event> module has a relatively high setup and callback invocation
+cost, but overall comes in third place.
+
+C<Glib>'s memory usage is quite a bit higher, but it features a faster
+callback invocation and overall lands in the same class as C<Event>.
+
+The C<Tk> adaptor works relatively well. The fact that it crashes with
+more than 2000 watchers is a big setback, however, as correctness takes
+precedence over speed. Nevertheless, its performance is surprisingly good,
+given that the file descriptor is dup()ed for each watcher. This shows
+that the dup() employed by some adaptors is not a big performance issue
+(it does incur a hidden memory cost inside the kernel, though).
+
+C<POE>, regardless of the underlying event loop (whether using its pure
+perl select-based backend or the Event module), shows abysmal performance
+and memory usage: watchers use almost 30 times as much memory as EV
+watchers, and roughly 10 times as much memory as Event or EV via AnyEvent.
+Watcher invocation is almost 700 times slower than with AnyEvent's pure
+perl implementation. The design of the POE adaptor class in AnyEvent
+cannot really account for this, as session creation overhead is small
+compared to execution of the state machine, which is coded pretty
+optimally within L<AnyEvent::Impl::POE>. POE simply seems to be abysmally
+slow.
+
+=head2 Summary
+
+Using EV through AnyEvent is faster than any other event loop, but most
+event loops have acceptable performance with or without AnyEvent.
+
+The overhead AnyEvent adds is usually much smaller than the overhead of
+the actual event loop; only with extremely fast event loops such as EV
+does AnyEvent add significant overhead.
+
+And you should simply avoid POE like the plague if you want performance or
+reasonable memory usage.
+
+
 =head1 FORK
 
 Most event libraries are not fork-safe. The ones who are usually are
@@ -870,6 +978,7 @@
 If you have to fork, you must either do so I<before> creating your first
 watcher OR you must not use AnyEvent at all in the child.
 
+
 =head1 SECURITY CONSIDERATIONS
 
 AnyEvent can be forced to load any event model via
@@ -886,6 +995,7 @@
 
    use AnyEvent;
 
+
 =head1 SEE ALSO
 
 Event modules: L, L, L, L,
@@ -899,6 +1009,7 @@
 
 Nontrivial usage examples: L, L.
 
+
 =head1 AUTHOR
 
    Marc Lehmann