--- AnyEvent/lib/AnyEvent.pm	2008/04/25 06:54:08	1.64
+++ AnyEvent/lib/AnyEvent.pm	2008/04/25 09:08:16	1.79
@@ -138,7 +138,7 @@
 my variables are only visible after the statement in which they are
 declared.
 
-=head2 IO WATCHERS
+=head2 I/O WATCHERS
 
 You can create an I/O watcher by calling the C<< AnyEvent->io >> method
 with the following mandatory key-value pairs as arguments:
@@ -708,7 +708,7 @@
 
 =head1 EXAMPLE PROGRAM
 
-The following program uses an IO watcher to read data from STDIN, a timer
+The following program uses an I/O watcher to read data from STDIN, a timer
 to display a message once per second, and a condition variable to quit
 the program when the user enters quit:
 
@@ -865,74 +865,122 @@
 =head1 BENCHMARK
 
-To give you an idea of the performance an doverheads that AnyEvent adds
-over the backends, here is a benchmark of various supported backends. The
-benchmark creates a lot of timers (with zero timeout) and io events
-(watching STDOUT, a pty, to become writable).
-
-Explanation of the fields:
-
-I<watcher> is the number of event watchers created/destroyed. Sicne
-different event models have vastly different performance each backend was
-handed a number of watchers so that overall runtime is acceptable and
-similar to all backends (and keep them from crashing).
-
-I<bytes> is the number of bytes (as measured by resident set size) used by
-each watcher.
-
-I<create> is the time, in microseconds, to create a single watcher.
-
-I<invoke> is the time, in microseconds, used to invoke a simple callback
-that simply counts down.
-
-I<destroy> is the time, in microseconds, to destroy a single watcher.
-
-   name         watcher bytes create invoke destroy comment
-   EV/EV         400000   244   0.56   0.46    0.31 EV native interface
-   EV/Any        100000   610   3.52   0.91    0.75
-   CoroEV/Any    100000   610   3.49   0.92    0.75 coroutines + Coro::Signal
-   Perl/Any       10000   654   4.64   1.22    0.77 pure perl implementation
-   Event/Event    10000   523  28.05  21.38    5.22 Event native interface
-   Event/Any      10000   943  34.43  20.48    1.39
-   Glib/Any       16000  1357  96.99  12.55   55.51 quadratic behaviour
-   Tk/Any          2000  1855  27.01  66.61   14.03 SEGV with >> 2000 watchers
-   POE/Select      2000  6343  94.69 807.65  562.69 POE::Loop::Select
-   POE/Event       2000  6644 108.15 768.19   14.33 POE::Loop::Event
-
-Discussion: The benchmark does I<not> bench scalability of the
-backend. For example a select-based backend (such as the pureperl one) can
-never compete with a backend using epoll. In this benchmark, only a single
-filehandle is used.
-
-EV is the sole leader regarding speed and memory use, which are both
-maximal/minimal. Even when going through AnyEvent, there is only one event
-loop that uses less memory (the Event module natively), and no faster
-event model.
+To give you an idea of the performance and overheads that AnyEvent adds
+over the event loops themselves (and to give you an impression of the
+speed of various event loops), here is a benchmark of various supported
+event models natively and with AnyEvent. The benchmark creates a lot of
+timers (with a zero timeout) and I/O watchers (watching STDOUT, a pty, to
+become writable, which it is), lets them fire exactly once and destroys
+them again.
+
+Rewriting the benchmark to use many different sockets instead of using
+the same filehandle for all I/O watchers results in a much longer runtime
+(socket creation is expensive), but yields qualitatively the same figures,
+so it was not used.
+
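+The following sketch is I<not> the actual benchmark program (variable
+names are made up for this illustration, and the real benchmark also
+creates I/O watchers on STDOUT and times each phase separately), but it
+shows roughly how each run is structured:
+
+   use AnyEvent;
+
+   my $count   = 10_000;          # scaled per event loop in the real runs
+   my $pending = $count;
+   my $done    = AnyEvent->condvar;
+
+   # one closure shared by all watchers, so closure creation and its
+   # memory use do not show up in the per-watcher figures
+   my $cb = sub {
+      $done->broadcast unless --$pending;
+   };
+
+   # create phase: a zero-timeout timer per watcher
+   my @watchers = map { AnyEvent->timer (after => 0, cb => $cb) } 1 .. $count;
+
+   # invoke phase: run the event loop until every watcher has fired once
+   $done->wait;
+
+   # destroy phase: drop all watchers again
+   @watchers = ();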
+
+=head2 Explanation of the columns
+
+I<watcher> is the number of event watchers created/destroyed. Since
+different event models feature vastly different performances, each event
+loop was given a number of watchers so that overall runtime is acceptable
+and similar between the tested event loops (and to keep them from
+crashing): Glib would probably take thousands of years if asked to
+process the same number of watchers as EV in this benchmark.
+
+I<bytes> is the number of bytes (as measured by the resident set size,
+RSS) consumed by each watcher. This method of measuring captures both C
+and Perl-based overheads.
+
+I<create> is the time, in microseconds (millionths of seconds), that it
+takes to create a single watcher. The callback is a closure shared between
+all watchers, to avoid adding memory overhead. That means closure creation
+and memory usage are not included in the figures.
+
+I<invoke> is the time, in microseconds, used to invoke a simple
+callback. The callback simply counts down a Perl variable and, after it
+has been invoked "watcher" times, it will C<< ->broadcast >> a condvar
+once to signal the end of this phase.
+
+I<destroy> is the time, in microseconds, that it takes to destroy a single
+watcher.
+
+=head2 Results
+
+   name        watchers bytes create invoke destroy comment
+   EV/EV         400000   244   0.56   0.46    0.31 EV native interface
+   EV/Any        100000   610   3.52   0.91    0.75 EV + AnyEvent watchers
+   CoroEV/Any    100000   610   3.49   0.92    0.75 coroutines + Coro::Signal
+   Perl/Any      100000   513   4.91   0.92    1.15 pure perl implementation
+   Event/Event    16000   523  28.05  21.38    0.86 Event native interface
+   Event/Any      16000   943  34.43  20.48    1.39 Event + AnyEvent watchers
+   Glib/Any       16000  1357  96.99  12.55   55.51 quadratic behaviour
+   Tk/Any          2000  1855  27.01  66.61   14.03 SEGV with >> 2000 watchers
+   POE/Event       2000  6644 108.15 768.19   14.33 via POE::Loop::Event
+   POE/Select      2000  6343  94.69 807.65  562.69 via POE::Loop::Select
+
+=head2 Discussion
+
+The benchmark does I<not> measure scalability of the event loop very
+well. For example, a select-based event loop (such as the pure perl one)
+can never compete with an event loop that uses epoll when the number of
+file descriptors grows high. In this benchmark, only a single filehandle
+is used (although some of the AnyEvent adaptors dup() its file descriptor
+to work around bugs).
+
+C<EV> is the sole leader regarding speed and memory use, which are both
+maximal/minimal, respectively. Even when going through AnyEvent, there are
+only two event loops that use slightly less memory (the C<Event> module
+natively and the pure perl backend), and no faster event models, not even
+C<Event> natively.
 
 The pure perl implementation is hit in a few sweet spots (both the zero
 timeout and the use of a single fd hit optimisations in the perl
-interpreter and the backend itself), but it shows that it adds very little
-overhead in itself. Like any select-based backend it's performance becomes
-really bad with lots of file descriptors.
+interpreter and the backend itself, and all watchers become ready at the
+same time). Nevertheless, this shows that it adds very little overhead in
+itself. Like any select-based backend its performance becomes really bad
+with lots of file descriptors (and few of them active), of course, but
+this was not the subject of this benchmark.
 
-The Event module has a relatively high setup and callback invocation cost,
+The C<Event> module has a relatively high setup and callback invocation cost,
 but overall comes in third place.
 
-Glib has a little higher memory cost, a bit fster callback invocation and
-has a similar speed as Event.
+C<Glib>'s memory usage is quite a bit higher, but it features a
+faster callback invocation and overall ends up in the same class as
+C<Event>. However, Glib scales extremely badly: doubling the number of
+watchers increases the processing time by more than a factor of four,
+making it completely unusable when using larger numbers of watchers
+(note that only a single file descriptor was used in the benchmark, so
+inefficiencies of C<poll> do not account for this).
 
-The Tk backend works relatively well, the fact that it crashes with
+The C<Tk> adaptor works relatively well. The fact that it crashes with
 more than 2000 watchers is a big setback, however, as correctness takes
-precedence over speed.
+precedence over speed. Nevertheless, its performance is surprising, as the
+file descriptor is dup()ed for each watcher. This shows that the dup()
+employed by some adaptors is not a big performance issue (it does incur a
+hidden memory cost inside the kernel, though, that is not reflected in the
+figures above).
+
+C<POE>, regardless of underlying event loop (whether using its pure perl
+select-based backend or the Event module) shows abysmal performance and
+memory usage: Watchers use almost 30 times as much memory as EV watchers,
+and 10 times as much memory as either Event or EV via AnyEvent. Watcher
+invocation is almost 900 times slower than with AnyEvent's pure perl
+implementation. The design of the POE adaptor class in AnyEvent cannot
+really account for this, as session creation overhead is small compared
+to execution of the state machine, which is coded pretty optimally within
+L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow.
+
+=head2 Summary
+
+Using EV through AnyEvent is faster than any other event loop, but most
+event loops have acceptable performance with or without AnyEvent.
+
+The overhead AnyEvent adds is usually much smaller than the overhead of
+the actual event loop; only with extremely fast event loops such as EV
+does AnyEvent add significant overhead.
-POE, regardless of backend (wether it's pure perl select backend or the
-Event backend) shows abysmal performance and memory usage: Watchers use
-almost 30 times as much memory as EV watchers, and 10 times as much memory
-as both Event or EV via AnyEvent.
-
-Summary: using EV through AnyEvent is faster than any other event
-loop. The overhead AnyEvent adds can be very small, and you should avoid
-POE like the plague if you want performance or reasonable memory usage.
+And you should simply avoid POE like the plague if you want performance or
+reasonable memory usage.
 
 =head1 FORK