
Comparing AnyEvent/lib/AnyEvent.pm (file contents):
Revision 1.86 by root, Fri Apr 25 14:01:48 2008 UTC vs.
Revision 1.100 by elmex, Sun Apr 27 19:15:43 2008 UTC

Of course, if you want lots of policy (this can arguably be somewhat
useful) and you want to force your users to use the one and only event
model, you should I<not> use this module.

#TODO#

Net::IRC3
AnyEvent::HTTPD
AnyEvent::DNS
IO::AnyEvent
Net::FPing
Net::XMPP2
Coro

AnyEvent::IRC
AnyEvent::HTTPD
AnyEvent::DNS
AnyEvent::Handle
AnyEvent::Socket
AnyEvent::FPing
AnyEvent::XMPP
AnyEvent::SNMP
Coro

=head1 DESCRIPTION

L<AnyEvent> provides an identical interface to multiple event loops. This
allows module authors to utilise an event loop without forcing module
users to use the same event loop (as only a single event loop can coexist
peacefully at any one time).

might choose the wrong one unless you load the correct one yourself.

You can choose to use a rather inefficient pure-perl implementation by
loading the C<AnyEvent::Impl::Perl> module, which gives you similar
behaviour everywhere, but letting AnyEvent choose is generally better.

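In code, forcing that pure-perl backend just means loading it before
AnyEvent settles on a model (a minimal sketch of the mechanism the
paragraph above describes):

   # force the (slower but consistent) pure-perl event loop by
   # loading its backend before AnyEvent autodetects a model
   use AnyEvent::Impl::Perl;
   use AnyEvent;

   # AnyEvent now uses AnyEvent::Impl::Perl regardless of which
   # other event modules are installed or loaded later
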
=head1 OTHER MODULES

L<AnyEvent> itself comes with useful utility modules:

To make it easier to do non-blocking I/O, the modules L<AnyEvent::Handle>
and L<AnyEvent::Socket> are provided. L<AnyEvent::Handle> provides read
and write buffers and manages watchers for reads and writes.
L<AnyEvent::Socket> provides means to do non-blocking connects.

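Together they allow a small non-blocking client along these lines (a
sketch only: the host is a placeholder, and it assumes the
C<tcp_connect>, C<push_write> and C<push_read> interfaces as they later
stabilised, since both modules were brand new at this revision):

   use AnyEvent;
   use AnyEvent::Socket;
   use AnyEvent::Handle;

   my $done = AnyEvent->condvar;

   # connect in the background, continue in the callback
   tcp_connect "www.example.com", "http", sub {
      my ($fh) = @_
         or die "connect failed: $!";

      my $handle; $handle = AnyEvent::Handle->new (
         fh       => $fh,
         on_error => sub { warn "error: $!\n"; $done->broadcast },
         on_eof   => sub { $done->broadcast },
      );

      # queue a request; the write buffer is flushed non-blockingly
      $handle->push_write ("GET / HTTP/1.0\015\012\015\012");

      # queue a read that fires once a full line has arrived
      $handle->push_read (line => sub {
         my ($handle, $line) = @_;
         warn "response status line: $line\n";
         $done->broadcast;
      });
   };

   $done->wait;
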
Aside from those, there are these modules that support AnyEvent (and use
it for non-blocking I/O):

=over 4

=item L<AnyEvent::FastPing>

=item L<Net::IRC3>

=item L<Net::XMPP2>

=back

=cut

package AnyEvent;

   });

   $quit->wait;

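For reference (the hunk above only preserves the tail of that example), a
complete condvar-based quit sequence of this era looks like the following
sketch; the timer and message are illustrative only, not the original
example's:

   use AnyEvent;

   my $quit = AnyEvent->condvar;

   my $w = AnyEvent->timer (after => 2, cb => sub {
      warn "timed out, quitting\n";
      $quit->broadcast;   # makes the $quit->wait below return
   });

   $quit->wait;   # run the event loop until ->broadcast is called
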
=head1 BENCHMARKS

To give you an idea of the performance and overheads that AnyEvent adds
over the event loops themselves, and to give you an impression of the
speed of various event loops, I prepared some benchmarks.

=head2 BENCHMARKING ANYEVENT OVERHEAD

Here is a benchmark of various supported event models used natively and
through AnyEvent. The benchmark creates a lot of timers (with a zero
timeout) and I/O watchers (watching STDOUT, a pty, to become writable,
which it is), lets them fire exactly once and destroys them again.

Source code for this benchmark is found as F<eg/bench> in the AnyEvent
distribution.

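The shape of that benchmark's inner loop is roughly the following sketch
(hypothetical and heavily simplified: the watcher count, the condvar
bookkeeping and the absence of any timing or memory measurement are all
mine, not F<eg/bench>'s):

   use AnyEvent;

   my $count     = 10000;            # watchers of each kind
   my $remaining = 2 * $count;
   my $done      = AnyEvent->condvar;

   my @watchers;
   for my $i (0 .. $count - 1) {
      # timer with a zero timeout: fires as soon as the loop runs
      $watchers[2 * $i] = AnyEvent->timer (after => 0, cb => sub {
         undef $watchers[2 * $i];    # fire exactly once, then destroy
         $done->broadcast unless --$remaining;
      });
      # I/O watcher on STDOUT, which is immediately writable
      $watchers[2 * $i + 1] = AnyEvent->io (
         fh => \*STDOUT, poll => 'w', cb => sub {
            undef $watchers[2 * $i + 1];
            $done->broadcast unless --$remaining;
         },
      );
   }

   $done->wait;   # run the loop until every watcher has fired once
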
=head3 Explanation of the columns

I<watcher> is the number of event watchers created/destroyed. Since
different event models feature vastly different performances, each event
loop was given a number of watchers so that overall runtime is acceptable
and similar between the tested event loops (and to keep them from
crashing): Glib

signal the end of this phase.

I<destroy> is the time, in microseconds, that it takes to destroy a single
watcher.

=head3 Results

   name       watchers bytes create invoke destroy comment
   EV/EV        400000   244   0.56   0.46    0.31 EV native interface
   EV/Any       100000   244   2.50   0.46    0.29 EV + AnyEvent watchers
   CoroEV/Any   100000   244   2.49   0.44    0.29 coroutines + Coro::Signal
   Perl/Any     100000   513   4.92   0.87    1.12 pure perl implementation
   Event/Event   16000   516  31.88  31.30    0.85 Event native interface
   Event/Any     16000   590  35.75  31.42    1.08 Event + AnyEvent watchers
   Glib/Any      16000  1357  98.22  12.41   54.00 quadratic behaviour
   Tk/Any         2000  1860  26.97  67.98   14.00 SEGV with >> 2000 watchers
   POE/Event      2000  6644 108.64 736.02   14.73 via POE::Loop::Event
   POE/Select     2000  6343  94.13 809.12  565.96 via POE::Loop::Select

=head3 Discussion

The benchmark does I<not> measure scalability of the event loop very
well. For example, a select-based event loop (such as the pure perl one)
can never compete with an event loop that uses epoll when the number of
file descriptors grows high. In this benchmark, all events become ready at
the same time, so select/poll-based implementations get an unnatural speed
boost.

Also, note that the number of watchers usually has a nonlinear effect on
overall speed, that is, creating twice as many watchers doesn't take twice
the time - usually it takes longer. This puts event loops tested with a
higher number of watchers at a disadvantage.

To put the range of results into perspective, consider that on the
benchmark machine, handling an event takes roughly 1600 CPU cycles with
EV, 3100 CPU cycles with AnyEvent's pure perl loop and almost 3000000 CPU
cycles with POE.

C<EV> is the sole leader regarding speed and memory use, which are both
maximal/minimal, respectively. Even when going through AnyEvent, it uses
far less memory than any other event loop and is still faster than Event
natively.

interpreter and the backend itself). Nevertheless this shows that it
adds very little overhead in itself. Like any select-based backend its
performance becomes really bad with lots of file descriptors (and few of
them active), of course, but this was not the subject of this benchmark.

The C<Event> module has a relatively high setup and callback invocation
cost, but overall comes in third place.

C<Glib>'s memory usage is quite a bit higher, but it features a
faster callback invocation and overall ends up in the same class as
C<Event>. However, Glib scales extremely badly: doubling the number of
watchers increases the processing time by more than a factor of four,
making it completely unusable when using larger numbers of watchers
(note that only a single file descriptor was used in the benchmark, so

The C<Tk> adaptor works relatively well. The fact that it crashes with
more than 2000 watchers is a big setback, however, as correctness takes
precedence over speed. Nevertheless, its performance is surprising, as the
file descriptor is dup()ed for each watcher. This shows that the dup()
employed by some adaptors is not a big performance issue (it does incur a
hidden memory cost inside the kernel which is not reflected in the figures
above).

C<POE>, regardless of underlying event loop (whether using its pure
perl select-based backend or the Event module; the POE-EV backend
couldn't be tested because it wasn't working), shows abysmal performance
and memory usage: watchers use almost 30 times as much memory as
EV watchers, and 10 times as much memory as Event (the high memory
requirements are caused by requiring a session for each watcher). Watcher
invocation speed is almost 900 times slower than with AnyEvent's pure perl
implementation. The design of the POE adaptor class in AnyEvent cannot
really account for this, as session creation overhead is small compared
to execution of the state machine, which is coded pretty optimally within
L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow.

=head3 Summary

=over 4

=item * Using EV through AnyEvent is faster than any other event loop
(even when used without AnyEvent), but most event loops have acceptable
performance with or without AnyEvent.

=item * The overhead AnyEvent adds is usually much smaller than the
overhead of the actual event loop; only with extremely fast event loops
such as EV does AnyEvent add significant overhead.

=item * You should avoid POE like the plague if you want performance or
reasonable memory usage.

=back

=head2 BENCHMARKING THE LARGE SERVER CASE

This benchmark actually benchmarks the event loop itself. It works by
creating a number of "servers": each server consists of a socketpair, a
timeout watcher that gets reset on activity (but never fires), and an I/O
watcher waiting for input on one side of the socket. Each time the socket
watcher reads a byte it will write that byte to a random other "server".

The effect is that there will be a lot of I/O watchers, only part of which
are active at any one point (so there is a constant number of active
fds for each loop iteration, but which fds these are is random). The
timeout is reset each time something is read because that reflects how
most timeouts work (and puts extra pressure on the event loops).

In this benchmark, we use 10000 socketpairs (20000 sockets), of which 100
(1%) are active. This mirrors the activity of large servers with many
connections, most of which are idle at any one point in time.

Source code for this benchmark is found as F<eg/bench2> in the AnyEvent
distribution.

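One "server" in that scheme might be set up as in the following sketch (a
hypothetical, heavily simplified reading of the description above: the
timeout value, the kick-off writes and the absence of timing code are
mine, not F<eg/bench2>'s):

   use AnyEvent;
   use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

   my @servers;

   sub make_server {
      socketpair my $fh, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC
         or die "socketpair: $!";

      my $server = { fh => $fh, peer => $peer };

      # timeout watcher: reset on every read, never actually fires
      my $reset_timeout = sub {
         $server->{timeout} = AnyEvent->timer (after => 60, cb => sub { });
      };
      $reset_timeout->();

      # I/O watcher: read a byte, forward it to a random other server
      $server->{io} = AnyEvent->io (fh => $fh, poll => 'r', cb => sub {
         sysread $fh, my $buf, 1
            or return;
         $reset_timeout->();   # activity resets the timeout
         my $target = $servers[rand @servers];
         syswrite $target->{peer}, $buf;
      });

      push @servers, $server;
   }

   make_server () for 1 .. 10000;

   # make 100 (1%) of the servers active, then run the loop forever
   syswrite $servers[$_ * 100]{peer}, "x" for 0 .. 99;
   AnyEvent->condvar->wait;
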
=head3 Explanation of the columns

I<sockets> is the number of sockets, and twice the number of "servers" (as
each server has a read and write socket end).

I<create> is the time it takes to create a socketpair (which is
nontrivial) and two watchers: an I/O watcher and a timeout watcher.

I<request>, the most important value, is the time it takes to handle a
single "request", that is, reading the token from the pipe and forwarding
it to another server. This includes deleting the old timeout and creating
a new one that moves the timeout into the future.

=head3 Results

   name  sockets create  request
   EV      20000  69.01    11.16
   Perl    20000  73.32    35.87
   Event   20000 212.62   257.32
   Glib    20000 651.16  1896.30
   POE     20000 349.67 12317.24 uses POE::Loop::Event

=head3 Discussion

This benchmark I<does> measure scalability and overall performance of the
particular event loop.

EV is again fastest. Since it is using epoll on my system, the setup time
is relatively high, though.

Perl surprisingly comes second. It is much faster than the C-based event
loops Event and Glib.

Event suffers from high setup time as well (look at its code and you will
understand why). Callback invocation also has a high overhead compared to
the C<< $_->() for .. >>-style loop that the Perl event loop uses. Event
uses select or poll in basically all documented configurations.

Glib is hit hard by its quadratic behaviour w.r.t. many watchers. It
clearly fails to perform with many filehandles or in busy servers.

POE is still completely out of the picture, taking over 1000 times as long
as EV, and over 100 times as long as the Perl implementation, even though
it uses a C-based event loop in this case.

=head3 Summary

=over 4

=item * The pure perl implementation performs extremely well, considering
that it uses select.

=item * Avoid Glib or POE in large projects where performance matters.

=back

=head2 BENCHMARKING SMALL SERVERS

While event loops should scale (and select-based ones do not...) even to
large servers, most programs we (or I :) actually write have only a few
I/O watchers.

In this benchmark, I use the same benchmark program as in the large server
case, but it uses only eight "servers", of which three are active at any
one time. This should reflect performance for a small server relatively
well.

The columns are identical to the previous table.

=head3 Results

   name  sockets create request
   EV         16  20.00    6.54
   Perl       16  25.75   12.62
   Event      16  81.27   35.86
   Glib       16  32.63   15.48
   POE        16 261.87  276.28 uses POE::Loop::Event

=head3 Discussion

The benchmark tries to test the performance of a typical small
server. While knowing how various event loops perform is interesting, keep
in mind that their overhead in this case is usually not as important, due
to the small absolute number of watchers (that is, you need efficiency and
speed most when you have lots of watchers, not when you only have a few of
them).

EV is again fastest.

The C-based event loops Event and Glib come in second this time, as the
overhead of running an iteration is much smaller in C than in Perl (little
code to execute in the inner loop, and perl's function calling overhead is
high, and updating all the data structures is costly).

The pure perl event loop is much slower, but still competitive.

POE also performs much better in this case, but it is still far behind the
others.

=head3 Summary

=over 4

=item * C-based event loops perform very well with a small number of
watchers, as the management overhead dominates.

=back

=head1 FORK

Most event libraries are not fork-safe. The ones that are usually are
