Comparing AnyEvent/lib/AnyEvent.pm (file contents):
Revision 1.90 by root, Fri Apr 25 14:24:29 2008 UTC vs.
Revision 1.99 by root, Sun Apr 27 17:09:33 2008 UTC

66 66
67Of course, if you want lots of policy (this can arguably be somewhat 67Of course, if you want lots of policy (this can arguably be somewhat
68useful) and you want to force your users to use the one and only event 68useful) and you want to force your users to use the one and only event
69model, you should I<not> use this module. 69model, you should I<not> use this module.
70 70
71#TODO#
72
73Net::IRC3
74AnyEvent::HTTPD
75AnyEvent::DNS
76IO::AnyEvent
77Net::FPing
78Net::XMPP2
79Coro
80
81AnyEvent::IRC
82AnyEvent::HTTPD
83AnyEvent::DNS
84AnyEvent::Handle
85AnyEvent::Socket
86AnyEvent::FPing
87AnyEvent::XMPP
88AnyEvent::SNMP
89Coro
71 90
72=head1 DESCRIPTION 91=head1 DESCRIPTION
73 92
74L<AnyEvent> provides an identical interface to multiple event loops. This 93L<AnyEvent> provides an identical interface to multiple event loops. This
75allows module authors to utilise an event loop without forcing module 94allows module authors to utilise an event loop without forcing module
894 }); 913 });
895 914
896 $quit->wait; 915 $quit->wait;
897 916
898 917
899=head1 BENCHMARK 918=head1 BENCHMARKS
900 919
901To give you an idea of the performance and overheads that AnyEvent adds 920To give you an idea of the performance and overheads that AnyEvent adds
902over the event loops themselves (and to give you an impression of the 921over the event loops themselves and to give you an impression of the speed
903speed of various event loops), here is a benchmark of various supported 922of various event loops I prepared some benchmarks.
904event models natively and with anyevent. The benchmark creates a lot of 923
905timers (with a zero timeout) and I/O watchers (watching STDOUT, a pty, to 924=head2 BENCHMARKING ANYEVENT OVERHEAD
925
926Here is a benchmark of various supported event models used natively and
927through AnyEvent. The benchmark creates a lot of timers (with a zero
928timeout) and I/O watchers (watching STDOUT, a pty, to become writable,
906become writable, which it is), lets them fire exactly once and destroys 929which it is), lets them fire exactly once and destroys them again.
907them again.
908 930
909Rewriting the benchmark to use many different sockets instead of using 931Source code for this benchmark is found as F<eg/bench> in the AnyEvent
910the same filehandle for all I/O watchers results in a much longer runtime 932distribution.
911(socket creation is expensive), but qualitatively the same figures, so it
912was not used.
913 933
914=head2 Explanation of the columns 934=head3 Explanation of the columns
915 935
916I<watcher> is the number of event watchers created/destroyed. Since 936I<watcher> is the number of event watchers created/destroyed. Since
917different event models feature vastly different performances, each event 937different event models feature vastly different performances, each event
918loop was given a number of watchers so that overall runtime is acceptable 938loop was given a number of watchers so that overall runtime is acceptable
919and similar between tested event loop (and keep them from crashing): Glib 939and similar between tested event loop (and keep them from crashing): Glib
935signal the end of this phase. 955signal the end of this phase.
936 956
937I<destroy> is the time, in microseconds, that it takes to destroy a single 957I<destroy> is the time, in microseconds, that it takes to destroy a single
938watcher. 958watcher.
939 959
940=head2 Results 960=head3 Results
941 961
942 name watchers bytes create invoke destroy comment 962 name watchers bytes create invoke destroy comment
943 EV/EV 400000 244 0.56 0.46 0.31 EV native interface 963 EV/EV 400000 244 0.56 0.46 0.31 EV native interface
944 EV/Any 100000 244 2.50 0.46 0.29 EV + AnyEvent watchers 964 EV/Any 100000 244 2.50 0.46 0.29 EV + AnyEvent watchers
945 CoroEV/Any 100000 244 2.49 0.44 0.29 coroutines + Coro::Signal 965 CoroEV/Any 100000 244 2.49 0.44 0.29 coroutines + Coro::Signal
946 Perl/Any 100000 513 4.92 0.87 1.12 pure perl implementation 966 Perl/Any 100000 513 4.92 0.87 1.12 pure perl implementation
947 Event/Event 16000 516 31.88 31.30 0.85 Event native interface 967 Event/Event 16000 516 31.88 31.30 0.85 Event native interface
948 Event/Any 16000 936 39.17 33.63 1.43 Event + AnyEvent watchers 968 Event/Any 16000 590 35.75 31.42 1.08 Event + AnyEvent watchers
949 Glib/Any 16000 1357 98.22 12.41 54.00 quadratic behaviour 969 Glib/Any 16000 1357 98.22 12.41 54.00 quadratic behaviour
950 Tk/Any 2000 1860 26.97 67.98 14.00 SEGV with >> 2000 watchers 970 Tk/Any 2000 1860 26.97 67.98 14.00 SEGV with >> 2000 watchers
951 POE/Event 2000 6644 108.64 736.02 14.73 via POE::Loop::Event 971 POE/Event 2000 6644 108.64 736.02 14.73 via POE::Loop::Event
952 POE/Select 2000 6343 94.13 809.12 565.96 via POE::Loop::Select 972 POE/Select 2000 6343 94.13 809.12 565.96 via POE::Loop::Select
953 973
954=head2 Discussion 974=head3 Discussion
955 975
956The benchmark does I<not> measure scalability of the event loop very 976The benchmark does I<not> measure scalability of the event loop very
957well. For example, a select-based event loop (such as the pure perl one) 977well. For example, a select-based event loop (such as the pure perl one)
958can never compete with an event loop that uses epoll when the number of 978can never compete with an event loop that uses epoll when the number of
959file descriptors grows high. In this benchmark, all events become ready at 979file descriptors grows high. In this benchmark, all events become ready at
960the same time, so select/poll-based implementations get an unnatural speed 980the same time, so select/poll-based implementations get an unnatural speed
961boost. 981boost.
982
983Also, note that the number of watchers usually has a nonlinear effect on
984overall speed, that is, creating twice as many watchers doesn't take twice
985the time - usually it takes longer. This puts event loops tested with a
986higher number of watchers at a disadvantage.
987
988To put the range of results into perspective, consider that on the
989benchmark machine, handling an event takes roughly 1600 CPU cycles with
990EV, 3100 CPU cycles with AnyEvent's pure perl loop and almost 3000000 CPU
991cycles with POE.
962 992
963C<EV> is the sole leader regarding speed and memory use, which are both 993C<EV> is the sole leader regarding speed and memory use, which are both
964maximal/minimal, respectively. Even when going through AnyEvent, it uses 994maximal/minimal, respectively. Even when going through AnyEvent, it uses
965far less memory than any other event loop and is still faster than Event 995far less memory than any other event loop and is still faster than Event
966natively. 996natively.
1001implementation. The design of the POE adaptor class in AnyEvent can not 1031implementation. The design of the POE adaptor class in AnyEvent can not
1002really account for this, as session creation overhead is small compared 1032really account for this, as session creation overhead is small compared
1003to execution of the state machine, which is coded pretty optimally within 1033to execution of the state machine, which is coded pretty optimally within
1004L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow. 1034L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow.
1005 1035
1006=head2 Summary 1036=head3 Summary
1007 1037
1008=over 4 1038=over 4
1009 1039
1010=item * Using EV through AnyEvent is faster than any other event loop 1040=item * Using EV through AnyEvent is faster than any other event loop
1011(even when used without AnyEvent), but most event loops have acceptable 1041(even when used without AnyEvent), but most event loops have acceptable
1015the actual event loop, only with extremely fast event loops such as EV 1045the actual event loop, only with extremely fast event loops such as EV
1016adds AnyEvent significant overhead. 1046adds AnyEvent significant overhead.
1017 1047
1018=item * You should avoid POE like the plague if you want performance or 1048=item * You should avoid POE like the plague if you want performance or
1019reasonable memory usage. 1049reasonable memory usage.
1050
1051=back
1052
1053=head2 BENCHMARKING THE LARGE SERVER CASE
1054
1055This benchmark actually benchmarks the event loop itself. It works by
1056creating a number of "servers": each server consists of a socketpair, a
1057timeout watcher that gets reset on activity (but never fires), and an I/O
1058watcher waiting for input on one side of the socket. Each time the socket
1059watcher reads a byte it will write that byte to a random other "server".
1060
1061The effect is that there will be a lot of I/O watchers, only part of which
1062are active at any one point (so there is a constant number of active
1063fds for each loop iteration, but which fds these are is random). The
1064timeout is reset each time something is read because that reflects how
1065most timeouts work (and puts extra pressure on the event loops).
1066
1067In this benchmark, we use 10000 socketpairs (20000 sockets), of which 100
1068(1%) are active. This mirrors the activity of large servers with many
1069connections, most of which are idle at any one point in time.
1070
1071Source code for this benchmark is found as F<eg/bench2> in the AnyEvent
1072distribution.
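The token-passing setup described above can be sketched with core Perl modules only. This is not F<eg/bench2> and does not use AnyEvent: it is a minimal illustration under stated assumptions, built on C<IO::Select>, with the per-server timeout watcher left out. The name C<run_ring> and its parameters are hypothetical, and the counts are kept far below the 10000 socketpairs of the real benchmark so that a plain select loop can cope with them.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Select;

# Hypothetical helper (not part of AnyEvent): create $servers socketpairs,
# put $active one-byte tokens in flight and forward them between random
# servers until $requests reads have been handled. Returns that count.
sub run_ring {
    my ($servers, $active, $requests) = @_;
    die "need at least two servers" if $servers < 2;

    my (@writer, %index_of);
    my $select = IO::Select->new;

    for my $i (0 .. $servers - 1) {
        socketpair(my $r, my $w, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
            or die "socketpair: $!";
        $writer[$i] = $w;                 # write end of server $i
        $index_of{ fileno($r) } = $i;     # map read end back to its server
        $select->add($r);                 # "I/O watcher" on the read end
    }

    # Inject the initial tokens so that $active bytes are always in flight.
    syswrite($writer[$_], "x") for 0 .. $active - 1;

    my $handled = 0;
    while ($handled < $requests) {
        for my $r ($select->can_read) {
            sysread($r, my $byte, 1) or die "sysread: $!";

            # Forward the byte to a random *other* server.
            my $i = $index_of{ fileno($r) };
            my $j = int rand($servers - 1);
            $j++ if $j >= $i;
            syswrite($writer[$j], $byte);

            last if ++$handled >= $requests;
        }
    }
    return $handled;
}

# Small-server proportions: eight servers, three tokens in flight.
print run_ring(8, 3, 1000), "\n";
```

Because every token that is read is immediately written somewhere else, the number of readable sockets per iteration stays bounded by the number of tokens, which is the property the benchmark relies on.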
1073
1074=head3 Explanation of the columns
1075
1076I<sockets> is the number of sockets, and twice the number of "servers" (as
1077each server has a read and write socket end).
1078
1079I<create> is the time it takes to create a socketpair (which is
1080nontrivial) and two watchers: an I/O watcher and a timeout watcher.
1081
1082I<request>, the most important value, is the time it takes to handle a
1083single "request", that is, reading the token from the pipe and forwarding
1084it to another server. This includes deleting the old timeout and creating
1085a new one that moves the timeout into the future.
1086
1087=head3 Results
1088
1089 name   sockets  create   request
1090 EV       20000   69.01     11.16
1091 Perl     20000   73.32     35.87
1092 Event    20000  212.62    257.32
1093 Glib     20000  651.16   1896.30
1094 POE      20000  349.67  12317.24  uses POE::Loop::Event
1095
1096=head3 Discussion
1097
1098This benchmark I<does> measure scalability and overall performance of the
1099particular event loop.
1100
1101EV is again fastest. Since it is using epoll on my system, the setup time
1102is relatively high, though.
1103
1104Perl surprisingly comes second. It is much faster than the C-based event
1105loops Event and Glib.
1106
1107Event suffers from high setup time as well (look at its code and you will
1108understand why). Callback invocation also has a high overhead compared to
1109the C<< $_->() for .. >>-style loop that the Perl event loop uses. Event
1110uses select or poll in basically all documented configurations.
1111
1112Glib is hit hard by its quadratic behaviour w.r.t. many watchers. It
1113clearly fails to perform with many filehandles or in busy servers.
1114
1115POE is still completely out of the picture, taking over 1000 times as long
1116as EV, and over 100 times as long as the Perl implementation, even though
1117it uses a C-based event loop in this case.
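The "over 1000 times" and "over 100 times" figures can be checked against the I<request> column of the large-server results table. The snippet below is only a worked calculation on those published numbers; the helper name C<slowdown> is hypothetical.

```perl
use strict;
use warnings;

# Request times copied from the large-server results table.
my %request = (
    EV    => 11.16,
    Perl  => 35.87,
    Event => 257.32,
    Glib  => 1896.30,
    POE   => 12317.24,
);

# Hypothetical helper: how many times slower $loop is than $baseline.
sub slowdown {
    my ($loop, $baseline) = @_;
    return $request{$loop} / $request{$baseline};
}

# Print every loop relative to EV and to the pure perl loop, fastest first.
for my $loop (sort { $request{$a} <=> $request{$b} } keys %request) {
    printf "%-6s %8.1fx EV %8.1fx Perl\n",
        $loop, slowdown($loop, 'EV'), slowdown($loop, 'Perl');
}
```

With these inputs the POE row works out to roughly 1100 times the EV request time and roughly 340 times the pure perl one.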
1118
1119=head3 Summary
1120
1121=over 4
1122
1123=item * The pure perl implementation performs extremely well, considering
1124that it uses select.
1125
1126=item * Avoid Glib or POE in large projects where performance matters.
1127
1128=back
1129
1130=head2 BENCHMARKING SMALL SERVERS
1131
1132While event loops should scale (and select-based ones do not...) even to
1133large servers, most programs we (or I :) actually write have only a few
1134I/O watchers.
1135
1136In this benchmark, I use the same benchmark program as in the large server
1137case, but it uses only eight "servers", of which three are active at any
1138one time. This should reflect performance for a small server relatively
1139well.
1140
1141The columns are identical to the previous table.
1142
1143=head3 Results
1144
1145 name   sockets  create   request
1146 EV          16   20.00      6.54
1147 Perl        16   25.75     12.62
1148 Event       16   81.27     35.86
1149 Glib        16   32.63     15.48
1150 POE         16  261.87    276.28  uses POE::Loop::Event
1151
1152=head3 Discussion
1153
1154The benchmark tries to test the performance of a typical small
1155server. While knowing how various event loops perform is interesting, keep
1156in mind that their overhead in this case is usually not as important, due
1157to the small absolute number of watchers (that is, you need efficiency and
1158speed most when you have lots of watchers, not when you only have a few of
1159them).
1160
1161EV is again fastest.
1162
1163The C-based event loops Event and Glib come in second this time, as the
1164overhead of running an iteration is much smaller in C than in Perl (little
1165code to execute in the inner loop, and perl's function calling overhead is
1166high, and updating all the data structures is costly).
1167
1168The pure perl event loop is much slower, but still competitive.
1169
1170POE also performs much better in this case, but is still far behind the
1171others.
1172
1173=head3 Summary
1174
1175=over 4
1176
1177=item * C-based event loops perform very well with a small number of
1178watchers, as the management overhead dominates.
1020 1179
1021=back 1180=back
1022 1181
1023 1182
1024=head1 FORK 1183=head1 FORK
