/cvs/AnyEvent/lib/AnyEvent.pm

Comparing AnyEvent/lib/AnyEvent.pm (file contents):
Revision 1.83 by root, Fri Apr 25 13:39:08 2008 UTC vs.
Revision 1.95 by root, Sat Apr 26 11:06:45 2008 UTC

=head2 I/O WATCHERS

You can create an I/O watcher by calling the C<< AnyEvent->io >> method
with the following mandatory key-value pairs as arguments:

C<fh> is the Perl I<file handle> (I<not> file descriptor) to watch
for events. C<poll> must be a string that is either C<r> or C<w>,
which creates a watcher waiting for "r"eadable or "w"ritable events,
respectively. C<cb> is the callback to invoke each time the file handle
becomes ready.

Although the callback might get passed parameters, their value and
presence are undefined and you cannot rely on them. Portable AnyEvent
callbacks cannot use arguments passed to I/O watcher callbacks.

The I/O watcher might use the underlying file descriptor or a copy of it.
You must not close a file handle as long as any watcher is active on the
underlying file descriptor.

Some event loops issue spurious readiness notifications, so you should
always use non-blocking calls when reading/writing from/to your file
handles.
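
A minimal sketch of an I/O watcher (an illustration, not code from the
distribution; the handle and the callback body are arbitrary choices):

   use AnyEvent;

   # watch STDIN for readability; keep the watcher object alive
   my $io_watcher = AnyEvent->io (
      fh   => \*STDIN,   # a Perl file handle, not a raw descriptor
      poll => "r",       # "r"eadable (use "w" for writable)
      cb   => sub {
         # do not rely on any callback arguments here
         chomp (my $line = <STDIN>);
         warn "read: $line\n";
      },
   );

The watcher stays active only as long as C<$io_watcher> (an arbitrary
variable name) is kept alive; undefing it cancels the watcher.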

You can create a time watcher by calling the C<< AnyEvent->timer >>
method with the following mandatory arguments:

C<after> specifies after how many seconds (fractional values are
supported) the callback should be invoked. C<cb> is the callback to invoke
in that case.

Although the callback might get passed parameters, their value and
presence are undefined and you cannot rely on them. Portable AnyEvent
callbacks cannot use arguments passed to time watcher callbacks.

The timer callback will be invoked at most once: if you want a repeating
timer you have to create a new watcher (this is a limitation of both Tk
and Glib).
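
A minimal sketch of a one-shot timer (an illustration, not code from the
distribution; the delay and the message are arbitrary):

   use AnyEvent;

   my $timer = AnyEvent->timer (
      after => 7.7,   # seconds, fractional values are supported
      cb    => sub {
         warn "timeout reached\n";
         # to repeat, create a new timer watcher here
      },
   );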

You can watch for signals using a signal watcher, C<signal> is the signal
I<name> without any C<SIG> prefix, C<cb> is the Perl callback to
be invoked whenever a signal occurs.

Although the callback might get passed parameters, their value and
presence are undefined and you cannot rely on them. Portable AnyEvent
callbacks cannot use arguments passed to signal watcher callbacks.

Multiple signal occurrences can be clumped together into one callback
invocation, and callback invocation will be synchronous. Synchronous means
that it might take a while until the signal gets handled by the process,
but it is guaranteed not to interrupt any other callbacks.
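
A minimal sketch of a signal watcher (an illustration, not code from the
distribution; the chosen signal and the message are arbitrary):

   use AnyEvent;

   my $sig_watcher = AnyEvent->signal (
      signal => "INT",   # signal name without the SIG prefix
      cb     => sub { warn "SIGINT received\n" },
   );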

The child process is specified by the C<pid> argument (if set to C<0>, it
watches for any child process exit). The watcher will trigger as often
as status changes for the child are received. This works by installing a
signal handler for C<SIGCHLD>. The callback will be called with the pid
and exit status (as returned by waitpid), so unlike other watcher types,
you I<can> rely on child watcher callback arguments.

There is a slight catch to child watchers, however: you usually start them
I<after> the child process was created, and this means the process could
have exited already (and no SIGCHLD will be sent anymore).
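
A minimal sketch of a child watcher (an illustration, not code from the
distribution; the fork and the condition variable are just scaffolding):

   use AnyEvent;

   my $pid = fork;
   defined $pid or die "fork failed: $!";
   exit 0 unless $pid;        # the child exits immediately

   my $done = AnyEvent->condvar;

   my $child_watcher = AnyEvent->child (
      pid => $pid,
      cb  => sub {
         my ($pid, $status) = @_;   # child watchers do pass reliable arguments
         warn "child $pid exited with status $status\n";
         $done->broadcast;
      },
   );

   $done->wait;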

   });

   $quit->wait;

=head1 BENCHMARKS

To give you an idea of the performance and overheads that AnyEvent adds
over the event loops themselves, and to give you an impression of the
speed of various event loops, I prepared some benchmarks.

=head2 BENCHMARKING ANYEVENT OVERHEAD

Here is a benchmark of various supported event models used natively and
through AnyEvent. The benchmark creates a lot of timers (with a zero
timeout) and I/O watchers (watching STDOUT, a pty, to become writable,
which it is), lets them fire exactly once and destroys them again.

Source code for this benchmark is found as F<eg/bench> in the AnyEvent
distribution.
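
The core of the overhead benchmark can be sketched roughly as follows (an
illustration only, not the actual F<eg/bench> code; the watcher count and
the bookkeeping are simplified):

   use AnyEvent;

   my $count   = 10_000;              # far fewer than the real benchmark uses
   my $pending = $count;
   my $done    = AnyEvent->condvar;
   my @watchers;

   # create phase: zero-timeout timers plus I/O watchers on STDOUT
   for (1 .. $count) {
      push @watchers, AnyEvent->timer (after => 0, cb => sub {
         $done->broadcast unless --$pending;   # all timers fired once
      });
      push @watchers, AnyEvent->io (fh => \*STDOUT, poll => "w", cb => sub { });
   }

   $done->wait;      # invoke phase: let every timer fire exactly once
   @watchers = ();   # destroy phase: drop all watchers again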

=head3 Explanation of the columns

I<watcher> is the number of event watchers created/destroyed. Since
different event models feature vastly different performances, each event
loop was given a number of watchers so that overall runtime is acceptable
and similar between the tested event loops (and to keep them from
crashing): Glib

signal the end of this phase.

I<destroy> is the time, in microseconds, that it takes to destroy a single
watcher.

=head3 Results

 name       watchers bytes create invoke destroy comment
 EV/EV        400000   244   0.56   0.46    0.31 EV native interface
 EV/Any       100000   244   2.50   0.46    0.29 EV + AnyEvent watchers
 CoroEV/Any   100000   244   2.49   0.44    0.29 coroutines + Coro::Signal
 Glib/Any      16000  1357  98.22  12.41   54.00 quadratic behaviour
 Tk/Any         2000  1860  26.97  67.98   14.00 SEGV with >> 2000 watchers
 POE/Event      2000  6644 108.64 736.02   14.73 via POE::Loop::Event
 POE/Select     2000  6343  94.13 809.12  565.96 via POE::Loop::Select

=head3 Discussion

The benchmark does I<not> measure scalability of the event loop very
well. For example, a select-based event loop (such as the pure perl one)
can never compete with an event loop that uses epoll when the number of
file descriptors grows high. In this benchmark, all events become ready at
the same time, so select/poll-based implementations get an unnatural speed
boost.

Also, note that the number of watchers usually has a nonlinear effect on
overall speed, that is, creating twice as many watchers doesn't take twice
the time - usually it takes longer. This puts event loops tested with a
higher number of watchers at a disadvantage.

C<EV> is the sole leader regarding speed and memory use, which are both
maximal/minimal, respectively. Even when going through AnyEvent, it uses
far less memory than any other event loop and is still faster than Event
natively.

The pure perl implementation is hit in a few sweet spots (both the
constant timeout and the use of a single fd hit optimisations in the perl
interpreter and the backend itself). Nevertheless, this shows that it
adds very little overhead in itself. Like any select-based backend its
performance becomes really bad with lots of file descriptors (and few of
them active), of course, but this was not the subject of this benchmark.

The C<Event> module has a relatively high setup and callback invocation
cost, but overall comes in third place.

C<Glib>'s memory usage is quite a bit higher, but it features a
faster callback invocation and overall ends up in the same class as
C<Event>. However, Glib scales extremely badly: doubling the number of
watchers increases the processing time by more than a factor of four,
making it completely unusable when using larger numbers of watchers
(note that only a single file descriptor was used in the benchmark, so

The C<Tk> adaptor works relatively well. The fact that it crashes with
more than 2000 watchers is a big setback, however, as correctness takes
precedence over speed. Nevertheless, its performance is surprising, as the
file descriptor is dup()ed for each watcher. This shows that the dup()
employed by some adaptors is not a big performance issue (it does incur a
hidden memory cost inside the kernel which is not reflected in the figures
above).

C<POE>, regardless of underlying event loop (whether using its pure
perl select-based backend or the Event module, the POE-EV backend
couldn't be tested because it wasn't working) shows abysmal performance
and memory usage: Watchers use almost 30 times as much memory as
EV watchers, and 10 times as much memory as Event (the high memory
requirements are caused by requiring a session for each watcher). Watcher
invocation speed is almost 900 times slower than with AnyEvent's pure perl
implementation. The design of the POE adaptor class in AnyEvent cannot
really account for this, as session creation overhead is small compared
to execution of the state machine, which is coded pretty optimally within
L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow.

=head3 Summary

=over 4

=item * Using EV through AnyEvent is faster than any other event loop
(even when used without AnyEvent), but most event loops have acceptable
performance with or without AnyEvent.

=item * The overhead AnyEvent adds is usually much smaller than the
overhead of the actual event loop; only with extremely fast event loops
such as EV does AnyEvent add significant overhead.

=item * You should avoid POE like the plague if you want performance or
reasonable memory usage.

=back

=head2 BENCHMARKING THE LARGE SERVER CASE

This benchmark actually benchmarks the event loop itself. It works by
creating a number of "servers": each server consists of a socketpair, a
timeout watcher that gets reset on activity (but never fires), and an I/O
watcher waiting for input on one side of the socket. Each time the socket
watcher reads a byte it will write that byte to a random other "server".

The effect is that there will be a lot of I/O watchers, only part of which
are active at any one point (so there is a constant number of active
fds for each loop iteration, but which fds these are is random). The
timeout is reset each time something is read, because that reflects how
most timeouts work (and puts extra pressure on the event loops).

In this benchmark, we use 10000 socketpairs (20000 sockets), of which 100
(1%) are active. This mirrors the activity of large servers with many
connections, most of which are idle at any one point in time.

Source code for this benchmark is found as F<eg/bench2> in the AnyEvent
distribution.
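
One such "server" could be sketched roughly like this (an illustration
only, not the actual F<eg/bench2> code; C<make_server> and the forwarding
callback are made-up names):

   use AnyEvent;
   use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

   sub make_server {
      my ($forward_cb) = @_;   # gets the byte read, writes it to another server

      socketpair (my $read_end, my $write_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
         or die "socketpair: $!";

      my %server = (write_end => $write_end);

      # timeout watcher: reset on every read, never expected to fire
      my $reset_timeout = sub {
         $server{timeout} = AnyEvent->timer (after => 60, cb => sub { });
      };
      $reset_timeout->();

      # I/O watcher: read one byte, reset the timeout, forward the byte
      $server{io} = AnyEvent->io (fh => $read_end, poll => "r", cb => sub {
         sysread $read_end, my $byte, 1 or return;
         $reset_timeout->();
         $forward_cb->($byte);
      });

      \%server;   # keep this alive to keep the watchers alive
   }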

=head3 Explanation of the columns

I<sockets> is the number of sockets, and twice the number of "servers" (as
each server has a read and write socket end).

I<create> is the time it takes to create a socketpair (which is
nontrivial) and two watchers: an I/O watcher and a timeout watcher.

I<request>, the most important value, is the time it takes to handle a
single "request", that is, reading the token from the pipe and forwarding
it to another server. This includes deleting the old timeout and creating
a new one that moves the timeout into the future.

=head3 Results

 name  sockets   create    request
 EV      20000    69.01      11.16
 Perl    20000    75.28     112.76
 Event   20000   212.62     257.32
 Glib    20000   651.16    1896.30
 POE     20000   349.67   12317.24 uses POE::Loop::Event

=head3 Discussion

This benchmark I<does> measure scalability and overall performance of the
particular event loop.

EV is again fastest. Since it is using epoll on my system, the setup time
is relatively high, though.

Perl surprisingly comes second. It is much faster than the C-based event
loops Event and Glib.

Event suffers from high setup time as well (look at its code and you will
understand why). Callback invocation also has a high overhead compared to
the C<< $_->() for .. >>-style loop that the Perl event loop uses. Event
uses select or poll in basically all documented configurations.

Glib is hit hard by its quadratic behaviour w.r.t. many watchers. It
clearly fails to perform with many filehandles or in busy servers.

POE is still completely out of the picture, taking over 1000 times as long
as EV, and over 100 times as long as the Perl implementation, even though
it uses a C-based event loop in this case.

=head3 Summary

=over 4

=item * The pure perl implementation performs extremely well, considering
that it uses select.

=item * Avoid Glib or POE in large projects where performance matters.

=back

=head2 BENCHMARKING SMALL SERVERS

While event loops should scale (and select-based ones do not...) even to
large servers, most programs we (or I :) actually write have only a few
I/O watchers.

In this benchmark, I use the same benchmark program as in the large server
case, but it uses only eight "servers", of which three are active at any
one time. This should reflect performance for a small server relatively
well.

The columns are identical to the previous table.

=head3 Results

 name  sockets   create   request
 EV         16    20.00      6.54
 Event      16    81.27     35.86
 Glib       16    32.63     15.48
 Perl       16    24.62    162.37
 POE        16   261.87    276.28 uses POE::Loop::Event

=head3 Discussion

The benchmark tries to test the performance of a typical small
server. While knowing how various event loops perform is interesting, keep
in mind that their overhead in this case is usually not as important, due
to the small absolute number of watchers.

EV is again fastest.

The C-based event loops Event and Glib come in second this time, as the
overhead of running an iteration is much smaller in C than in Perl (little
code to execute in the inner loop, and perl's function calling overhead is
high, and updating all the data structures is costly).

The pure perl event loop is much slower, but still competitive.

POE also performs much better in this case, but it is still far behind the
others.

=head3 Summary

=over 4

=item * C-based event loops perform very well with a small number of
watchers, as the management overhead dominates.

=back


=head1 FORK

Most event libraries are not fork-safe. The ones who are usually are
