/cvs/AnyEvent/lib/AnyEvent.pm

Comparing AnyEvent/lib/AnyEvent.pm (file contents):
Revision 1.83 by root, Fri Apr 25 13:39:08 2008 UTC vs.
Revision 1.95 by root, Sat Apr 26 11:06:45 2008 UTC

=head2 I/O WATCHERS

You can create an I/O watcher by calling the C<< AnyEvent->io >> method
with the following mandatory key-value pairs as arguments:

C<fh> is the Perl I<file handle> (I<not> file descriptor) to watch
for events. C<poll> must be a string that is either C<r> or C<w>,
which creates a watcher waiting for "r"eadable or "w"ritable events,
respectively. C<cb> is the callback to invoke each time the file handle
becomes ready.

Although the callback might get passed parameters, their value and
presence are undefined and you cannot rely on them. Portable AnyEvent
callbacks cannot use arguments passed to I/O watcher callbacks.

The I/O watcher might use the underlying file descriptor or a copy of it.
You must not close a file handle as long as any watcher is active on the
underlying file descriptor.

Some event loops issue spurious readiness notifications, so you should
always use non-blocking calls when reading/writing from/to your file
handles.
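
A minimal sketch of an I/O watcher (an illustration, not code from the
distribution; the handle and the callback body are arbitrary choices):

   use AnyEvent;

   # watch STDIN for readability; keep the watcher object alive
   my $io_watcher = AnyEvent->io (
      fh   => \*STDIN,   # a Perl file handle, not a raw descriptor
      poll => "r",       # "r"eadable (use "w" for writable)
      cb   => sub {
         # do not rely on any callback arguments here
         chomp (my $line = <STDIN>);
         warn "read: $line\n";
      },
   );

The watcher stays active only as long as C<$io_watcher> (an arbitrary
variable name) is kept alive; undefing it cancels the watcher.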

You can create a time watcher by calling the C<< AnyEvent->timer >>
method with the following mandatory arguments:

C<after> specifies after how many seconds (fractional values are
supported) the callback should be invoked. C<cb> is the callback to invoke
in that case.

Although the callback might get passed parameters, their value and
presence are undefined and you cannot rely on them. Portable AnyEvent
callbacks cannot use arguments passed to time watcher callbacks.

The timer callback will be invoked at most once: if you want a repeating
timer you have to create a new watcher (this is a limitation of both Tk
and Glib).
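
A minimal sketch of a one-shot timer (an illustration, not code from the
distribution; the delay and the message are arbitrary):

   use AnyEvent;

   my $timer = AnyEvent->timer (
      after => 7.7,   # seconds, fractional values are supported
      cb    => sub {
         warn "timeout reached\n";
         # to repeat, create a new timer watcher here
      },
   );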

You can watch for signals using a signal watcher, C<signal> is the signal
I<name> without any C<SIG> prefix, C<cb> is the Perl callback to
be invoked whenever a signal occurs.

Although the callback might get passed parameters, their value and
presence are undefined and you cannot rely on them. Portable AnyEvent
callbacks cannot use arguments passed to signal watcher callbacks.

Multiple signal occurrences can be clumped together into one callback
invocation, and callback invocation will be synchronous. Synchronous means
that it might take a while until the signal gets handled by the process,
but it is guaranteed not to interrupt any other callbacks.
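
A minimal sketch of a signal watcher (an illustration, not code from the
distribution; the chosen signal and the message are arbitrary):

   use AnyEvent;

   my $sig_watcher = AnyEvent->signal (
      signal => "INT",   # signal name without the SIG prefix
      cb     => sub { warn "SIGINT received\n" },
   );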

The child process is specified by the C<pid> argument (if set to C<0>, it
watches for any child process exit). The watcher will trigger as often
as status changes for the child are received. This works by installing a
signal handler for C<SIGCHLD>. The callback will be called with the pid
and exit status (as returned by waitpid), so unlike other watcher types,
you I<can> rely on child watcher callback arguments.

There is a slight catch to child watchers, however: you usually start them
I<after> the child process was created, and this means the process could
have exited already (and no SIGCHLD will be sent anymore).
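
A minimal sketch of a child watcher (an illustration, not code from the
distribution; the fork and the condition variable are just scaffolding):

   use AnyEvent;

   my $pid = fork;
   defined $pid or die "fork failed: $!";
   exit 0 unless $pid;        # the child exits immediately

   my $done = AnyEvent->condvar;

   my $child_watcher = AnyEvent->child (
      pid => $pid,
      cb  => sub {
         my ($pid, $status) = @_;   # child watchers do pass reliable arguments
         warn "child $pid exited with status $status\n";
         $done->broadcast;
      },
   );

   $done->wait;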

   });

   $quit->wait;

=head1 BENCHMARKS

To give you an idea of the performance and overheads that AnyEvent adds
over the event loops themselves, and to give you an impression of the
speed of various event loops, I prepared some benchmarks.

=head2 BENCHMARKING ANYEVENT OVERHEAD

Here is a benchmark of various supported event models used natively and
through AnyEvent. The benchmark creates a lot of timers (with a zero
timeout) and I/O watchers (watching STDOUT, a pty, to become writable,
which it is), lets them fire exactly once and destroys them again.

Source code for this benchmark is found as F<eg/bench> in the AnyEvent
distribution.
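
The core of the overhead benchmark can be sketched roughly as follows (an
illustration only, not the actual F<eg/bench> code; the watcher count and
the bookkeeping are simplified):

   use AnyEvent;

   my $count   = 10_000;              # far fewer than the real benchmark uses
   my $pending = $count;
   my $done    = AnyEvent->condvar;
   my @watchers;

   # create phase: zero-timeout timers plus I/O watchers on STDOUT
   for (1 .. $count) {
      push @watchers, AnyEvent->timer (after => 0, cb => sub {
         $done->broadcast unless --$pending;   # all timers fired once
      });
      push @watchers, AnyEvent->io (fh => \*STDOUT, poll => "w", cb => sub { });
   }

   $done->wait;      # invoke phase: let every timer fire exactly once
   @watchers = ();   # destroy phase: drop all watchers again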

=head3 Explanation of the columns

I<watcher> is the number of event watchers created/destroyed. Since
different event models feature vastly different performances, each event
loop was given a number of watchers so that overall runtime is acceptable
and similar between the tested event loops (and to keep them from
crashing): Glib

signal the end of this phase.

I<destroy> is the time, in microseconds, that it takes to destroy a single
watcher.

=head3 Results

 name       watchers bytes create invoke destroy comment
 EV/EV        400000   244   0.56   0.46    0.31 EV native interface
 EV/Any       100000   244   2.50   0.46    0.29 EV + AnyEvent watchers
 CoroEV/Any   100000   244   2.49   0.44    0.29 coroutines + Coro::Signal
 Glib/Any      16000  1357  98.22  12.41   54.00 quadratic behaviour
 Tk/Any         2000  1860  26.97  67.98   14.00 SEGV with >> 2000 watchers
 POE/Event      2000  6644 108.64 736.02   14.73 via POE::Loop::Event
 POE/Select     2000  6343  94.13 809.12  565.96 via POE::Loop::Select

=head3 Discussion

The benchmark does I<not> measure scalability of the event loop very
well. For example, a select-based event loop (such as the pure perl one)
can never compete with an event loop that uses epoll when the number of
file descriptors grows high. In this benchmark, all events become ready at
the same time, so select/poll-based implementations get an unnatural speed
boost.

Also, note that the number of watchers usually has a nonlinear effect on
overall speed, that is, creating twice as many watchers doesn't take twice
the time - usually it takes longer. This puts event loops tested with a
higher number of watchers at a disadvantage.

C<EV> is the sole leader regarding speed and memory use, which are both
maximal/minimal, respectively. Even when going through AnyEvent, it uses
far less memory than any other event loop and is still faster than Event
natively.

The pure perl implementation is hit in a few sweet spots (both the
constant timeout and the use of a single fd hit optimisations in the perl
interpreter and the backend itself). Nevertheless, this shows that it
adds very little overhead in itself. Like any select-based backend its
performance becomes really bad with lots of file descriptors (and few of
them active), of course, but this was not the subject of this benchmark.

The C<Event> module has a relatively high setup and callback invocation
cost, but overall comes in third place.

C<Glib>'s memory usage is quite a bit higher, but it features a
faster callback invocation and overall ends up in the same class as
C<Event>. However, Glib scales extremely badly: doubling the number of
watchers increases the processing time by more than a factor of four,
making it completely unusable when using larger numbers of watchers
(note that only a single file descriptor was used in the benchmark, so

The C<Tk> adaptor works relatively well. The fact that it crashes with
more than 2000 watchers is a big setback, however, as correctness takes
precedence over speed. Nevertheless, its performance is surprising, as the
file descriptor is dup()ed for each watcher. This shows that the dup()
employed by some adaptors is not a big performance issue (it does incur a
hidden memory cost inside the kernel which is not reflected in the figures
above).

C<POE>, regardless of underlying event loop (whether using its pure
perl select-based backend or the Event module, the POE-EV backend
couldn't be tested because it wasn't working) shows abysmal performance
and memory usage: Watchers use almost 30 times as much memory as
EV watchers, and 10 times as much memory as Event (the high memory
requirements are caused by requiring a session for each watcher). Watcher
invocation speed is almost 900 times slower than with AnyEvent's pure perl
implementation. The design of the POE adaptor class in AnyEvent cannot
really account for this, as session creation overhead is small compared
to execution of the state machine, which is coded pretty optimally within
L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow.

=head3 Summary

=over 4

=item * Using EV through AnyEvent is faster than any other event loop
(even when used without AnyEvent), but most event loops have acceptable
performance with or without AnyEvent.

=item * The overhead AnyEvent adds is usually much smaller than the
overhead of the actual event loop; only with extremely fast event loops
such as EV does AnyEvent add significant overhead.

=item * You should avoid POE like the plague if you want performance or
reasonable memory usage.

=back

=head2 BENCHMARKING THE LARGE SERVER CASE

This benchmark actually benchmarks the event loop itself. It works by
creating a number of "servers": each server consists of a socketpair, a
timeout watcher that gets reset on activity (but never fires), and an I/O
watcher waiting for input on one side of the socket. Each time the socket
watcher reads a byte it will write that byte to a random other "server".

The effect is that there will be a lot of I/O watchers, only part of which
are active at any one point (so there is a constant number of active
fds for each loop iteration, but which fds these are is random). The
timeout is reset each time something is read, because that reflects how
most timeouts work (and puts extra pressure on the event loops).

In this benchmark, we use 10000 socketpairs (20000 sockets), of which 100
(1%) are active. This mirrors the activity of large servers with many
connections, most of which are idle at any one point in time.

Source code for this benchmark is found as F<eg/bench2> in the AnyEvent
distribution.
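
One such "server" could be sketched roughly like this (an illustration
only, not the actual F<eg/bench2> code; C<make_server> and the forwarding
callback are made-up names):

   use AnyEvent;
   use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

   sub make_server {
      my ($forward_cb) = @_;   # gets the byte read, writes it to another server

      socketpair (my $read_end, my $write_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
         or die "socketpair: $!";

      my %server = (write_end => $write_end);

      # timeout watcher: reset on every read, never expected to fire
      my $reset_timeout = sub {
         $server{timeout} = AnyEvent->timer (after => 60, cb => sub { });
      };
      $reset_timeout->();

      # I/O watcher: read one byte, reset the timeout, forward the byte
      $server{io} = AnyEvent->io (fh => $read_end, poll => "r", cb => sub {
         sysread $read_end, my $byte, 1 or return;
         $reset_timeout->();
         $forward_cb->($byte);
      });

      \%server;   # keep this alive to keep the watchers alive
   }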

=head3 Explanation of the columns

I<sockets> is the number of sockets, and twice the number of "servers" (as
each server has a read and write socket end).

I<create> is the time it takes to create a socketpair (which is
nontrivial) and two watchers: an I/O watcher and a timeout watcher.

I<request>, the most important value, is the time it takes to handle a
single "request", that is, reading the token from the pipe and forwarding
it to another server. This includes deleting the old timeout and creating
a new one that moves the timeout into the future.

=head3 Results

 name  sockets   create    request
 EV      20000    69.01      11.16
 Perl    20000    75.28     112.76
 Event   20000   212.62     257.32
 Glib    20000   651.16    1896.30
 POE     20000   349.67   12317.24 uses POE::Loop::Event

=head3 Discussion

This benchmark I<does> measure scalability and overall performance of the
particular event loop.

EV is again fastest. Since it is using epoll on my system, the setup time
is relatively high, though.

Perl surprisingly comes second. It is much faster than the C-based event
loops Event and Glib.

Event suffers from high setup time as well (look at its code and you will
understand why). Callback invocation also has a high overhead compared to
the C<< $_->() for .. >>-style loop that the Perl event loop uses. Event
uses select or poll in basically all documented configurations.

Glib is hit hard by its quadratic behaviour w.r.t. many watchers. It
clearly fails to perform with many filehandles or in busy servers.

POE is still completely out of the picture, taking over 1000 times as long
as EV, and over 100 times as long as the Perl implementation, even though
it uses a C-based event loop in this case.

=head3 Summary

=over 4

=item * The pure perl implementation performs extremely well, considering
that it uses select.

=item * Avoid Glib or POE in large projects where performance matters.

=back

=head2 BENCHMARKING SMALL SERVERS

While event loops should scale (and select-based ones do not...) even to
large servers, most programs we (or I :) actually write have only a few
I/O watchers.

In this benchmark, I use the same benchmark program as in the large server
case, but it uses only eight "servers", of which three are active at any
one time. This should reflect performance for a small server relatively
well.

The columns are identical to the previous table.

=head3 Results

 name  sockets   create   request
 EV         16    20.00      6.54
 Event      16    81.27     35.86
 Glib       16    32.63     15.48
 Perl       16    24.62    162.37
 POE        16   261.87    276.28 uses POE::Loop::Event

=head3 Discussion

The benchmark tries to test the performance of a typical small
server. While knowing how various event loops perform is interesting, keep
in mind that their overhead in this case is usually not as important, due
to the small absolute number of watchers.

EV is again fastest.

The C-based event loops Event and Glib come in second this time, as the
overhead of running an iteration is much smaller in C than in Perl (little
code to execute in the inner loop, and perl's function calling overhead is
high, and updating all the data structures is costly).

The pure perl event loop is much slower, but still competitive.

POE also performs much better in this case, but it is still far behind the
others.

=head3 Summary

=over 4

=item * C-based event loops perform very well with a small number of
watchers, as the management overhead dominates.

=back


=head1 FORK

Most event libraries are not fork-safe. The ones who are usually are
