
Comparing AnyEvent/lib/AnyEvent.pm (file contents):
Revision 1.90 by root, Fri Apr 25 14:24:29 2008 UTC vs.
Revision 1.103 by root, Tue Apr 29 07:15:49 2008 UTC

technically possible.

Of course, if you want lots of policy (this can arguably be somewhat
useful) and you want to force your users to use the one and only event
model, you should I<not> use this module.

=head1 DESCRIPTION

L<AnyEvent> provides an identical interface to multiple event loops. This
allows module authors to utilise an event loop without forcing module
might choose the wrong one unless you load the correct one yourself.

You can choose to use a rather inefficient pure-perl implementation by
loading the C<AnyEvent::Impl::Perl> module, which gives you similar
behaviour everywhere, but letting AnyEvent choose is generally better.

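For example, a program that insists on the predictable pure-perl behaviour
could load that backend itself. This is only a sketch of the idea, assuming
the backend module is loaded before the first watcher is created:

   use AnyEvent::Impl::Perl;   # pre-load the pure-perl backend
   use AnyEvent;               # AnyEvent will then pick the loaded backend

   # all watchers created from here on use AnyEvent::Impl::Perl
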
=head1 OTHER MODULES

The following is a non-exhaustive list of additional modules that use
AnyEvent and can therefore be mixed easily with other AnyEvent modules in
the same program (a small sketch of such mixing follows the list). Some of
the modules come with AnyEvent, some are available via CPAN.

=over 4

=item L<AnyEvent::Util>

Contains various utility functions that replace often-used but blocking
functions such as C<inet_aton> by event-/callback-based versions.

=item L<AnyEvent::Handle>

Provides read and write buffers and manages watchers for reads and writes.

=item L<AnyEvent::Socket>

Provides a means to do non-blocking connects, accepts etc.

=item L<AnyEvent::HTTPD>

Provides a simple web application server framework.

=item L<AnyEvent::DNS>

Provides asynchronous DNS resolver capabilities, beyond what
L<AnyEvent::Util> offers.

=item L<AnyEvent::FastPing>

The fastest ping in the west.

=item L<Net::IRC3>

AnyEvent-based IRC client module family.

=item L<Net::XMPP2>

AnyEvent-based XMPP (Jabber protocol) module family.

=item L<Net::FCP>

AnyEvent-based implementation of the Freenet Client Protocol, birthplace
of AnyEvent.

=item L<Event::ExecFlow>

High level API for event-based execution flow control.

=item L<Coro>

Has special support for AnyEvent.

=item L<IO::Lambda>

The lambda approach to I/O - don't ask, look there. Can use AnyEvent.

=item L<IO::AIO>

Truly asynchronous I/O; it should be in the toolbox of every event
programmer. Can be trivially made to use AnyEvent.

=item L<BDB>

Truly asynchronous Berkeley DB access. Can be trivially made to use
AnyEvent.

=back
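
All of these modules ultimately create ordinary AnyEvent watchers, so they
share whatever event loop the program ends up with and can coexist freely.
A minimal sketch of such mixing, using only the core watcher API (the
callbacks are placeholders):

   use AnyEvent;

   my $quit = AnyEvent->condvar;

   # one part of the program watches a filehandle ...
   my $io = AnyEvent->io (fh => \*STDIN, poll => 'r', cb => sub {
      sysread STDIN, my $buf, 8192;
      print "read: $buf";
   });

   # ... while another part runs a periodic task, re-arming its one-shot timer
   my $timer; my $tick; $tick = sub {
      print "tick\n";
      $timer = AnyEvent->timer (after => 5, cb => $tick);
   };
   $tick->();

   # both watchers are serviced by the same event loop;
   # this blocks until $quit->broadcast is called (never, in this sketch)
   $quit->wait;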

=cut

package AnyEvent;

   });

   $quit->wait;


=head1 BENCHMARKS

To give you an idea of the performance and overheads that AnyEvent adds
over the event loops themselves and to give you an impression of the speed
of various event loops, I prepared some benchmarks.

=head2 BENCHMARKING ANYEVENT OVERHEAD

Here is a benchmark of various supported event models used natively and
through AnyEvent. The benchmark creates a lot of timers (with a zero
timeout) and I/O watchers (watching STDOUT, a pty, to become writable,
which it is), lets them fire exactly once and destroys them again.

Source code for this benchmark is found as F<eg/bench> in the AnyEvent
distribution.

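The per-watcher pattern being measured looks roughly like the following
sketch (this is not the F<eg/bench> code itself, just the kind of watchers
it creates, fires once and then destroys):

   use AnyEvent;

   my $done    = AnyEvent->condvar;
   my $pending = 2;

   # a timer with a zero timeout - it is due immediately
   my $t; $t = AnyEvent->timer (after => 0, cb => sub {
      undef $t;                           # destroy the watcher after one invocation
      $done->broadcast unless --$pending;
   });

   # an I/O watcher on STDOUT, which is writable right away
   my $w; $w = AnyEvent->io (fh => \*STDOUT, poll => 'w', cb => sub {
      undef $w;
      $done->broadcast unless --$pending;
   });

   $done->wait;   # run the event loop until both watchers have fired once
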
=head3 Explanation of the columns

I<watcher> is the number of event watchers created/destroyed. Since
different event models feature vastly different performances, each event
loop was given a number of watchers so that overall runtime is acceptable and
similar between the tested event loops (and to keep them from crashing): Glib
signal the end of this phase.

I<destroy> is the time, in microseconds, that it takes to destroy a single
watcher.

=head3 Results

   name        watchers bytes create invoke destroy comment
   EV/EV         400000   244   0.56   0.46    0.31 EV native interface
   EV/Any        100000   244   2.50   0.46    0.29 EV + AnyEvent watchers
   CoroEV/Any    100000   244   2.49   0.44    0.29 coroutines + Coro::Signal
   Perl/Any      100000   513   4.92   0.87    1.12 pure perl implementation
   Event/Event    16000   516  31.88  31.30    0.85 Event native interface
   Event/Any      16000   590  35.75  31.42    1.08 Event + AnyEvent watchers
   Glib/Any       16000  1357  98.22  12.41   54.00 quadratic behaviour
   Tk/Any          2000  1860  26.97  67.98   14.00 SEGV with >> 2000 watchers
   POE/Event       2000  6644 108.64 736.02   14.73 via POE::Loop::Event
   POE/Select      2000  6343  94.13 809.12  565.96 via POE::Loop::Select

=head3 Discussion

The benchmark does I<not> measure scalability of the event loop very
well. For example, a select-based event loop (such as the pure perl one)
can never compete with an event loop that uses epoll when the number of
file descriptors grows high. In this benchmark, all events become ready at
the same time, so select/poll-based implementations get an unnatural speed
boost.

Also, note that the number of watchers usually has a nonlinear effect on
overall speed, that is, creating twice as many watchers doesn't take twice
the time - usually it takes longer. This puts event loops tested with a
higher number of watchers at a disadvantage.

To put the range of results into perspective, consider that on the
benchmark machine, handling an event takes roughly 1600 CPU cycles with
EV, 3100 CPU cycles with AnyEvent's pure perl loop and almost 3000000 CPU
cycles with POE.

C<EV> is the sole leader regarding speed and memory use: it is the fastest
and uses the least memory of the tested loops. Even when going through
AnyEvent, it uses far less memory than any other event loop and is still
faster than Event natively.
file descriptor is dup()ed for each watcher. This shows that the dup()
employed by some adaptors is not a big performance issue (it does incur a
hidden memory cost inside the kernel which is not reflected in the figures
above).

C<POE>, regardless of underlying event loop (whether using its pure perl
select-based backend or the Event module, the POE-EV backend couldn't
be tested because it wasn't working) shows abysmal performance and
memory usage with AnyEvent: Watchers use almost 30 times as much memory
as EV watchers, and 10 times as much memory as Event (the high memory
requirements are caused by requiring a session for each watcher). Watcher
invocation speed is almost 900 times slower than with AnyEvent's pure perl
implementation.

The design of the POE adaptor class in AnyEvent cannot really account
for the performance issues, though, as session creation overhead is
small compared to execution of the state machine, which is coded pretty
optimally within L<AnyEvent::Impl::POE> (and while everybody agrees that
using multiple sessions is not a good approach, especially regarding
memory usage, even the author of POE could not come up with a faster
design).

=head3 Summary

=over 4

=item * Using EV through AnyEvent is faster than any other event loop
(even when used without AnyEvent), but most event loops have acceptable
the actual event loop, only with extremely fast event loops such as EV
does AnyEvent add significant overhead.

=item * You should avoid POE like the plague if you want performance or
reasonable memory usage.

=back

=head2 BENCHMARKING THE LARGE SERVER CASE

This benchmark actually benchmarks the event loop itself. It works by
creating a number of "servers": each server consists of a socketpair, a
timeout watcher that gets reset on activity (but never fires), and an I/O
watcher waiting for input on one side of the socket. Each time the socket
watcher reads a byte it will write that byte to a random other "server".

The effect is that there will be a lot of I/O watchers, only part of which
are active at any one point (so there is a constant number of active
fds for each loop iteration, but which fds these are is random). The
timeout is reset each time something is read because that reflects how
most timeouts work (and puts extra pressure on the event loops).

In this benchmark, we use 10000 socketpairs (20000 sockets), of which 100
(1%) are active. This mirrors the activity of large servers with many
connections, most of which are idle at any one point in time.

Source code for this benchmark is found as F<eg/bench2> in the AnyEvent
distribution.

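One such "server" can be pictured roughly like this (a simplified sketch
rather than the actual F<eg/bench2> code; the forwarding step is only
hinted at via a hypothetical helper):

   use AnyEvent;
   use Socket;

   # a "server": a socketpair, a timeout watcher that is reset on activity
   # (but never fires) and an I/O watcher on the reading end
   socketpair my $r, my $w, AF_UNIX, SOCK_STREAM, PF_UNSPEC
      or die "socketpair: $!";

   my $timeout;
   my $arm_timeout = sub {
      # some point in the future; the exact value does not matter,
      # as the timeout is never allowed to fire
      $timeout = AnyEvent->timer (after => 300, cb => sub { });
   };
   $arm_timeout->();

   my $io = AnyEvent->io (fh => $r, poll => 'r', cb => sub {
      sysread $r, my $token, 1;
      $arm_timeout->();   # replace the old timeout with a new one
      # forward_to_random_server ($token);   # hypothetical helper: write the
      #                                      # byte to another random "server"
   });
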
=head3 Explanation of the columns

I<sockets> is the number of sockets, and twice the number of "servers" (as
each server has a read and write socket end).

I<create> is the time it takes to create a socketpair (which is
nontrivial) and two watchers: an I/O watcher and a timeout watcher.

I<request>, the most important value, is the time it takes to handle a
single "request", that is, reading the token from the pipe and forwarding
it to another server. This includes deleting the old timeout and creating
a new one that moves the timeout into the future.

=head3 Results

   name  sockets create  request
   EV      20000  69.01    11.16
   Perl    20000  73.32    35.87
   Event   20000 212.62   257.32
   Glib    20000 651.16  1896.30
   POE     20000 349.67 12317.24 uses POE::Loop::Event

=head3 Discussion

This benchmark I<does> measure scalability and overall performance of the
particular event loop.

EV is again fastest. Since it is using epoll on my system, the setup time
is relatively high, though.

Perl surprisingly comes second. It is much faster than the C-based event
loops Event and Glib.

Event suffers from high setup time as well (look at its code and you will
understand why). Callback invocation also has a high overhead compared to
the C<< $_->() for .. >>-style loop that the Perl event loop uses. Event
uses select or poll in basically all documented configurations.

Glib is hit hard by its quadratic behaviour w.r.t. many watchers. It
clearly fails to perform with many filehandles or in busy servers.

POE is still completely out of the picture, taking over 1000 times as long
as EV, and over 100 times as long as the Perl implementation, even though
it uses a C-based event loop in this case.

=head3 Summary

=over 4

=item * The pure perl implementation performs extremely well.

=item * Avoid Glib or POE in large projects where performance matters.

=back

=head2 BENCHMARKING SMALL SERVERS

While event loops should scale (and select-based ones do not...) even to
large servers, most programs we (or I :) actually write have only a few
I/O watchers.

In this benchmark, I use the same benchmark program as in the large server
case, but it uses only eight "servers", of which three are active at any
one time. This should reflect performance for a small server relatively
well.

The columns are identical to the previous table.

=head3 Results

   name  sockets create request
   EV         16  20.00    6.54
   Perl       16  25.75   12.62
   Event      16  81.27   35.86
   Glib       16  32.63   15.48
   POE        16 261.87  276.28 uses POE::Loop::Event

=head3 Discussion

The benchmark tries to test the performance of a typical small
server. While knowing how various event loops perform is interesting, keep
in mind that their overhead in this case is usually not as important, due
to the small absolute number of watchers (that is, you need efficiency and
speed most when you have lots of watchers, not when you only have a few of
them).

EV is again fastest.

Perl again comes second. It is noticeably faster than the C-based event
loops Event and Glib, although the difference is too small to really
matter.

POE also performs much better in this case, but it is still far behind the
others.

=head3 Summary

=over 4

=item * C-based event loops perform very well with a small number of
watchers, as the management overhead dominates.

=back


=head1 FORK
