
Comparing AnyEvent/lib/AnyEvent.pm (file contents):
Revision 1.86 by root, Fri Apr 25 14:01:48 2008 UTC vs.
Revision 1.100 by elmex, Sun Apr 27 19:15:43 2008 UTC

Of course, if you want lots of policy (this can arguably be somewhat
useful) and you want to force your users to use the one and only event
model, you should I<not> use this module.

#TODO#

Net::IRC3
AnyEvent::HTTPD
AnyEvent::DNS
IO::AnyEvent
Net::FPing
Net::XMPP2
Coro

AnyEvent::IRC
AnyEvent::HTTPD
AnyEvent::DNS
AnyEvent::Handle
AnyEvent::Socket
AnyEvent::FPing
AnyEvent::XMPP
AnyEvent::SNMP
Coro

=head1 DESCRIPTION

L<AnyEvent> provides an identical interface to multiple event loops. This
allows module authors to utilise an event loop without forcing module
users to use the same event loop (as only a single event loop can coexist
peacefully at any one time).

might choose the wrong one unless you load the correct one yourself.

You can choose to use a rather inefficient pure-perl implementation by
loading the C<AnyEvent::Impl::Perl> module, which gives you similar
behaviour everywhere, but letting AnyEvent choose is generally better.

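In code, forcing that pure-perl backend just means loading it before
AnyEvent settles on a model (a minimal sketch of the mechanism the
paragraph above describes):

   # force the (slower but consistent) pure-perl event loop by
   # loading its backend before AnyEvent autodetects a model
   use AnyEvent::Impl::Perl;
   use AnyEvent;

   # AnyEvent now uses AnyEvent::Impl::Perl regardless of which
   # other event modules are installed or loaded later
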
=head1 OTHER MODULES

L<AnyEvent> itself comes with useful utility modules:

To make it easier to do non-blocking I/O, the modules L<AnyEvent::Handle>
and L<AnyEvent::Socket> are provided. L<AnyEvent::Handle> provides read
and write buffers and manages watchers for reads and writes.
L<AnyEvent::Socket> provides means to do non-blocking connects.

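Together they allow a small non-blocking client along these lines (a
sketch only: the host is a placeholder, and it assumes the
C<tcp_connect>, C<push_write> and C<push_read> interfaces as they later
stabilised, since both modules were brand new at this revision):

   use AnyEvent;
   use AnyEvent::Socket;
   use AnyEvent::Handle;

   my $done = AnyEvent->condvar;

   # connect in the background, continue in the callback
   tcp_connect "www.example.com", "http", sub {
      my ($fh) = @_
         or die "connect failed: $!";

      my $handle; $handle = AnyEvent::Handle->new (
         fh       => $fh,
         on_error => sub { warn "error: $!\n"; $done->broadcast },
         on_eof   => sub { $done->broadcast },
      );

      # queue a request; the write buffer is flushed non-blockingly
      $handle->push_write ("GET / HTTP/1.0\015\012\015\012");

      # queue a read that fires once a full line has arrived
      $handle->push_read (line => sub {
         my ($handle, $line) = @_;
         warn "response status line: $line\n";
         $done->broadcast;
      });
   };

   $done->wait;
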
Aside from those, there are these modules that support AnyEvent (and use
it for non-blocking I/O):

=over 4

=item L<AnyEvent::FastPing>

=item L<Net::IRC3>

=item L<Net::XMPP2>

=back

=cut

package AnyEvent;

   });

   $quit->wait;

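For reference (the hunk above only preserves the tail of that example), a
complete condvar-based quit sequence of this era looks like the following
sketch; the timer and message are illustrative only, not the original
example's:

   use AnyEvent;

   my $quit = AnyEvent->condvar;

   my $w = AnyEvent->timer (after => 2, cb => sub {
      warn "timed out, quitting\n";
      $quit->broadcast;   # makes the $quit->wait below return
   });

   $quit->wait;   # run the event loop until ->broadcast is called
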
=head1 BENCHMARKS

To give you an idea of the performance and overheads that AnyEvent adds
over the event loops themselves, and to give you an impression of the
speed of various event loops, I prepared some benchmarks.

=head2 BENCHMARKING ANYEVENT OVERHEAD

Here is a benchmark of various supported event models used natively and
through AnyEvent. The benchmark creates a lot of timers (with a zero
timeout) and I/O watchers (watching STDOUT, a pty, to become writable,
which it is), lets them fire exactly once and destroys them again.

Source code for this benchmark is found as F<eg/bench> in the AnyEvent
distribution.

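The shape of that benchmark's inner loop is roughly the following sketch
(hypothetical and heavily simplified: the watcher count, the condvar
bookkeeping and the absence of any timing or memory measurement are all
mine, not F<eg/bench>'s):

   use AnyEvent;

   my $count     = 10000;            # watchers of each kind
   my $remaining = 2 * $count;
   my $done      = AnyEvent->condvar;

   my @watchers;
   for my $i (0 .. $count - 1) {
      # timer with a zero timeout: fires as soon as the loop runs
      $watchers[2 * $i] = AnyEvent->timer (after => 0, cb => sub {
         undef $watchers[2 * $i];    # fire exactly once, then destroy
         $done->broadcast unless --$remaining;
      });
      # I/O watcher on STDOUT, which is immediately writable
      $watchers[2 * $i + 1] = AnyEvent->io (
         fh => \*STDOUT, poll => 'w', cb => sub {
            undef $watchers[2 * $i + 1];
            $done->broadcast unless --$remaining;
         },
      );
   }

   $done->wait;   # run the loop until every watcher has fired once
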
=head3 Explanation of the columns

I<watcher> is the number of event watchers created/destroyed. Since
different event models feature vastly different performances, each event
loop was given a number of watchers so that overall runtime is acceptable
and similar between the tested event loops (and to keep them from
crashing): Glib

signal the end of this phase.

I<destroy> is the time, in microseconds, that it takes to destroy a single
watcher.

=head3 Results

   name       watchers bytes create invoke destroy comment
   EV/EV        400000   244   0.56   0.46    0.31 EV native interface
   EV/Any       100000   244   2.50   0.46    0.29 EV + AnyEvent watchers
   CoroEV/Any   100000   244   2.49   0.44    0.29 coroutines + Coro::Signal
   Perl/Any     100000   513   4.92   0.87    1.12 pure perl implementation
   Event/Event   16000   516  31.88  31.30    0.85 Event native interface
   Event/Any     16000   590  35.75  31.42    1.08 Event + AnyEvent watchers
   Glib/Any      16000  1357  98.22  12.41   54.00 quadratic behaviour
   Tk/Any         2000  1860  26.97  67.98   14.00 SEGV with >> 2000 watchers
   POE/Event      2000  6644 108.64 736.02   14.73 via POE::Loop::Event
   POE/Select     2000  6343  94.13 809.12  565.96 via POE::Loop::Select

=head3 Discussion

The benchmark does I<not> measure scalability of the event loop very
well. For example, a select-based event loop (such as the pure perl one)
can never compete with an event loop that uses epoll when the number of
file descriptors grows high. In this benchmark, all events become ready at
the same time, so select/poll-based implementations get an unnatural speed
boost.

Also, note that the number of watchers usually has a nonlinear effect on
overall speed, that is, creating twice as many watchers doesn't take twice
the time - usually it takes longer. This puts event loops tested with a
higher number of watchers at a disadvantage.

To put the range of results into perspective, consider that on the
benchmark machine, handling an event takes roughly 1600 CPU cycles with
EV, 3100 CPU cycles with AnyEvent's pure perl loop and almost 3000000 CPU
cycles with POE.

C<EV> is the sole leader regarding speed and memory use, which are both
maximal/minimal, respectively. Even when going through AnyEvent, it uses
far less memory than any other event loop and is still faster than Event
natively.

interpreter and the backend itself). Nevertheless this shows that it
adds very little overhead in itself. Like any select-based backend its
performance becomes really bad with lots of file descriptors (and few of
them active), of course, but this was not the subject of this benchmark.

The C<Event> module has a relatively high setup and callback invocation
cost, but overall comes in third place.

C<Glib>'s memory usage is quite a bit higher, but it features a
faster callback invocation and overall ends up in the same class as
C<Event>. However, Glib scales extremely badly: doubling the number of
watchers increases the processing time by more than a factor of four,
making it completely unusable when using larger numbers of watchers
(note that only a single file descriptor was used in the benchmark, so

The C<Tk> adaptor works relatively well. The fact that it crashes with
more than 2000 watchers is a big setback, however, as correctness takes
precedence over speed. Nevertheless, its performance is surprising, as the
file descriptor is dup()ed for each watcher. This shows that the dup()
employed by some adaptors is not a big performance issue (it does incur a
hidden memory cost inside the kernel which is not reflected in the figures
above).

C<POE>, regardless of underlying event loop (whether using its pure
perl select-based backend or the Event module; the POE-EV backend
couldn't be tested because it wasn't working), shows abysmal performance
and memory usage: watchers use almost 30 times as much memory as
EV watchers, and 10 times as much memory as Event (the high memory
requirements are caused by requiring a session for each watcher). Watcher
invocation speed is almost 900 times slower than with AnyEvent's pure perl
implementation. The design of the POE adaptor class in AnyEvent cannot
really account for this, as session creation overhead is small compared
to execution of the state machine, which is coded pretty optimally within
L<AnyEvent::Impl::POE>. POE simply seems to be abysmally slow.

=head3 Summary

=over 4

=item * Using EV through AnyEvent is faster than any other event loop
(even when used without AnyEvent), but most event loops have acceptable
performance with or without AnyEvent.

=item * The overhead AnyEvent adds is usually much smaller than the
overhead of the actual event loop; only with extremely fast event loops
such as EV does AnyEvent add significant overhead.

=item * You should avoid POE like the plague if you want performance or
reasonable memory usage.

=back

=head2 BENCHMARKING THE LARGE SERVER CASE

This benchmark actually benchmarks the event loop itself. It works by
creating a number of "servers": each server consists of a socketpair, a
timeout watcher that gets reset on activity (but never fires), and an I/O
watcher waiting for input on one side of the socket. Each time the socket
watcher reads a byte it will write that byte to a random other "server".

The effect is that there will be a lot of I/O watchers, only part of which
are active at any one point (so there is a constant number of active
fds for each loop iteration, but which fds these are is random). The
timeout is reset each time something is read because that reflects how
most timeouts work (and puts extra pressure on the event loops).

In this benchmark, we use 10000 socketpairs (20000 sockets), of which 100
(1%) are active. This mirrors the activity of large servers with many
connections, most of which are idle at any one point in time.

Source code for this benchmark is found as F<eg/bench2> in the AnyEvent
distribution.

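One "server" in that scheme might be set up as in the following sketch (a
hypothetical, heavily simplified reading of the description above: the
timeout value, the kick-off writes and the absence of timing code are
mine, not F<eg/bench2>'s):

   use AnyEvent;
   use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

   my @servers;

   sub make_server {
      socketpair my $fh, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC
         or die "socketpair: $!";

      my $server = { fh => $fh, peer => $peer };

      # timeout watcher: reset on every read, never actually fires
      my $reset_timeout = sub {
         $server->{timeout} = AnyEvent->timer (after => 60, cb => sub { });
      };
      $reset_timeout->();

      # I/O watcher: read a byte, forward it to a random other server
      $server->{io} = AnyEvent->io (fh => $fh, poll => 'r', cb => sub {
         sysread $fh, my $buf, 1
            or return;
         $reset_timeout->();   # activity resets the timeout
         my $target = $servers[rand @servers];
         syswrite $target->{peer}, $buf;
      });

      push @servers, $server;
   }

   make_server () for 1 .. 10000;

   # make 100 (1%) of the servers active, then run the loop forever
   syswrite $servers[$_ * 100]{peer}, "x" for 0 .. 99;
   AnyEvent->condvar->wait;
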
=head3 Explanation of the columns

I<sockets> is the number of sockets, and twice the number of "servers" (as
each server has a read and write socket end).

I<create> is the time it takes to create a socketpair (which is
nontrivial) and two watchers: an I/O watcher and a timeout watcher.

I<request>, the most important value, is the time it takes to handle a
single "request", that is, reading the token from the pipe and forwarding
it to another server. This includes deleting the old timeout and creating
a new one that moves the timeout into the future.

=head3 Results

   name  sockets create  request
   EV      20000  69.01    11.16
   Perl    20000  73.32    35.87
   Event   20000 212.62   257.32
   Glib    20000 651.16  1896.30
   POE     20000 349.67 12317.24 uses POE::Loop::Event

=head3 Discussion

This benchmark I<does> measure scalability and overall performance of the
particular event loop.

EV is again fastest. Since it is using epoll on my system, the setup time
is relatively high, though.

Perl surprisingly comes second. It is much faster than the C-based event
loops Event and Glib.

Event suffers from high setup time as well (look at its code and you will
understand why). Callback invocation also has a high overhead compared to
the C<< $_->() for .. >>-style loop that the Perl event loop uses. Event
uses select or poll in basically all documented configurations.

Glib is hit hard by its quadratic behaviour w.r.t. many watchers. It
clearly fails to perform with many filehandles or in busy servers.

POE is still completely out of the picture, taking over 1000 times as long
as EV, and over 100 times as long as the Perl implementation, even though
it uses a C-based event loop in this case.

=head3 Summary

=over 4

=item * The pure perl implementation performs extremely well, considering
that it uses select.

=item * Avoid Glib or POE in large projects where performance matters.

=back

=head2 BENCHMARKING SMALL SERVERS

While event loops should scale (and select-based ones do not...) even to
large servers, most programs we (or I :) actually write have only a few
I/O watchers.

In this benchmark, I use the same benchmark program as in the large server
case, but it uses only eight "servers", of which three are active at any
one time. This should reflect performance for a small server relatively
well.

The columns are identical to the previous table.

=head3 Results

   name  sockets create request
   EV         16  20.00    6.54
   Perl       16  25.75   12.62
   Event      16  81.27   35.86
   Glib       16  32.63   15.48
   POE        16 261.87  276.28 uses POE::Loop::Event

=head3 Discussion

The benchmark tries to test the performance of a typical small
server. While knowing how various event loops perform is interesting, keep
in mind that their overhead in this case is usually not as important, due
to the small absolute number of watchers (that is, you need efficiency and
speed most when you have lots of watchers, not when you only have a few of
them).

EV is again fastest.

The C-based event loops Event and Glib come in second this time, as the
overhead of running an iteration is much smaller in C than in Perl (little
code to execute in the inner loop, and perl's function calling overhead is
high, and updating all the data structures is costly).

The pure perl event loop is much slower, but still competitive.

POE also performs much better in this case, but it is still far behind the
others.

=head3 Summary

=over 4

=item * C-based event loops perform very well with a small number of
watchers, as the management overhead dominates.

=back

=head1 FORK

Most event libraries are not fork-safe. The ones that are usually are
