[ViewVC] Diff of: cvs/AnyEvent/lib/AnyEvent/Intro.pod

Comparing AnyEvent/lib/AnyEvent/Intro.pod (file contents):
Revision 1.10 by root, Mon Jun 2 06:04:08 2008 UTC vs.
Revision 1.27 by root, Fri Jun 18 14:28:50 2010 UTC

…		…
		1	=head1 NAME
		2
		3	AnyEvent::Intro - an introductory tutorial to AnyEvent
		4
1	=head1 Introduction to AnyEvent	5	=head1 Introduction to AnyEvent
2		6
3	This is a tutorial that will introduce you to the features of AnyEvent.	7	This is a tutorial that will introduce you to the features of AnyEvent.
4		8
5	The first part introduces the core AnyEvent module (after swamping you a	9	The first part introduces the core AnyEvent module (after swamping you a
6	bit in evangelism), which might already provide all you ever need.	10	bit in evangelism), which might already provide all you ever need: If you
		11	are only interested in AnyEvent's event handling capabilities, read no
		12	further.
7		13
8	The second part focuses on network programming using sockets, for which	14	The second part focuses on network programming using sockets, for which
9	AnyEvent offers a lot of support you can use.	15	AnyEvent offers a lot of support you can use, and a lot of workarounds
		16	around portability quirks.
10		17
11		18
12	=head1 What is AnyEvent?	19	=head1 What is AnyEvent?
13		20
14	If you don't care for the whys and want to see code, skip this section!	21	If you don't care for the whys and want to see code, skip this section!
16	AnyEvent is first of all just a framework to do event-based	23	AnyEvent is first of all just a framework to do event-based
17	programming. Typically such frameworks are an all-or-nothing thing: If you	24	programming. Typically such frameworks are an all-or-nothing thing: If you
18	use one such framework, you can't (easily, or even at all) use another in	25	use one such framework, you can't (easily, or even at all) use another in
19	the same program.	26	the same program.
20		27
21	AnyEvent is different - it is a thin abstraction layer above all kinds	28	AnyEvent is different - it is a thin abstraction layer on top of other
		29	event loops, just like DBI is an abstraction of many different database
22	of event loops. Its main purpose is to move the choice of the underlying	30	APIs. Its main purpose is to move the choice of the underlying framework
23	framework (the event loop) from the module author to the program author	31	(the event loop) from the module author to the program author using the
24	using the module.	32	module.
25		33
26	That means you can write code that uses events to control what it	34	That means you can write code that uses events to control what it
27	does, without forcing other code in the same program to use the same	35	does, without forcing other code in the same program to use the same
28	underlying framework as you do - i.e. you can create a Perl module	36	underlying framework as you do - i.e. you can create a Perl module
29	that is event-based using AnyEvent, and users of that module can still	37	that is event-based using AnyEvent, and users of that module can still
30	choose between using L<Gtk2>, L<Tk>, L<Event> or no event loop at	38	choose between using L<Gtk2>, L<Tk>, L<Event> (or run inside Irssi or
31	all: AnyEvent comes with its own event loop implementation, so your	39	rxvt-unicode) or any other supported event loop. AnyEvent even comes with
32	code works regardless of other modules that might or might not be	40	its own pure-perl event loop implementation, so your code works regardless
33	installed. The latter is important, as AnyEvent does not have any	41	of other modules that might or might not be installed. The latter is
34	dependencies to other modules, which makes it easy to install, for	42	important, as AnyEvent does not have any hard dependencies to other
35	example, when you lack a C compiler.	43	modules, which makes it easy to install, for example, when you lack a C
		44	compiler. No matter what environment, AnyEvent will just cope with it.
36		45
37	A typical problem with Perl modules such as L<Net::IRC> is that they	46	A typical limitation of existing Perl modules such as L<Net::IRC> is that
38	come with their own event loop: In L<Net::IRC>, the program that uses it	47	they come with their own event loop: In L<Net::IRC>, the program that uses
39	needs to start the event loop of L<Net::IRC>. That means that one cannot	48	it needs to start the event loop of L<Net::IRC>. That means that one
40	integrate this module into a L<Gtk2> GUI for instance, as that module,	49	cannot integrate this module into a L<Gtk2> GUI for instance, as that
41	too, enforces the use of its own event loop (namely L<Glib>).	50	module, too, enforces the use of its own event loop (namely L<Glib>).
42		51
43	Another example is L<LWP>: it provides no event interface at all. It's a	52	Another example is L<LWP>: it provides no event interface at all. It's
44	pure blocking HTTP (and FTP etc.) client library, which usually means that	53	a pure blocking HTTP (and FTP etc.) client library, which usually means
45	you either have to start a thread or have to fork for a HTTP request, or	54	that you either have to start another process or have to fork for a HTTP
46	use L<Coro::LWP>, if you want to do something else while waiting for the	55	request, or use threads (e.g. L<Coro::LWP>), if you want to do something
47	request to finish.	56	else while waiting for the request to finish.
48		57
49	The motivation behind these designs is often that a module doesn't want to	58	The motivation behind these designs is often that a module doesn't want
50	depend on some complicated XS-module (Net::IRC), or that it doesn't want	59	to depend on some complicated XS-module (Net::IRC), or that it doesn't
51	to force the user to use some specific event loop at all (LWP).	60	want to force the user to use some specific event loop at all (LWP), out
		61	of fear of severely limiting the usefulness of the module: If your module
		62	requires Glib, it will not run in a Tk program.
52		63
53	L<AnyEvent> solves this dilemma, by B<not> forcing module authors to either	64	L<AnyEvent> solves this dilemma, by B<not> forcing module authors to
		65	either:
54		66
55	=over 4	67	=over 4
56		68
57	=item write their own event loop (because guarantees to offer one	69	=item - write their own event loop (because it guarantees the availability
58	everywhere - even on windows).	70	of an event loop everywhere - even on windows with no extra modules
		71	installed).
59		72
60	=item choose one fixed event loop (because AnyEvent works with all	73	=item - choose one specific event loop (because AnyEvent works with most
61	important event loops available for Perl, and adding others is trivial).	74	event loops available for Perl).
62		75
63	=back	76	=back
64		77
65	If the module author uses L<AnyEvent> for all his event needs (IO events,	78	If the module author uses L<AnyEvent> for all his (or her) event needs
66	timers, signals, ...) then all other modules can just use his module and	79	(IO events, timers, signals, ...) then all other modules can just use
67	don't have to choose an event loop or adapt to his event loop. The choice	80	his module and don't have to choose an event loop or adapt to his event
68	of the event loop is ultimately made by the program author who uses all	81	loop. The choice of the event loop is ultimately made by the program
69	the modules and writes the main program. And even there he doesn't have to	82	author who uses all the modules and writes the main program. And even
70	choose, he can just let L<AnyEvent> choose the best available event loop	83	there he doesn't have to choose, he can just let L<AnyEvent> choose the
71	for him.	84	most efficient event loop available on the system.
72		85
73	Read more about this in the main documentation of the L<AnyEvent> module.	86	Read more about this in the main documentation of the L<AnyEvent> module.
74		87
75		88
76	=head1 Introduction to Event-Based Programming	89	=head1 Introduction to Event-Based Programming
…		…
101	);	114	);
102		115
103	# do something else here	116	# do something else here
104		117
105	Looks more complicated, and surely is, but the advantage of using events	118	Looks more complicated, and surely is, but the advantage of using events
106	is that your program can do something else instead of waiting for	119	is that your program can do something else instead of waiting for input
		120	(side note: combining AnyEvent with a thread package such as Coro can
		121	recoup much of the simplicity, effectively getting the best of two
		122	worlds).
		123
107	input. Waiting as in the first example is also called "blocking" because	124	Waiting as done in the first example is also called "blocking" the process
108	you "block" your process from executing anything else while you do so.	125	because you "block"/keep your process from executing anything else while
		126	you do so.
109		127
110	The second example avoids blocking, by only registering interest in a read	128	The second example avoids blocking by only registering interest in a read
111	event, which is fast and doesn't block your process. Only when read data	129	event, which is fast and doesn't block your process. Only when data is
112	is available will the callback be called, which can then proceed to read	130	available for reading will the callback be called, which can then proceed
113	the data.	131	to read the data.
114		132
115	The "interest" is represented by an object returned by C<< AnyEvent->io	133	The "interest" is represented by an object returned by C<< AnyEvent->io
116	>> called a "watcher" object - called like that because it "watches" your	134	>> called a "watcher" object - called this because it "watches" your
117	file handle (or other event sources) for the event you are interested in.	135	file handle (or other event sources) for the event you are interested in.
118		136
119	In the example above, we create an I/O watcher by calling the C<<	137	In the example above, we create an I/O watcher by calling the C<<
120	AnyEvent->io >> method. Disinterest in some event is simply expressed by	138	AnyEvent->io >> method. Disinterest in some event is simply expressed
121	forgetting about the watcher, for example, by C<undef>'ing the variable it	139	by forgetting about the watcher, for example, by C<undef>'ing the only
122	is stored in. AnyEvent will automatically clean up the watcher if it is no	140	variable it is stored in. AnyEvent will automatically clean up the watcher
123	longer used, much like Perl closes your file handles if you no longer use	141	if it is no longer used, much like Perl closes your file handles if you no
124	them anywhere.	142	longer use them anywhere.
		143
		144	=head3 A short note on callbacks
		145
		146	A common issue that hits people is the problem of passing parameters
		147	to callbacks. Programmers used to languages such as C or C++ are often
		148	used to a style where one passes the address of a function (a function
		149	reference) and some data value, e.g.:
		150
		151	sub callback {
		152	my ($arg) = @_;
		153
		154	$arg->method;
		155	}
		156
		157	my $arg = ...;
		158
		159	call_me_back_later \&callback, $arg;
		160
		161	This is clumsy, as the place where behaviour is specified (when the
		162	callback is registered) is often far away from the place where behaviour
		163	is implemented. It also doesn't use Perl syntax to invoke the code. There
		164	is also an abstraction penalty to pay as one has to I<name> the callback,
		165	which often is unnecessary and leads to nonsensical or duplicated names.
		166
		167	In Perl, one can specify behaviour much more directly by using
		168	I<closures>. Closures are code blocks that take a reference to the
		169	enclosing scope(s) when they are created. This means lexical variables in
		170	scope at the time of creating the closure can simply be used inside the
		171	closure:
		172
		173	my $arg = ...;
		174
		175	call_me_back_later sub { $arg->method };
		176
		177	Under most circumstances, closures are faster, use fewer resources and
		178	result in much clearer code than the traditional approach. Faster,
		179	because parameter passing and storing them in local variables in Perl
		180	is relatively slow. Fewer resources, because closures take references
		181	to existing variables without having to create new ones, and clearer
		182	code because it is immediately obvious that the second example calls the
		183	C<method> method when the callback is invoked.
		184
		185	Apart from these, the strongest argument for using closures with AnyEvent
		186	is that AnyEvent does not allow passing parameters to the callback, so
		187	closures are the only way to achieve that in most cases :->
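
The closure style described above can be tried without any event loop at all. The following is a minimal sketch: C<call_me_back_later> and C<run_queued_callbacks> are hypothetical stand-ins invented for this illustration (they are not AnyEvent functions) that merely queue callbacks and invoke them later.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an event library: it only queues callbacks
# and runs them later. Neither function is part of AnyEvent.
my @queue;

sub call_me_back_later {
   push @queue, shift;
}

sub run_queued_callbacks {
   my @results = map { $_->() } @queue;
   @queue = ();
   @results
}

# Register one callback per name. Each closure captures its own $name,
# so no extra argument has to be passed along with the callback.
for my $name (qw(alice bob)) {
   call_me_back_later (sub { "hello, $name" });
}

print "$_\n" for run_queued_callbacks ();
```

Each callback remembers the C<$name> that was in scope when it was created, which is exactly what the function-plus-argument style has to emulate by hand.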
		188
		189
		190	=head3 A hint on debugging
		191
		192	AnyEvent does, by default, not do any argument checking. This can lead to
		193	strange and unexpected results especially if you are trying to learn your
		194	ways with AnyEvent.
		195
		196	AnyEvent supports a special "strict" mode - off by default - which does very
		197	strict argument checking, at the expense of being somewhat slower. During
		198	development, however, this mode is very useful.
		199
		200	You can enable this strict mode either by having an environment variable
		201	C<PERL_ANYEVENT_STRICT> with a true value in your environment:
		202
		203	PERL_ANYEVENT_STRICT=1 perl test.pl
		204
		205	Or you can write C<use AnyEvent::Strict> in your program, which has the
		206	same effect (do not do this in production, however).
		207
125		208
126	=head2 Condition Variables	209	=head2 Condition Variables
127		210
128	However, the above is not a fully working program, and will not work	211	Back to the I/O watcher example: The code is not yet a fully working
129	as-is. The reason is that your callback will not be invoked out of the	212	program, and will not work as-is. The reason is that your callback will
130	blue, you have to run the event loop. Also, event-based programs sometimes	213	not be invoked out of the blue, you have to run the event loop. Also,
131	have to block, too, as when there simply is nothing else to do and	214	event-based programs sometimes have to block, too, as when there simply is
132	everything waits for some events, it needs to block the process as well.	215	nothing else to do and everything waits for some events, it needs to block
		216	the process as well until new events arrive.
133		217
134	In AnyEvent, this is done using condition variables. Condition variables	218	In AnyEvent, this is done using condition variables. Condition variables
135	are named "condition variables" because they represent a condition that is	219	are named "condition variables" because they represent a condition that is
136	initially false and needs to be fulfilled.	220	initially false and needs to be fulfilled.
137		221
138	You can also call them "merge points", "sync points", "rendezvous ports"	222	You can also call them "merge points", "sync points", "rendezvous ports"
139	or even callbacks and many other things (and they are often called like	223	or even callbacks and many other things (and they are often called these
140	this in other frameworks). The important point is that you can create them	224	names in other frameworks). The important point is that you can create them
141	freely and later wait for them to become true.	225	freely and later wait for them to become true.
142		226
143	Condition variables have two sides - one side is the "producer" of the	227	Condition variables have two sides - one side is the "producer" of the
144	condition (whatever code detects the condition), the other side is the	228	condition (whatever code detects and flags the condition), the other side
145	"consumer" (the code that waits for that condition).	229	is the "consumer" (the code that waits for that condition).
146		230
147	In our example in the previous section, the producer is the event callback	231	In our example in the previous section, the producer is the event callback
148	and there is no consumer yet - let's change that now:	232	and there is no consumer yet - let's change that right now:
149		233
150	use AnyEvent;	234	use AnyEvent;
151		235
152	$\| = 1; print "enter your name> ";	236	$\| = 1; print "enter your name> ";
153		237
…		…
174	print "your name is $name\n";	258	print "your name is $name\n";
175		259
176	This program creates an AnyEvent condvar by calling the C<<	260	This program creates an AnyEvent condvar by calling the C<<
177	AnyEvent->condvar >> method. It then creates a watcher as usual, but	261	AnyEvent->condvar >> method. It then creates a watcher as usual, but
178	inside the callback it C<send>'s the C<$name_ready> condition variable,	262	inside the callback it C<send>'s the C<$name_ready> condition variable,
179	which causes anybody waiting on it to continue.	263	which causes whoever is waiting on it to continue.
180		264
181	The "anybody" in this case is the code that follows, which calls C<<	265	The "whoever" in this case is the code that follows, which calls C<<
182	$name_ready->recv >>: The producer calls C<send>, the consumer calls	266	$name_ready->recv >>: The producer calls C<send>, the consumer calls
183	C<recv>.	267	C<recv>.
184		268
185	If there is no C<$name> available yet, then the call to C<<	269	If there is no C<$name> available yet, then the call to C<<
186	$name_ready->recv >> will halt your program until the condition becomes	270	$name_ready->recv >> will halt your program until the condition becomes
…		…
196		280
197	my $name_ready = AnyEvent->condvar;	281	my $name_ready = AnyEvent->condvar;
198		282
199	my $wait_for_input = AnyEvent->io (	283	my $wait_for_input = AnyEvent->io (
200	fh => \*STDIN, poll => "r",	284	fh => \*STDIN, poll => "r",
201	cb => sub { $name_ready->send (scalar = <STDIN>) }	285	cb => sub { $name_ready->send (scalar <STDIN>) }
202	);	286	);
203		287
204	# do something else here	288	# do something else here
205		289
206	# now wait and fetch the name	290	# now wait and fetch the name
…		…
263		347
264	Instead of waiting for a condition variable, the program enters the Gtk2	348	Instead of waiting for a condition variable, the program enters the Gtk2
265	main loop by calling C<< Gtk2->main >>, which will block the program and	349	main loop by calling C<< Gtk2->main >>, which will block the program and
266	wait for events to arrive.	350	wait for events to arrive.
267		351
268	This also shows that AnyEvent is quite flexible - you didn't have anything	352	This also shows that AnyEvent is quite flexible - you didn't have to do
269	to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just	353	anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it just
270	worked.	354	worked.
271		355
272	Admittedly, the example is a bit silly - who would want to read names	356	Admittedly, the example is a bit silly - who would want to read names
273	form standard input in a Gtk+ application. But imagine that instead of	357	from standard input in a Gtk+ application. But imagine that instead of
274	doing that, you would make a HTTP request in the background and display	358	doing that, you would make a HTTP request in the background and display
275	its results. In fact, with event-based programming you can make many	359	its results. In fact, with event-based programming you can make many
276	http-requests in parallel in your program and still provide feedback to	360	HTTP requests in parallel in your program and still provide feedback to
277	the user and stay interactive.	361	the user and stay interactive.
278		362
279	In the next part you will see how to do just that - by implementing an	363	And in the next part you will see how to do just that - by implementing an
280	HTTP request, on our own, with the utility modules AnyEvent comes with.	364	HTTP request, on our own, with the utility modules AnyEvent comes with.
281		365
282	Before that, however, let's briefly look at how you would write your	366	Before that, however, let's briefly look at how you would write your
283	program with using only AnyEvent, without ever calling some other event	367	program using only AnyEvent, without ever calling some other event
284	loop's run function.	368	loop's run function.
285		369
286	In the example using condition variables, we used that, and in fact, this	370	In the example using condition variables, we used those to start waiting
287	is the solution:	371	for events, and in fact, condition variables are the solution:
288		372
289	my $quit_program = AnyEvent->condvar;	373	my $quit_program = AnyEvent->condvar;
290		374
291	# create AnyEvent watchers (or not) here	375	# create AnyEvent watchers (or not) here
292		376
293	$quit_program->recv;	377	$quit_program->recv;
294		378
295	If any of your watcher callbacks decide to quit, they can simply call	379	If any of your watcher callbacks decide to quit (this is often
		380	called an "unloop" in other frameworks), they can simply call C<<
296	C<< $quit_program->send >>. Of course, they could also decide not to and	381	$quit_program->send >>. Of course, they could also decide not to and
297	simply call C<exit> instead, or they could decide not to quit, ever (e.g.	382	simply call C<exit> instead, or they could decide not to quit, ever (e.g.
298	in a long-running daemon program).	383	in a long-running daemon program).
299		384
300	In that case, you can simply use:	385	If you don't need some clean quit functionality and just want to run the
		386	event loop, you can simply do this:
301		387
302	AnyEvent->condvar->recv;	388	AnyEvent->condvar->recv;
303		389
304	And this is, in fact, closest to the idea of a main loop run function that	390	And this is, in fact, closest to the idea of a main loop run function that
305	AnyEvent offers.	391	AnyEvent offers.
306		392
307	=head2 Timers and other event sources	393	=head2 Timers and other event sources
308		394
309	So far, we have only used I/O watchers. These are useful mainly to find	395	So far, we have only used I/O watchers. These are useful mainly to find
310	out whether a Socket has data to read, or space to write more data. On sane	396	out whether a socket has data to read, or space to write more data. On sane
311	operating systems this also works for console windows/terminals (typically	397	operating systems this also works for console windows/terminals (typically
312	on standard input), serial lines, all sorts of other devices, basically	398	on standard input), serial lines, all sorts of other devices, basically
313	almost everything that has a file descriptor but isn't a file itself. (As	399	almost everything that has a file descriptor but isn't a file itself. (As
314	usual, "sane" excludes windows - on that platform you would need different	400	usual, "sane" excludes windows - on that platform you would need different
315	functions for all of these, complicating code immensely - think "socket	401	functions for all of these, complicating code immensely - think "socket
…		…
337		423
338	# now wait till our time has come	424	# now wait till our time has come
339	$cv->recv;	425	$cv->recv;
340		426
341	Unlike I/O watchers, timers are only interested in the amount of seconds	427	Unlike I/O watchers, timers are only interested in the amount of seconds
342	they have to wait. When that amount of time has passed, AnyEvent will	428	they have to wait. When (at least) that amount of time has passed,
343	invoke your callback.	429	AnyEvent will invoke your callback.
344		430
345	Unlike I/O watchers, which will call your callback as many times as there	431	Unlike I/O watchers, which will call your callback as many times as there
346	is data available, timers are one-shot: after they have "fired" once and	432	is data available, timers are normally one-shot: after they have "fired"
347	invoked your callback, they are dead and no longer do anything.	433	once and invoked your callback, they are dead and no longer do anything.
348		434
349	To get a repeating timer, such as a timer firing roughly once per second,	435	To get a repeating timer, such as a timer firing roughly once per second,
350	you have to recreate it:	436	you can specify an C<interval> parameter:
351		437
352	use AnyEvent;	438	my $once_per_second = AnyEvent->timer (
353		439	after => 0, # first invoke ASAP
354	my $time_watcher;	440	interval => 1, # then invoke every second
355		441	cb => sub { # the callback to invoke
356	sub once_per_second {	442	$cv->send;
357	print "tick\n";
358		443	},
359	# (re-)create the watcher
360	$time_watcher = AnyEvent->timer (
361	after => 1,
362	cb => \&once_per_second,
363	);	444	);
364	}
365
366	# now start the timer
367	once_per_second;
368
369	Having to recreate your timer is a restriction put on AnyEvent that is
370	present in most event libraries it uses. It is so annoying that some
371	future version might work around this limitation, but right now, it's the
372	only way to do repeating timers.
373
374	Fortunately most timers aren't really repeating but specify timeouts of
375	some sort.
376		445
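
As a rough sketch of how an C<interval> timer might be used in a complete program, the following counts three ticks and then stops. The 0.1 second interval, the tick limit and the variable names are arbitrary choices made for this illustration.

```perl
use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent->condvar;
my $ticks = 0;

# Declare the variable first so the callback can refer to the watcher.
my $ticker; $ticker = AnyEvent->timer (
   after    => 0,      # first invoke ASAP
   interval => 0.1,    # then invoke every 0.1 seconds
   cb       => sub {
      print "tick ", ++$ticks, "\n";

      if ($ticks >= 3) {
         undef $ticker;          # forgetting the watcher stops the timer
         $done->send ($ticks);   # wake up the consumer below
      }
   },
);

my $total = $done->recv;
print "stopped after $total ticks\n";
```

Stopping the timer works the same way as for any other watcher: the callback simply forgets the watcher object.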
377	=head3 More esoteric sources	446	=head3 More esoteric sources
378		447
379	AnyEvent also has some other, more esoteric event sources you can tap	448	AnyEvent also has some other, more esoteric event sources you can tap
380	into: signal and child watchers.	449	into: signal, child and idle watchers.
381		450
382	Signal watchers can be used to wait for "signal events", which simply	451	Signal watchers can be used to wait for "signal events", which simply
383	means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>).	452	means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>).
384		453
385	Process watchers wait for a child process to exit. They are useful when	454	Child-process watchers wait for a child process to exit. They are useful
386	you fork a separate process and need to know when it exits, but you do not	455	when you fork a separate process and need to know when it exits, but you
387	wait for that by blocking.	456	do not wait for that by blocking.
388		457
		458	Idle watchers invoke their callback when the event loop has handled all
		459	outstanding events, polled for new events and didn't find any, i.e., when
		460	your process is otherwise idle. They are useful if you want to do some
		461	non-trivial data processing that can be done when your program doesn't
		462	have anything better to do.
		463
389	Both watcher types are described in detail in the main L<AnyEvent> manual	464	All these watcher types are described in detail in the main L<AnyEvent>
390	page.	465	manual page.
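
To make the child watcher a bit more concrete, here is a small sketch that forks a child and waits for it without blocking in C<waitpid>. The exit code 3 is arbitrary, and calling C<AnyEvent::detect> before forking follows a suggestion in the main AnyEvent documentation. Treat the details as assumptions to check against that manual page.

```perl
use strict;
use warnings;
use AnyEvent;

# Initialise the event loop before forking.
AnyEvent::detect ();

my $child_done = AnyEvent->condvar;

my $pid = fork;
die "fork failed: $!" unless defined $pid;

exit 3 unless $pid;   # the child exits immediately with code 3

# Only the parent gets here. The callback receives the pid and the raw
# wait status; the exit code is in the upper byte.
my $child_watcher = AnyEvent->child (
   pid => $pid,
   cb  => sub {
      my ($pid, $status) = @_;
      $child_done->send ($status >> 8);
   },
);

my $exit_code = $child_done->recv;
print "child exited with code $exit_code\n";
```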
391		466
		467	Sometimes you also need to know what the current time is: C<<
		468	AnyEvent->now >> returns the time the event toolkit uses to schedule
		469	relative timers, and is usually what you want. It is often cached (which
		470	means it can be a bit outdated). In that case, you can use the more costly
		471	C<< AnyEvent->time >> method which will ask your operating system for the
		472	current time, which is slower, but also more up to date.
392		473
393	=head1 Network programming and AnyEvent	474	=head1 Network programming and AnyEvent
394		475
395	So far you have seen how to register event watchers and handle events.	476	So far you have seen how to register event watchers and handle events.
396		477
397	This is a great foundation to write network clients and servers, and might be	478	This is a great foundation to write network clients and servers, and might
398	all that your module (or program) ever requires, but writing your own I/O	479	be all that your module (or program) ever requires, but writing your own
399	buffering again and again becomes tedious, not to mention that it attracts	480	I/O buffering again and again becomes tedious, not to mention that it
400	errors.	481	attracts errors.
401		482
402	While the core L<AnyEvent> module is still small and self-contained,	483	While the core L<AnyEvent> module is still small and self-contained,
403	the distribution comes with some very useful utility modules such as	484	the distribution comes with some very useful utility modules such as
404	L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can	485	L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can
405	make your life as a non-blocking network programmer a lot easier.	486	make your life as a non-blocking network programmer a lot easier.
…		…
413	a great way to do other DNS resolution tasks, such as reverse lookups of	494	a great way to do other DNS resolution tasks, such as reverse lookups of
414	IP addresses for log files.	495	IP addresses for log files.
415		496
416	=head2 L<AnyEvent::Handle>	497	=head2 L<AnyEvent::Handle>
417		498
418	This module handles non-blocking IO on file handles in an event based	499	This module handles non-blocking IO on (socket-, pipe- etc.) file handles
419	manner. It provides a wrapper object around your file handle that provides	500	in an event based manner. It provides a wrapper object around your file
420	queueing and buffering of incoming and outgoing data for you.	501	handle that provides queueing and buffering of incoming and outgoing data
		502	for you.
421		503
422	It also implements the most common data formats, such as text lines, or	504	It also implements the most common data formats, such as text lines, or
423	fixed and variable-width data blocks.	505	fixed and variable-width data blocks.
424		506
425	=head2 L<AnyEvent::Socket>	507	=head2 L<AnyEvent::Socket>
…		…
442	successful? That unsuccessful TCP connects might never be reported back	524	successful? That unsuccessful TCP connects might never be reported back
443	to your program? That C<WSAEINPROGRESS> means your C<connect> call was	525	to your program? That C<WSAEINPROGRESS> means your C<connect> call was
444	ignored instead of being in progress? AnyEvent::Socket works around all of	526	ignored instead of being in progress? AnyEvent::Socket works around all of
445	these Windows/Perl bugs for you).	527	these Windows/Perl bugs for you).
446		528
447	=head2 First experiments with non-blocking connects: a parallel finger	529	=head2 Implementing a parallel finger client with non-blocking connects
448	client.	530	and AnyEvent::Socket
449		531
450	The finger protocol is one of the simplest protocols in use on the	532	The finger protocol is one of the simplest protocols in use on the
451	internet. Or in use in the past, as almost nobody uses it anymore.	533	internet. Or in use in the past, as almost nobody uses it anymore.
452		534
453	It works by connecting to the finger port on another host, writing a	535	It works by connecting to the finger port on another host, writing a
454	single line with a user name and then reading the finger response, as	536	single line with a user name and then reading the finger response, as
455	specified by that user. OK, RFC 1288 specifies a vastly more complex	537	specified by that user. OK, RFC 1288 specifies a vastly more complex
456	protocol, but it basically boils down to this:	538	protocol, but it basically boils down to this:
457		539
458	# telnet idsoftware.com finger	540	# telnet kernel.org finger
459	Trying 192.246.40.37...	541	Trying 204.152.191.37...
460	Connected to idsoftware.com (192.246.40.37).	542	Connected to kernel.org (204.152.191.37).
461	Escape character is '^]'.	543	Escape character is '^]'.
462	johnc	544
463	Welcome to id Software's Finger Service V1.5!	545	The latest stable version of the Linux kernel is: [...]
464
465	[...]
466	Now on the web:
467	[...]
468
469	Connection closed by foreign host.	546	Connection closed by foreign host.
470		547
471	Yeah, I<was> used indeed, but at least the finger daemon still works, so
472	let's write a little AnyEvent function that makes a finger request:	548	So let's write a little AnyEvent function that makes a finger request:
473		549
474	use AnyEvent;	550	use AnyEvent;
475	use AnyEvent::Socket;	551	use AnyEvent::Socket;
476		552
477	sub finger($$) {	553	sub finger($$) {
…		…
509		585
510	# pass $cv to the caller	586	# pass $cv to the caller
511	$cv	587	$cv
512	}	588	}
513		589
514	That's a mouthful! Let's dissect this function a bit, first the overall function:	590	That's a mouthful! Let's dissect this function a bit, first the overall
		591	function and execution flow:
515		592
516	sub finger($$) {	593	sub finger($$) {
517	my ($user, $host) = @_;	594	my ($user, $host) = @_;
518		595
519	# use a condvar to return results	596	# use a condvar to return results
…		…
525	};	602	};
526		603
527	$cv	604	$cv
528	}	605	}
529		606
530	This isn't too complicated, just a function with two parameters, which	607	This isn't too complicated, just a function with two parameters, that
531	creates a condition variable, returns it, and while it does that,	608	creates a condition variable, returns it, and while it does that,
532	initiates a TCP connect to C<$host>. The condition variable	609	initiates a TCP connect to C<$host>. The condition variable will be used
533	will be used by the caller to receive the finger response.	610	by the caller to receive the finger response, but one could equally well
		611	pass a third argument, a callback, to the function.
534		612
535	Since we are event-based programmers, we do not wait for the connect to	613	Since we are programming event'ish, we do not wait for the connect to
536	finish - it could block your program for a minute or longer! Instead,	614	finish - it could block the program for a minute or longer!
		615
537	we pass the callback it should invoke when the connect is done to	616	Instead, we pass the callback it should invoke when the connect is done to
538	C<tcp_connect>. If it is successful, our callback gets called with the	617	C<tcp_connect>. If it is successful, that callback gets called with the
539	socket handle as first argument, otherwise, nothing will be passed to our	618	socket handle as first argument, otherwise, nothing will be passed to our
540	callback.	619	callback. The important point is that it will always be called as soon as
		620	the outcome of the TCP connect is known.
541		621
		622	This style of programming is also called "continuation style": the
		623	"continuation" is simply the way the program continues - normally at the
		624	next line after some statement (the exception is loops or things like
		625	C<return>). When we are interested in events, however, we instead specify
		626	the "continuation" of our program by passing a closure, which makes that
		627	closure the "continuation" of the program.
		628
		629	The C<tcp_connect> call is like saying "return now, and when the
		630	connection is established or it failed, continue there".
		631
542	Let's look at our callback in more detail:	632	Now let's look at the callback/closure in more detail:
543		633
544	# the callback gets the socket handle - or nothing	634	# the callback receives the socket handle - or nothing
545	my ($fh) = @_	635	my ($fh) = @_
546	or return $cv->send;	636	or return $cv->send;
547		637
548	The first thing the callback does is indeed save the socket handle in	638	The first thing the callback does is indeed save the socket handle in
549	C<$fh>. When there was an error (no arguments), then our instinct as	639	C<$fh>. When there was an error (no arguments), then our instinct as
550	expert Perl programmers would tell us to die:	640	expert Perl programmers would tell us to C<die>:
551		641
552	my ($fh) = @_	642	my ($fh) = @_
553	or die "$host: $!";	643	or die "$host: $!";
554		644
555	While this would give good feedback to the user, our program would	645	While this would give good feedback to the user (if he happens to watch
556	probably freeze here, as we never report the results to anybody, certainly	646	standard error), our program would probably stop working here, as we never
557	not the caller of our C<finger> function!	647	report the results to anybody, certainly not the caller of our C<finger>
		648	function, and most event loops continue even after a C<die>!
558		649
559	This is why we instead return, but also call C<< $cv->send >> without any	650	This is why we instead C<return>, but also call C<< $cv->send >> without
560	arguments to signal to our consumer that something bad has happened. The	651	any arguments to signal to the condvar consumer that something bad has
561	return value of C<< $cv->send >> is irrelevant, as is the return value of	652	happened. The return value of C<< $cv->send >> is irrelevant, as is
562	our callback. The return statement is simply used for the side effect of,	653	the return value of our callback. The C<return> statement is simply
563	well, returning immediately from the callback.	654	used for the side effect of, well, returning immediately from the
		655	callback. Checking for errors and handling them this way is very common,
		656	which is why this compact idiom is so handy.
564		657
565	As the next step in the finger protocol, we send the username to the	658	As the next step in the finger protocol, we send the username to the
566	finger daemon on the other side of our connection:	659	finger daemon on the other side of our connection (the kernel.org finger
		660	service doesn't actually wait for a username, but the net is running out
		661	of finger servers fast):
567		662
568	syswrite $fh, "$user\015\012";	663	syswrite $fh, "$user\015\012";
569		664
570	Note that this isn't 100% clean - the socket could, for whatever reasons,	665	Note that this isn't 100% clean socket programming - the socket could,
571	not accept our data. When writing a small amount of data like in this	666	for whatever reasons, not accept our data. When writing a small amount
572	example it doesn't matter, but for real-world cases you might need to	667	of data like in this example it doesn't matter, as a socket buffer is
573	implement some kind of write buffering - or use L<AnyEvent::Handle>, which	668	almost always big enough for a mere "username", but for real-world
574	handles these matters for you.	669	cases you might need to implement some kind of write buffering - or use
		670	L<AnyEvent::Handle>, which handles these matters for you, as shown in the
		671	next section.
575		672
576	What we do have to do is to implement our own read buffer - the response	673	What we I<do> have to do is to implement our own read buffer - the response
577	data could arrive late or in multiple chunks, and we cannot just wait for	674	data could arrive late or in multiple chunks, and we cannot just wait for
578	it (event-based programming, you know?).	675	it (event-based programming, you know?).
579		676
580	To do that, we register a read watcher on the socket which waits for data:	677	To do that, we register a read watcher on the socket which waits for data:
581		678
…		…
587	variable, but in a local one - if the callback returns, it would normally	684	variable, but in a local one - if the callback returns, it would normally
588	destroy the variable and its contents, which would in turn unregister our	685	destroy the variable and its contents, which would in turn unregister our
589	watcher.	686	watcher.
590		687
591	To avoid that, we C<undef>ine the variable in the watcher callback. This	688	To avoid that, we C<undef>ine the variable in the watcher callback. This
592	means that, when the C<tcp_connect> callback returns, that perl thinks	689	means that, when the C<tcp_connect> callback returns, perl thinks (quite
593	(quite correctly) that the read watcher is still in use - namely in the	690	correctly) that the read watcher is still in use - namely in the callback,
594	callback.	691	and thus keeps it alive even if nothing else in the program refers to it
		692	anymore (it is much like Baron Münchhausen keeping himself from dying by
		693	pulling himself out of a swamp).
		694
		695	The trick, however, is that instead of:
		696
		697	my $read_watcher = AnyEvent->io (...
		698
		699	The program does:
		700
		701	my $read_watcher; $read_watcher = AnyEvent->io (...
		702
		703	The reason for this is a quirk in the way Perl works: variable names
		704	declared with C<my> are only visible in the I<next> statement. If the
		705	whole C<< AnyEvent->io >> call, including the callback, would be done in
		706	a single statement, the callback could not refer to the C<$read_watcher>
		707	variable to undefine it, so it is done in two statements.
		708
		709	Whether you'd want to format it like this is of course a matter of style,
		710	this way emphasizes that the declaration and assignment really are one
		711	logical statement.
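
The declare-then-assign pattern can be tried without any event loop. The following is a minimal sketch in plain Perl, assuming a hash reference stands in for the real watcher object (the names C<$watcher>, C<$cb> and C<$result> are made up for illustration):

```perl
use strict;
use warnings;

# Two statements: declare first, then assign, so that the closure
# below can refer to $watcher by name.
my $watcher;
$watcher = {
    cb => sub {
        # the closure captured $watcher, so it can release it when done
        undef $watcher;
        return "released";
    },
};

# keep a reference to the callback, much as an event loop would
my $cb = $watcher->{cb};

my $result = $cb->();

print defined $watcher ? "watcher still alive\n" : "watcher gone\n";
```

Writing C<my $watcher = { cb =E<gt> sub { undef $watcher } }> in one statement would not compile under C<use strict>, because the name is not yet visible inside the closure.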
595		712
596	The callback itself calls C<sysread> for as many times as necessary, until	713	The callback itself calls C<sysread> for as many times as necessary, until
597	C<sysread> returns an error or end-of-file:	714	C<sysread> returns either an error or end-of-file:
598		715
599	cb => sub {	716	cb => sub {
600	my $len = sysread $fh, $response, 1024, length $response;	717	my $len = sysread $fh, $response, 1024, length $response;
601		718
602	if ($len <= 0) {	719	if ($len <= 0) {
603		720
604	Note that C<sysread> has the ability to append data it reads to a scalar,	721	Note that C<sysread> has the ability to append data it reads to a scalar,
605	which is what we make good use of in this example.	722	by specifying an offset, a feature we make good use of in this
		723	example.
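
The appending behaviour of C<sysread> can be seen in isolation with a pipe instead of a socket. This is only a sketch under that assumption: the pipe, the tiny chunk size of 4 and the sample text are invented for the demonstration, and the loop blocks, which a real event-based program would not do:

```perl
use strict;
use warnings;

# a pipe stands in for the socket in this sketch
pipe(my $reader, my $writer) or die "pipe: $!";

syswrite $writer, "hello, ";
syswrite $writer, "world";
close $writer;    # the reader will see end-of-file after the data

my $response = "";

while (1) {
    # the 4th argument is the offset: using the current length
    # appends the new data to whatever is already in $response
    my $len = sysread $reader, $response, 4, length $response;

    die "read error: $!" unless defined $len;
    last if $len == 0;    # end-of-file
}

print "$response\n";
```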
606		724
607	When C<sysread> indicates we are done, the callback C<undef>ines	725	When C<sysread> indicates we are done, the callback C<undef>ines
608	the watcher and then C<send>'s the response data to the condition	726	the watcher and then C<send>'s the response data to the condition
609	variable. All this has the following effects:	727	variable. All this has the following effects:
610		728
…		…
623	But the main advantage is that we can not only run this finger function in	741	But the main advantage is that we can not only run this finger function in
624	the background, we even can run multiple sessions in parallel, like this:	742	the background, we even can run multiple sessions in parallel, like this:
625		743
626	my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets	744	my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets
627	my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736	745	my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736
628	my $f3 = finger "johnc", "idsoftware.com"; # finger john	746	my $f3 = finger "hpa" , "kernel.org"; # finger hpa
629		747
630	print "trouble tickets:\n", $f1->recv, "\n";	748	print "trouble tickets:\n" , $f1->recv, "\n";
631	print "trouble ticket #1736:\n", $f2->recv, "\n";	749	print "trouble ticket #1736:\n", $f2->recv, "\n";
632	print "john carmacks finger file: ", $f3->recv, "\n";	750	print "kernel release info: " , $f3->recv, "\n";
633		751
634	It doesn't look like it, but in fact all three requests run in	752	It doesn't look like it, but in fact all three requests run in
635	parallel. The code waits for the first finger request to finish first, but	753	parallel. The code waits for the first finger request to finish first, but
636	that doesn't keep it from executing in parallel, because when the first	754	that doesn't keep it from executing them in parallel: when the first C<recv>
637	C<recv> call sees that the data isn't ready yet, it serves events for all	755	call sees that the data isn't ready yet, it serves events for all three
638	three requests automatically.	756	requests automatically, until the first request has finished.
		757
		758	The second C<recv> call might either find the data is already there, or it
		759	will continue handling events until that is the case, and so on.
639		760
640	By taking advantage of network latencies, which allows us to serve other	761	By taking advantage of network latencies, which allows us to serve other
641	requests and events while we wait for an event on one socket, the overall	762	requests and events while we wait for an event on one socket, the overall
642	time to do these three requests will be greatly reduces, typically all	763	time to do these three requests will be greatly reduced, typically all
643	three are done in the same time as the slowest of them would use.	764	three are done in the same time as the slowest of them would need to finish.
644		765
645	By the way, you do not actually have to wait in the C<recv> method on an	766	By the way, you do not actually have to wait in the C<recv> method on an
646	AnyEvent condition variable, you can also register a callback:	767	AnyEvent condition variable - after all, waiting is evil - you can also
		768	register a callback:
647		769
648	$cv->cb (sub {	770	$cv->cb (sub {
649	my $response = shift->recv;	771	my $response = shift->recv;
650	# ...	772	# ...
651	});	773	});
…		…
656	response:	778	response:
657		779
658	sub finger($$$) {	780	sub finger($$$) {
659	my ($user, $host, $cb) = @_;	781	my ($user, $host, $cb) = @_;
660		782
661	What you use is a matter of taste - if you expect your function to be	783	How you implement it is a matter of taste - if you expect your function to
662	used mainly in an event-based program you would normally prefer to pass a	784	be used mainly in an event-based program you would normally prefer to pass
663	callback directly.	785	a callback directly. If you write a module and expect your users to use
		786	it "synchronously" often (for example, a simple http-get script would not
		787	really care much for events), then you would use a condition variable and
		788	tell them "simply C<< ->recv >> the data".
664		789
665	=head3 Criticism and fix	790	=head3 Problems with the implementation and how to fix them
666		791
667	To make this example more real-world-ready, we would not only implement	792	To make this example more real-world-ready, we would not only implement
668	some write buffering (for the paranoid), but we would also have to handle	793	some write buffering (for the paranoid, or maybe denial-of-service aware
669	timeouts and maybe protocol errors.	794	security expert), but we would also have to handle timeouts and maybe
		795	protocol errors.
670		796
671	This quickly gets unwieldy, which is why we introduce L<AnyEvent::Handle>	797	Doing this quickly gets unwieldy, which is why we introduce
672	in the next section, which takes care of all these details for us.	798	L<AnyEvent::Handle> in the next section, which takes care of all these
		799	details for you and lets you concentrate on the actual protocol.
673		800
674		801
675	=head2 First experiments with AnyEvent::Handle	802	=head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle
676		803
677	Now let's start with something simple: a program that reads from standard	804	The L<AnyEvent::Handle> module has been hyped quite a bit in this document
678	input in a non-blocking way, that is, in a way that lets your program do	805	so far, so let's see what it really offers.
679	other things while it is waiting for input.
680		806
681	First, the full program listing:	807	As finger is such a simple protocol, let's try something slightly more
		808	complicated: HTTP/1.0.
682		809
683	#!/usr/bin/perl	810	An HTTP GET request works by sending a single request line that indicates
		811	what you want the server to do and the URI you want it to act on, followed
		812	by as many "header" lines (C<Header: data>, same as e-mail headers) as
		813	required for the request, ended by an empty line.
684		814
685	use AnyEvent;	815	The response is formatted very similarly, first a line with the response
686	use AnyEvent::Handle;	816	status, then again as many header lines as required, then an empty line,
		817	followed by any data that the server might send.
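
The three-part response layout described here can be illustrated in plain Perl, without any networking. The sketch below assumes the whole response is already in one string (C<$raw>, a made-up sample modelled on a typical C<404> reply) and simply splits it at the first empty line:

```perl
use strict;
use warnings;

# a made-up sample response, lines ended by CR LF
my $raw = "HTTP/1.0 404 Not Found\015\012"
        . "Date: Mon, 02 Jun 2008 07:05:54 GMT\015\012"
        . "Content-Type: text/html; charset=UTF-8\015\012"
        . "\015\012"
        . "<html><head>";

# the first empty line separates status line and headers from the body
my ($head, $body) = split /\015\012\015\012/, $raw, 2;

# the first line of the head is the status line, the rest are headers
my ($status, @headers) = split /\015\012/, $head;

print "status:  $status\n";
print "headers: ", scalar @headers, "\n";
print "body:    $body\n";
```

A real client cannot rely on the whole response arriving at once, which is the problem the read queue of AnyEvent::Handle addresses.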
687		818
688	my $end_prog = AnyEvent->condvar;	819	Again, let's try it out with C<telnet> (I condensed the output a bit - if
		820	you want to see the full response, do it yourself).
689		821
690	my $handle =	822	# telnet www.google.com 80
691	AnyEvent::Handle->new (	823	Trying 209.85.135.99...
692	fh => \*STDIN,	824	Connected to www.google.com (209.85.135.99).
		825	Escape character is '^]'.
		826	GET /test HTTP/1.0
		827
		828	HTTP/1.0 404 Not Found
		829	Date: Mon, 02 Jun 2008 07:05:54 GMT
		830	Content-Type: text/html; charset=UTF-8
		831
		832	<html><head>
		833	[...]
		834	Connection closed by foreign host.
		835
		836	The C<GET ...> and the empty line were entered manually, the rest of the
		837	telnet output is google's response, in this case a C<404 Not Found> one.
		838
		839	So, here is how you would do it with C<AnyEvent::Handle>:
		840
		841	sub http_get {
		842	my ($host, $uri, $cb) = @_;
		843
		844	# store results here
		845	my ($response, $header, $body);
		846
		847	my $handle; $handle = new AnyEvent::Handle
		848	connect => [$host => 'http'],
693	on_eof => sub {	849	on_error => sub {
694	print "received EOF, exiting...\n";	850	$cb->("HTTP/1.0 500 $!");
695	$end_prog->broadcast;	851	$handle->destroy; # explicitly destroy handle
696	},	852	},
		853	on_eof => sub {
		854	$cb->($response, $header, $body);
		855	$handle->destroy; # explicitly destroy handle
		856	};
		857
		858	$handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
		859
		860	# now fetch response status line
		861	$handle->push_read (line => sub {
		862	my ($handle, $line) = @_;
		863	$response = $line;
		864	});
		865
		866	# then the headers
		867	$handle->push_read (line => "\015\012\015\012", sub {
		868	my ($handle, $line) = @_;
		869	$header = $line;
		870	});
		871
		872	# and finally handle any remaining data as body
		873	$handle->on_read (sub {
		874	$body .= $_[0]->rbuf;
		875	$_[0]->rbuf = "";
		876	});
		877	}
		878
		879	And now let's go through it step by step. First, as usual, the overall
		880	C<http_get> function structure:
		881
		882	sub http_get {
		883	my ($host, $uri, $cb) = @_;
		884
		885	# store results here
		886	my ($response, $header, $body);
		887
		888	my $handle; $handle = new AnyEvent::Handle
		889	... create handle object
		890
		891	... push data to write
		892
		893	... push what to expect to read queue
		894	}
		895
		896	Unlike in the finger example, this time the caller has to pass a callback
		897	to C<http_get>. Also, instead of passing a URL as one would expect, the
		898	caller has to provide the hostname and URI - normally you would use the
		899	C<URI> module to parse a URL and separate it into those parts, but that is
		900	left to the inspired reader :)
		901
		902	Since everything else is left to the caller, all C<http_get> does is to
		903	initiate the connection by creating the AnyEvent::Handle object (which
		904	calls C<tcp_connect> for us) and leave everything else to its callback.
		905
		906	The handle object is created, unsurprisingly, by calling the C<new>
		907	method of L<AnyEvent::Handle>:
		908
		909	my $handle; $handle = new AnyEvent::Handle
		910	connect => [$host => 'http'],
697	on_error => sub {	911	on_error => sub {
698	print "error while reading from STDIN: $!\n";	912	$cb->("HTTP/1.0 500 $!");
699	$end_prog->broadcast;	913	$handle->destroy; # explicitly destroy handle
700	}	914	},
		915	on_eof => sub {
		916	$cb->($response, $header, $body);
		917	$handle->destroy; # explicitly destroy handle
		918	};
		919
		920	The C<connect> argument tells AnyEvent::Handle to call C<tcp_connect> for
		921	the specified host and service/port.
		922
		923	The C<on_error> callback will be called on any unexpected error, such as a
		924	refused connection, or an unexpected end of the connection while reading the header.
		925
		926	Instead of having an extra mechanism to signal errors, connection errors
		927	are signalled by crafting a special "response status line", like this:
		928
		929	HTTP/1.0 500 Connection refused
		930
		931	This means the caller cannot distinguish (easily) between
		932	locally-generated errors and server errors, but it simplifies error
		933	handling for the caller a lot.
		934
		935	The error callback also destroys the handle explicitly, because we are not
		936	interested in continuing after any errors. In AnyEvent::Handle callbacks
		937	you have to call C<destroy> explicitly to destroy a handle. Outside of
		938	those callbacks you can just forget the object reference and it will be
		939	automatically cleaned up.
		940
		941	Last but not least, we set an C<on_eof> callback that is called when the
		942	other side indicates it has stopped writing data, which we will use to
		943	gracefully shut down the handle and report the results. This callback is
		944	only called when the read queue is empty - if the read queue expects some
		945	data and the handle gets an EOF from the other side this will be an error
		946	- after all, you did expect more to come.
		947
		948	If you wanted to write a server using AnyEvent::Handle, you would use
		949	C<tcp_accept> and then create the AnyEvent::Handle with the C<fh>
		950	argument.
		951
		952	=head3 The write queue
		953
		954	The next line sends the actual request:
		955
		956	$handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
		957
		958	No headers will be sent (this is fine for simple requests), so the whole
		959	request is just a single line followed by an empty line to signal the end
		960	of the headers to the server.
		961
		962	The more interesting question is why the method is called C<push_write>
		963	and not just write. The reason is that you can I<always> add some write
		964	data without blocking, and to do this, AnyEvent::Handle needs some write
		965	queue internally - and C<push_write> simply pushes some data onto the end
		966	of that queue, just like Perl's C<push> pushes data onto the end of an
		967	array.
		968
		969	The deeper reason is that at some point in the future, there might
		970	be C<unshift_write> as well, and in any case, we will shortly meet
		971	C<push_read> and C<unshift_read>, and it's usually easiest to remember if
		972	all those functions have some symmetry in their name. So C<push> is used
		973	as the opposite of C<unshift> in AnyEvent::Handle, not as the opposite of
		974	C<pull> - just like in Perl.
		975
		976	Note that we call C<push_write> right after creating the AnyEvent::Handle
		977	object, before it has had time to actually connect to the server. This is
		978	fine, pushing the read and write requests will simply queue them in the
		979	handle object until the connection has been established. Alternatively, we
		980	could do this "on demand" in the C<on_connect> callback.
		981
		982	If C<push_write> is called with more than one argument, then you can even
		983	do I<formatted> I/O, which simply means your data will be transformed in
		984	some ways. For example, this would JSON-encode your data before pushing it
		985	to the write queue:
		986
		987	$handle->push_write (json => [1, 2, 3]);
		988
		989	Apart from that, this pretty much summarises the write queue, there is
		990	little else to it.
		991
		992	Reading the response is far more interesting, because it involves the more
		993	powerful and complex I<read queue>:
		994
		995	=head3 The read queue
		996
		997	The response consists of three parts: a single line with the response
		998	status, a single paragraph of headers ended by an empty line, and the
		999	response body, which is simply the remaining data on that connection.
		1000
		1001	For the first two, we push two read requests onto the read queue:
		1002
		1003	# now fetch response status line
		1004	$handle->push_read (line => sub {
		1005	my ($handle, $line) = @_;
		1006	$response = $line;
		1007	});
		1008
		1009	# then the headers
		1010	$handle->push_read (line => "\015\012\015\012", sub {
		1011	my ($handle, $line) = @_;
		1012	$header = $line;
		1013	});
		1014
		1015	While one can simply push a single callback to parse the data in the
		1016	queue, I<formatted> I/O really comes to our advantage here, as there
		1017	is a ready-made "read line" read type. The first read expects a single
		1018	line, ended by C<\015\012> (the standard end-of-line marker in internet
		1019	protocols).
		1020
		1021	The second "line" is actually a single paragraph - instead of reading it
		1022	line by line we tell C<push_read> that the end-of-line marker is really
		1023	C<\015\012\015\012>, which is an empty line. The result is that the whole
		1024	header paragraph will be treated as a single line and read. The word
		1025	"line" is interpreted very freely, much like Perl itself does it.
		1026
		1027	Note that push read requests are pushed immediately after creating the
		1028	handle object - since AnyEvent::Handle provides a queue we can push as
		1029	many requests as we want, and AnyEvent::Handle will handle them in order.
		1030
		1031	There is, however, no read type for "the remaining data". For that, we
		1032	install our own C<on_read> callback:
		1033
		1034	# and finally handle any remaining data as body
		1035	$handle->on_read (sub {
		1036	$body .= $_[0]->rbuf;
		1037	$_[0]->rbuf = "";
		1038	});
		1039
		1040	This callback is invoked every time data arrives and the read queue is
		1041	empty - which in this example will only be the case when both response and
		1042	header have been read. The C<on_read> callback could actually have been
		1043	specified when constructing the object, but doing it this way preserves
		1044	logical ordering.
		1045
		1046	The read callback simply adds the current read buffer to its C<$body>
		1047	variable and, most importantly, I<empties> the buffer by assigning the
		1048	empty string to it.
		1049
		1050	After AnyEvent::Handle has been so instructed, it will handle incoming
		1051	data according to these instructions - if all goes well, the callback will
		1052	be invoked with the response data, if not, it will get an error.
		1053
		1054	In general, you can implement pipelining (a semi-advanced feature of many
		1055	protocols) very easily with AnyEvent::Handle: If you have a protocol with a
		1056	request/response structure, your request methods/functions will all look
		1057	like this (simplified):
		1058
		1059	sub request {
		1060
		1061	# send the request to the server
		1062	$handle->push_write (...);
		1063
		1064	# push some response handlers
		1065	$handle->push_read (...);
		1066	}
		1067
		1068	This means you can queue as many requests as you want, and while
		1069	AnyEvent::Handle goes through its read queue to handle the response data,
		1070	the other side can work on the next request - queueing the request just
		1071	appends some data to the write queue and installs a handler to be called
		1072	later.
		1073
		1074	You might ask yourself how to handle decisions you can only make I<after>
		1075	you have received some data (such as handling a short error response or a
		1076	long and differently-formatted response). The answer to this problem is
		1077	C<unshift_read>, which we will introduce together with an example in the
		1078	coming sections.
		1079
		1080	=head3 Using C<http_get>
		1081
		1082	Finally, here is how you would use C<http_get>:
		1083
		1084	http_get "www.google.com", "/", sub {
		1085	my ($response, $header, $body) = @_;
		1086
		1087	print
		1088	$response, "\n",
		1089	$body;
		1090	};
		1091
		1092	And of course, you can run as many of these requests in parallel as you
		1093	want (and your memory supports).
		1094
		1095	=head3 HTTPS
		1096
		1097	Now, as promised, let's implement the same thing for HTTPS, or more
		1098	correctly, let's change our C<http_get> function into a function that
		1099	speaks HTTPS instead.
		1100
		1101	HTTPS is, quite simply, a standard TLS connection (B<T>ransport B<L>ayer
		1102	B<S>ecurity is the official name for what most people refer to as C<SSL>)
		1103	that contains standard HTTP protocol exchanges. The only other difference
		1104	to HTTP is that by default it uses port C<443> instead of port C<80>.
		1105
		1106	To implement these two differences we need two tiny changes, first, in the
		1107	C<connect> parameter, we replace C<http> by C<https> to connect to the
		1108	https port:
		1109
		1110	connect => [$host => 'https'],
		1111
		1112	The other change deals with TLS, which is something L<AnyEvent::Handle>
		1113	does for us, as long as I<you> made sure that the L<Net::SSLeay> module
		1114	is around. To enable TLS with L<AnyEvent::Handle>, we simply pass an
		1115	additional C<tls> parameter to the call to C<AnyEvent::Handle::new>:
		1116
		1117	tls => "connect",
		1118
		1119	Specifying C<tls> enables TLS, and the argument specifies whether
		1120	AnyEvent::Handle is the server side ("accept") or the client side
		1121	("connect") for the TLS connection, as unlike TCP, there is a clear
		1122	server/client relationship in TLS.
		1123
		1124	That's all.
		1125
		1126	Of course, all this should be handled transparently by C<http_get>
		1127	after parsing the URL. If you need this, see the part about exercising
		1128	your inspiration earlier in this document. You could also use the
		1129	L<AnyEvent::HTTP> module from CPAN, which implements all this and works
		1130	around a lot of quirks for you, too.
		1131
		1132	=head3 The read queue - revisited
		1133
		1134	HTTP always uses the same structure in its responses, but many protocols
		1135	require parsing responses differently depending on the response itself.
		1136
		1137	For example, in SMTP, you normally get a single response line:
		1138
		1139	220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
		1140
		1141	But SMTP also supports multi-line responses:
		1142
		1143	220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
		1144	220-hey guys
		1145	220 my response is longer than yours
		1146
		1147	To handle this, we need C<unshift_read>. As the name (hopefully) implies,
		1148	C<unshift_read> will not append your read request to the end of the read
		1149	queue, but instead it will prepend it to the queue.
		1150
		1151	This is useful in the situation above: Just push your response-line read
		1152	request when sending the SMTP command, and when handling it, you look at
		1153	the line to see if more is to come, and C<unshift_read> another reader
		1154	callback if required, like this:
		1155
		1156	my $response; # response lines end up in here
		1157
		1158	my $read_response; $read_response = sub {
		1159	my ($handle, $line) = @_;
		1160
		1161	$response .= "$line\n";
		1162
		1163	# check for continuation lines ("-" as 4th character)
		1164	if ($line =~ /^...-/) {
		1165	# if yes, then unshift another line read
		1166	$handle->unshift_read (line => $read_response);
		1167
		1168	} else {
		1169	# otherwise we are done
		1170
		1171	# free callback
		1172	undef $read_response;
		1173
		1174	print "we are done reading: $response\n";
701	);	1175	}
		1176	};
		1177
		1178	$handle->push_read (line => $read_response);
		1179
		1180	This recipe can be used for all similar parsing problems, for example in
		1181	NNTP, the response code to some commands indicates that more data will be
		1182	sent:
		1183
		1184	$handle->push_write ("article 42\015\012");
		1185
		1186	# read response line
		1187	$handle->push_read (line => sub {
		1188	my ($handle, $status) = @_;
		1189
		1190	# article data following?
		1191	if ($status =~ /^2/) {
		1192	# yes, read article body
		1193
		1194	$handle->unshift_read (line => "\012.\015\012", sub {
		1195	my ($handle, $body) = @_;
		1196
		1197	$finish->($status, $body);
		1198	});
		1199
		1200	} else {
		1201	# some error occurred, no article data
		1202
		1203	$finish->($status);
		1204	}
		1205	});
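
The continuation-line test from the SMTP example can be checked offline. The following sketch assumes the lines have already been split off the connection and sit in an array (C<@lines>, with a made-up fourth line belonging to a later response), so an ordinary loop takes the place of C<unshift_read>:

```perl
use strict;
use warnings;

# sample lines from the multi-line SMTP response, plus one extra
# line that belongs to a later response and must not be consumed
my @lines = (
    '220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>',
    '220-hey guys',
    '220 my response is longer than yours',
    '250 this belongs to the next response',
);

my $response = "";
my $consumed = 0;
my $complete = 0;

for my $line (@lines) {
    $consumed++;
    $response .= "$line\n";

    # a "-" as 4th character means more lines follow
    next if $line =~ /^...-/;

    # otherwise this was the last line of the response
    $complete = 1;
    last;
}

print "complete after $consumed lines\n" if $complete;
```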
		1206
		1207	=head3 Your own read queue handler
		1208
		1209	Sometimes, your protocol doesn't play nice and uses lines or chunks of
		1210	data not formatted in a way handled by AnyEvent::Handle out of the box. In
		1211	this case you have to implement your own read parser.
		1212
		1213	To make up a contorted example, imagine you are looking for an even
		1214	number of characters followed by a colon (":"). Also imagine that
		1215	AnyEvent::Handle had no C<regex> read type which could be used, so you'd
		1216	have to do it manually.
		1217
		1218	To implement a read handler for this, you would C<push_read> (or
		1219	C<unshift_read>) just a single code reference.
		1220
		1221	This code reference will then be called each time there is (new) data
		1222	available in the read buffer, and is expected to either successfully
		1223	eat/consume some of that data (and return true) or to return false to
		1224	indicate that it wants to be called again.
		1225
		1226	If the code reference returns true, then it will be removed from the
		1227	read queue (because it has parsed/consumed whatever it was supposed to
		1228	consume), otherwise it stays in the front of it.
		1229
		1230	The example above could be coded like this:
702		1231
703	$handle->push_read (sub {	1232	$handle->push_read (sub {
704	my ($handle) = @_;	1233	my ($handle) = @_;
705		1234
706	if ($handle->rbuf =~ s/^.?\bend\b.$//s) {	1235	# check for even number of characters + ":"
707	print "got 'end', exiting...\n";	1236	# and remove the data if a match is found.
708	$end_prog->broadcast;	1237	# if not, return false (actually nothing)
		1238
		1239	$handle->{rbuf} =~ s/^( (?:..)* ) ://x
709	return 1	1240	or return;
		1241
		1242	# we got some data in $1, pass it to whoever wants it
		1243	$finish->($1);
		1244
		1245	# and return true to indicate we are done
710	}	1246	1
711
712	0
713	});	1247	});
714		1248
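
The return-value contract of such a callback can also be exercised against a plain string. This is a rough sketch, not AnyEvent::Handle itself: C<$rbuf> stands in for C<< $handle->{rbuf} >>, C<@queue> is a hypothetical one-element read queue, and the chunk list simulates data arriving in pieces.

```perl
use strict;
use warnings;

my $rbuf = "";   # stand-in for $handle->{rbuf} (assumption)
my @results;     # parsed data ends up in here

# Same contract as a read callback: return true when the data has
# been consumed, return false to be called again on new data.
my $parser = sub {
   my ($buf) = @_;   # reference to the buffer

   # even number of characters + ":", removed on a match
   $$buf =~ s/^( (?:..)* ) ://x
      or return;

   push @results, $1;

   1
};

my @queue = ($parser);   # hypothetical read queue

for my $chunk ("ab", "cd", ":rest") {
   $rbuf .= $chunk;

   # call the callback at the front of the queue,
   # and drop it once it reports success
   shift @queue if @queue && $queue[0]->(\$rbuf);
}

print "parsed: @results, left in buffer: $rbuf\n";
```

The callback is called three times here and only succeeds on the last call, once the colon has arrived. Whatever follows the colon stays in the buffer for the next queue entry.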
715	$end_prog->recv;	1249	This concludes our little tutorial.
716		1250
717	That's a mouthful, so let's go through it step by step:	1251	=head1 Where to go from here?
718		1252
719	#!/usr/bin/perl	1253	This introduction should have explained the key concepts of L<AnyEvent>
		1254	- event watchers and condition variables, L<AnyEvent::Socket> - basic
		1255	networking utilities, and L<AnyEvent::Handle> - a nice wrapper around
		1256	handles.
720		1257
721	use AnyEvent;	1258	You could either start coding stuff right away, look at those manual
722	use AnyEvent::Handle;	1259	pages for the gory details, or roam CPAN for other AnyEvent modules (such
		1260	as L<AnyEvent::IRC> or L<AnyEvent::HTTP>) to see more code examples (or
		1261	simply to use them).
723		1262
724	Nothing unexpected here, just load AnyEvent for the event functionality	1263	If you need a protocol that doesn't have an implementation using AnyEvent,
725	and AnyEvent::Handle for your file handling needs.	1264	remember that you can mix AnyEvent with one other event framework, such as
		1265	L<POE>, so you can always use AnyEvent for your own tasks plus modules of
		1266	one other event framework to fill any gaps.
726		1267
727	my $end_prog = AnyEvent->condvar;	1268	Last but not least, you could also look at L<Coro>, especially
		1269	L<Coro::AnyEvent>, to see how you can turn event-based programming from
		1270	callback style back to the usual imperative style (also called "inversion
		1271	of control" - AnyEvent calls I<you>, but Coro lets I<you> call AnyEvent).
728		1272
729	Here the program creates a so-called 'condition variable': Condition	1273	=head1 Authors
730	variables are a great way to signal the completion of some event, or to
731	state that some condition became true (thus the name).
732
733	This condition variable represents the condition that the program wants to
734	terminate. Later in the program, we will 'recv' that condition (call the
735	C<recv> method on it), which will wait until the condition gets signalled
736	(which is done by calling the C<send> method on it).
737
738	The next step is to create the handle object:
739
740	my $handle =
741	AnyEvent::Handle->new (
742	fh => \*STDIN,
743	on_eof => sub {
744	print "received EOF, exiting...\n";
745	$end_prog->broadcast;
746	},
747
748	This handle object will read from standard input. Setting the C<on_eof>
749	callback should be done for every file handle, as that is a condition that
750	we always need to check for when working with file handles, to prevent
751	reading or writing to a closed file handle, or getting stuck indefinitely
752	in case of an error.
753
754	Speaking of errors:
755
756	on_error => sub {
757	print "error while reading from STDIN: $!\n";
758	$end_prog->broadcast;
759	}
760	);
761
762	The C<on_error> callback is also not required, but we set it here in case
763	any error happens when we read from the file handle. It is usually a good
764	idea to set this callback and at least print some diagnostic message: Even
765	in our small example an error can happen. More on this later...
766
767	$handle->push_read (sub {
768
769	Next we push a general read callback on the read queue, which
770	will wait until we have received all the data we wanted to
771	receive. L<AnyEvent::Handle> has two queues per file handle, a read and a
772	write queue. The write queue queues pending data that waits to be written
773	to the file handle. And the read queue queues reading callbacks. For more
774	details see the documentation L<AnyEvent::Handle> about the READ QUEUE and
775	WRITE QUEUE.
776
777	my ($handle) = @_;
778
779	if ($handle->rbuf =~ s/^.?\bend\b.$//s) {
780	print "got 'end', exiting...\n";
781	$end_prog->broadcast;
782	return 1
783	}
784
785	0
786	});
787
788	The actual callback waits until the word 'end' has been seen in the data
789	received on standard input. Once we encounter the stop word 'end' we
790	remove everything from the read buffer and call the condition variable
791	we setup earlier, that signals our 'end of program' condition. And the
792	callback returns with a true value, that signals we are done with reading
793	all the data we were interested in (all data until the word 'end' has been
794	seen).
795
796	In all other cases, when the stop word has not been seen yet, we just
797	return a false value, to indicate that we are not finished yet.
798
799	The C<rbuf> method returns our read buffer, that we can directly modify as
800	lvalue. Alternatively we also could have written:
801
802	if ($handle->{rbuf} =~ s/^.?\bend\b.$//s) {
803
804	The last line will wait for the condition that our program wants to exit:
805
806	$end_prog->recv;
807
808	The call to C<recv> will setup an event loop for us and wait for IO, timer
809	or signal events and will handle them until the condition gets sent (by
810	calling its C<send> method).
811
812	The key points to learn from this example are:
813
814	=over 4
815
816	=item * Condition variables are used to start an event loop.
817
818	=item * How to register some basic callbacks on AnyEvent::Handle objects.
819
820	=item * How to process data in the read buffer.
821
822	=back
823
824	=head1 AUTHORS
825		1274
826	Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>.	1275	Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>.
827		1276
