
Comparing AnyEvent/lib/AnyEvent/Intro.pod (file contents):
Revision 1.10 by root, Mon Jun 2 06:04:08 2008 UTC vs.
Revision 1.27 by root, Fri Jun 18 14:28:50 2010 UTC

1=head1 NAME
2
3AnyEvent::Intro - an introductory tutorial to AnyEvent
4
1=head1 Introduction to AnyEvent 5=head1 Introduction to AnyEvent
2 6
3This is a tutorial that will introduce you to the features of AnyEvent. 7This is a tutorial that will introduce you to the features of AnyEvent.
4 8
5The first part introduces the core AnyEvent module (after swamping you a 9The first part introduces the core AnyEvent module (after swamping you a
6bit in evangelism), which might already provide all you ever need. 10bit in evangelism), which might already provide all you ever need: If you
11are only interested in AnyEvent's event handling capabilities, read no
12further.
7 13
8The second part focuses on network programming using sockets, for which 14The second part focuses on network programming using sockets, for which
9AnyEvent offers a lot of support you can use. 15AnyEvent offers a lot of support you can use, and a lot of workarounds
16for portability quirks.
10 17
11 18
12=head1 What is AnyEvent? 19=head1 What is AnyEvent?
13 20
14If you don't care for the whys and want to see code, skip this section! 21If you don't care for the whys and want to see code, skip this section!
16AnyEvent is first of all just a framework to do event-based 23AnyEvent is first of all just a framework to do event-based
17programming. Typically such frameworks are an all-or-nothing thing: If you 24programming. Typically such frameworks are an all-or-nothing thing: If you
18use one such framework, you can't (easily, or even at all) use another in 25use one such framework, you can't (easily, or even at all) use another in
19the same program. 26the same program.
20 27
21AnyEvent is different - it is a thin abstraction layer above all kinds 28AnyEvent is different - it is a thin abstraction layer on top of other
29event loops, just like DBI is an abstraction of many different database
22of event loops. Its main purpose is to move the choice of the underlying 30APIs. Its main purpose is to move the choice of the underlying framework
23framework (the event loop) from the module author to the program author 31(the event loop) from the module author to the program author using the
24using the module. 32module.
25 33
26That means you can write code that uses events to control what it 34That means you can write code that uses events to control what it
27does, without forcing other code in the same program to use the same 35does, without forcing other code in the same program to use the same
28underlying framework as you do - i.e. you can create a Perl module 36underlying framework as you do - i.e. you can create a Perl module
29that is event-based using AnyEvent, and users of that module can still 37that is event-based using AnyEvent, and users of that module can still
30choose between using L<Gtk2>, L<Tk>, L<Event> or no event loop at 38choose between using L<Gtk2>, L<Tk>, L<Event> (or run inside Irssi or
31all: AnyEvent comes with its own event loop implementation, so your 39rxvt-unicode) or any other supported event loop. AnyEvent even comes with
32code works regardless of other modules that might or might not be 40its own pure-perl event loop implementation, so your code works regardless
33installed. The latter is important, as AnyEvent does not have any 41of other modules that might or might not be installed. The latter is
34dependencies to other modules, which makes it easy to install, for 42important, as AnyEvent does not have any hard dependencies to other
35example, when you lack a C compiler. 43modules, which makes it easy to install, for example, when you lack a C
44compiler. No matter what environment, AnyEvent will just cope with it.
36 45
37A typical problem with Perl modules such as L<Net::IRC> is that they 46A typical limitation of existing Perl modules such as L<Net::IRC> is that
38come with their own event loop: In L<Net::IRC>, the program that uses it 47they come with their own event loop: In L<Net::IRC>, the program that uses
39needs to start the event loop of L<Net::IRC>. That means that one cannot 48it needs to start the event loop of L<Net::IRC>. That means that one
40integrate this module into a L<Gtk2> GUI for instance, as that module, 49cannot integrate this module into a L<Gtk2> GUI for instance, as that
41too, enforces the use of its own event loop (namely L<Glib>). 50module, too, enforces the use of its own event loop (namely L<Glib>).
42 51
43Another example is L<LWP>: it provides no event interface at all. It's a 52Another example is L<LWP>: it provides no event interface at all. It's
44pure blocking HTTP (and FTP etc.) client library, which usually means that 53a pure blocking HTTP (and FTP etc.) client library, which usually means
45you either have to start a thread or have to fork for a HTTP request, or 54that you either have to start another process or have to fork for a HTTP
46use L<Coro::LWP>, if you want to do something else while waiting for the 55request, or use threads (e.g. L<Coro::LWP>), if you want to do something
47request to finish. 56else while waiting for the request to finish.
48 57
49The motivation behind these designs is often that a module doesn't want to 58The motivation behind these designs is often that a module doesn't want
50depend on some complicated XS-module (Net::IRC), or that it doesn't want 59to depend on some complicated XS-module (Net::IRC), or that it doesn't
51to force the user to use some specific event loop at all (LWP). 60want to force the user to use some specific event loop at all (LWP), out
61of fear of severely limiting the usefulness of the module: If your module
62requires Glib, it will not run in a Tk program.
52 63
53L<AnyEvent> solves this dilemma, by B<not> forcing module authors to either 64L<AnyEvent> solves this dilemma, by B<not> forcing module authors to
65either:
54 66
55=over 4 67=over 4
56 68
57=item write their own event loop (because guarantees to offer one 69=item - write their own event loop (because it guarantees the availability
58everywhere - even on windows). 70of an event loop everywhere - even on windows with no extra modules
71installed).
59 72
60=item choose one fixed event loop (because AnyEvent works with all 73=item - choose one specific event loop (because AnyEvent works with most
61important event loops available for Perl, and adding others is trivial). 74event loops available for Perl).
62 75
63=back 76=back
64 77
65If the module author uses L<AnyEvent> for all his event needs (IO events, 78If the module author uses L<AnyEvent> for all his (or her) event needs
66timers, signals, ...) then all other modules can just use his module and 79(IO events, timers, signals, ...) then all other modules can just use
67don't have to choose an event loop or adapt to his event loop. The choice 80his module and don't have to choose an event loop or adapt to his event
68of the event loop is ultimately made by the program author who uses all 81loop. The choice of the event loop is ultimately made by the program
69the modules and writes the main program. And even there he doesn't have to 82author who uses all the modules and writes the main program. And even
70choose, he can just let L<AnyEvent> choose the best available event loop 83there he doesn't have to choose, he can just let L<AnyEvent> choose the
71for him. 84most efficient event loop available on the system.
72 85
73Read more about this in the main documentation of the L<AnyEvent> module. 86Read more about this in the main documentation of the L<AnyEvent> module.
74 87
75 88
76=head1 Introduction to Event-Based Programming 89=head1 Introduction to Event-Based Programming
101 ); 114 );
102 115
103 # do something else here 116 # do something else here
104 117
105Looks more complicated, and surely is, but the advantage of using events 118Looks more complicated, and surely is, but the advantage of using events
106is that your program can do something else instead of waiting for 119is that your program can do something else instead of waiting for input
120(side note: combining AnyEvent with a thread package such as Coro can
121recoup much of the simplicity, effectively getting the best of two
122worlds).
123
107input. Waiting as in the first example is also called "blocking" because 124Waiting as done in the first example is also called "blocking" the process
108you "block" your process from executing anything else while you do so. 125because you "block"/keep your process from executing anything else while
126you do so.
109 127
110The second example avoids blocking, by only registering interest in a read 128The second example avoids blocking by only registering interest in a read
111event, which is fast and doesn't block your process. Only when read data 129event, which is fast and doesn't block your process. Only when data is
112is available will the callback be called, which can then proceed to read 130available for reading will the callback be called, which can then proceed
113the data. 131to read the data.
114 132
115The "interest" is represented by an object returned by C<< AnyEvent->io 133The "interest" is represented by an object returned by C<< AnyEvent->io
116>> called a "watcher" object - called like that because it "watches" your 134>> called a "watcher" object - called this because it "watches" your
117file handle (or other event sources) for the event you are interested in. 135file handle (or other event sources) for the event you are interested in.
118 136
119In the example above, we create an I/O watcher by calling the C<< 137In the example above, we create an I/O watcher by calling the C<<
120AnyEvent->io >> method. Disinterest in some event is simply expressed by 138AnyEvent->io >> method. Disinterest in some event is simply expressed
121forgetting about the watcher, for example, by C<undef>'ing the variable it 139by forgetting about the watcher, for example, by C<undef>'ing the only
122is stored in. AnyEvent will automatically clean up the watcher if it is no 140variable it is stored in. AnyEvent will automatically clean up the watcher
123longer used, much like Perl closes your file handles if you no longer use 141if it is no longer used, much like Perl closes your file handles if you no
124them anywhere. 142longer use them anywhere.
143
144=head3 A short note on callbacks
145
146A common issue that hits people is the problem of passing parameters
147to callbacks. Programmers used to languages such as C or C++ are often
148used to a style where one passes the address of a function (a function
149reference) and some data value, e.g.:
150
151 sub callback {
152 my ($arg) = @_;
153
154 $arg->method;
155 }
156
157 my $arg = ...;
158
159 call_me_back_later \&callback, $arg;
160
161This is clumsy, as the place where behaviour is specified (when the
162callback is registered) is often far away from the place where behaviour
163is implemented. It also doesn't use Perl syntax to invoke the code. There
164is also an abstraction penalty to pay as one has to I<name> the callback,
165which often is unnecessary and leads to nonsensical or duplicated names.
166
167In Perl, one can specify behaviour much more directly by using
168I<closures>. Closures are code blocks that take a reference to the
169enclosing scope(s) when they are created. This means lexical variables in
170scope at the time of creating the closure can simply be used inside the
171closure:
172
173 my $arg = ...;
174
175 call_me_back_later sub { $arg->method };
176
177Under most circumstances, closures are faster, use fewer resources and
178result in much clearer code than the traditional approach. Faster,
179because parameter passing and storing them in local variables in Perl
180is relatively slow. Fewer resources, because closures take references
181to existing variables without having to create new ones, and clearer
182code because it is immediately obvious that the second example calls the
183C<method> method when the callback is invoked.
184
185Apart from these, the strongest argument for using closures with AnyEvent
186is that AnyEvent does not allow passing parameters to the callback, so
187closures are the only way to achieve that in most cases :->
188
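As a minimal sketch of the closure style described above, the following program hands data to an AnyEvent callback purely through captured lexical variables. It uses a timer watcher (covered later in this tutorial) only because it needs no input; the variable names C<$greeting> and C<$count> are made up for illustration.

```perl
use strict;
use warnings;
use AnyEvent;

my $done = AnyEvent->condvar;

# hypothetical data the callback should see
my $greeting = "hello";
my $count    = 3;

# the closure captures $greeting, $count and $done from the enclosing
# scope, so nothing has to be passed to the callback explicitly
my $timer = AnyEvent->timer (
   after => 0.1,
   cb    => sub { $done->send (join " ", ($greeting) x $count) },
);

# wait until the callback has fired, then fetch the value it sent
my $result = $done->recv;
print "$result\n";
```

Note that the callback itself receives no useful arguments here: everything it needs comes from the scope it was created in.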
189
190=head3 A hint on debugging
191
192AnyEvent does, by default, not do any argument checking. This can lead to
193strange and unexpected results especially if you are trying to learn your
194way around AnyEvent.
195
196AnyEvent supports a special "strict" mode - off by default - which does very
197strict argument checking, at the expense of being somewhat slower. During
198development, however, this mode is very useful.
199
200You can enable this strict mode either by having an environment variable
201C<PERL_ANYEVENT_STRICT> with a true value in your environment:
202
203 PERL_ANYEVENT_STRICT=1 perl test.pl
204
205Or you can write C<use AnyEvent::Strict> in your program, which has the
206same effect (do not do this in production, however).
207
125 208
126=head2 Condition Variables 209=head2 Condition Variables
127 210
128However, the above is not a fully working program, and will not work 211Back to the I/O watcher example: The code is not yet a fully working
129as-is. The reason is that your callback will not be invoked out of the 212program, and will not work as-is. The reason is that your callback will
130blue, you have to run the event loop. Also, event-based programs sometimes 213not be invoked out of the blue, you have to run the event loop. Also,
131have to block, too, as when there simply is nothing else to do and 214event-based programs sometimes have to block, too, as when there simply is
132everything waits for some events, it needs to block the process as well. 215nothing else to do and everything waits for some events, it needs to block
216the process as well until new events arrive.
133 217
134In AnyEvent, this is done using condition variables. Condition variables 218In AnyEvent, this is done using condition variables. Condition variables
135are named "condition variables" because they represent a condition that is 219are named "condition variables" because they represent a condition that is
136initially false and needs to be fulfilled. 220initially false and needs to be fulfilled.
137 221
138You can also call them "merge points", "sync points", "rendezvous ports" 222You can also call them "merge points", "sync points", "rendezvous ports"
139or even callbacks and many other things (and they are often called like 223or even callbacks and many other things (and they are often called these
140this in other frameworks). The important point is that you can create them 224names in other frameworks). The important point is that you can create them
141freely and later wait for them to become true. 225freely and later wait for them to become true.
142 226
143Condition variables have two sides - one side is the "producer" of the 227Condition variables have two sides - one side is the "producer" of the
144condition (whatever code detects the condition), the other side is the 228condition (whatever code detects and flags the condition), the other side
145"consumer" (the code that waits for that condition). 229is the "consumer" (the code that waits for that condition).
146 230
147In our example in the previous section, the producer is the event callback 231In our example in the previous section, the producer is the event callback
148and there is no consumer yet - let's change that now: 232and there is no consumer yet - let's change that right now:
149 233
150 use AnyEvent; 234 use AnyEvent;
151 235
152 $| = 1; print "enter your name> "; 236 $| = 1; print "enter your name> ";
153 237
174 print "your name is $name\n"; 258 print "your name is $name\n";
175 259
176This program creates an AnyEvent condvar by calling the C<< 260This program creates an AnyEvent condvar by calling the C<<
177AnyEvent->condvar >> method. It then creates a watcher as usual, but 261AnyEvent->condvar >> method. It then creates a watcher as usual, but
178inside the callback it C<send>'s the C<$name_ready> condition variable, 262inside the callback it C<send>'s the C<$name_ready> condition variable,
179which causes anybody waiting on it to continue. 263which causes whoever is waiting on it to continue.
180 264
181The "anybody" in this case is the code that follows, which calls C<< 265The "whoever" in this case is the code that follows, which calls C<<
182$name_ready->recv >>: The producer calls C<send>, the consumer calls 266$name_ready->recv >>: The producer calls C<send>, the consumer calls
183C<recv>. 267C<recv>.
184 268
185If there is no C<$name> available yet, then the call to C<< 269If there is no C<$name> available yet, then the call to C<<
186$name_ready->recv >> will halt your program until the condition becomes 270$name_ready->recv >> will halt your program until the condition becomes
196 280
197 my $name_ready = AnyEvent->condvar; 281 my $name_ready = AnyEvent->condvar;
198 282
199 my $wait_for_input = AnyEvent->io ( 283 my $wait_for_input = AnyEvent->io (
200 fh => \*STDIN, poll => "r", 284 fh => \*STDIN, poll => "r",
201 cb => sub { $name_ready->send (scalar = <STDIN>) } 285 cb => sub { $name_ready->send (scalar <STDIN>) }
202 ); 286 );
203 287
204 # do something else here 288 # do something else here
205 289
206 # now wait and fetch the name 290 # now wait and fetch the name
263 347
264Instead of waiting for a condition variable, the program enters the Gtk2 348Instead of waiting for a condition variable, the program enters the Gtk2
265main loop by calling C<< Gtk2->main >>, which will block the program and 349main loop by calling C<< Gtk2->main >>, which will block the program and
266wait for events to arrive. 350wait for events to arrive.
267 351
268This also shows that AnyEvent is quite flexible - you didn't have anything 352This also shows that AnyEvent is quite flexible - you didn't have to do
269to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just 353anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it just
270worked. 354worked.
271 355
272Admittedly, the example is a bit silly - who would want to read names 356Admittedly, the example is a bit silly - who would want to read names
273form standard input in a Gtk+ application. But imagine that instead of 357from standard input in a Gtk+ application. But imagine that instead of
274doing that, you would make a HTTP request in the background and display 358doing that, you would make a HTTP request in the background and display
275its results. In fact, with event-based programming you can make many 359its results. In fact, with event-based programming you can make many
276http-requests in parallel in your program and still provide feedback to 360HTTP requests in parallel in your program and still provide feedback to
277the user and stay interactive. 361the user and stay interactive.
278 362
279In the next part you will see how to do just that - by implementing an 363And in the next part you will see how to do just that - by implementing an
280HTTP request, on our own, with the utility modules AnyEvent comes with. 364HTTP request, on our own, with the utility modules AnyEvent comes with.
281 365
282Before that, however, let's briefly look at how you would write your 366Before that, however, let's briefly look at how you would write your
283program with using only AnyEvent, without ever calling some other event 367program using only AnyEvent, without ever calling some other event
284loop's run function. 368loop's run function.
285 369
286In the example using condition variables, we used that, and in fact, this 370In the example using condition variables, we used those to start waiting
287is the solution: 371for events, and in fact, condition variables are the solution:
288 372
289 my $quit_program = AnyEvent->condvar; 373 my $quit_program = AnyEvent->condvar;
290 374
291 # create AnyEvent watchers (or not) here 375 # create AnyEvent watchers (or not) here
292 376
293 $quit_program->recv; 377 $quit_program->recv;
294 378
295If any of your watcher callbacks decide to quit, they can simply call 379If any of your watcher callbacks decide to quit (this is often
380called an "unloop" in other frameworks), they can simply call C<<
296C<< $quit_program->send >>. Of course, they could also decide not to and 381$quit_program->send >>. Of course, they could also decide not to and
297simply call C<exit> instead, or they could decide not to quit, ever (e.g. 382simply call C<exit> instead, or they could decide not to quit, ever (e.g.
298in a long-running daemon program). 383in a long-running daemon program).
299 384
300In that case, you can simply use: 385If you don't need some clean quit functionality and just want to run the
386event loop, you can simply do this:
301 387
302 AnyEvent->condvar->recv; 388 AnyEvent->condvar->recv;
303 389
304And this is, in fact, closest to the idea of a main loop run function that 390And this is, in fact, closest to the idea of a main loop run function that
305AnyEvent offers. 391AnyEvent offers.
306 392
307=head2 Timers and other event sources 393=head2 Timers and other event sources
308 394
309So far, we have only used I/O watchers. These are useful mainly to find 395So far, we have only used I/O watchers. These are useful mainly to find
310out whether a Socket has data to read, or space to write more data. On sane 396out whether a socket has data to read, or space to write more data. On sane
311operating systems this also works for console windows/terminals (typically 397operating systems this also works for console windows/terminals (typically
312on standard input), serial lines, all sorts of other devices, basically 398on standard input), serial lines, all sorts of other devices, basically
313almost everything that has a file descriptor but isn't a file itself. (As 399almost everything that has a file descriptor but isn't a file itself. (As
314usual, "sane" excludes windows - on that platform you would need different 400usual, "sane" excludes windows - on that platform you would need different
315functions for all of these, complicating code immensely - think "socket 401functions for all of these, complicating code immensely - think "socket
337 423
338 # now wait till our time has come 424 # now wait till our time has come
339 $cv->recv; 425 $cv->recv;
340 426
341Unlike I/O watchers, timers are only interested in the amount of seconds 427Unlike I/O watchers, timers are only interested in the amount of seconds
342they have to wait. When that amount of time has passed, AnyEvent will 428they have to wait. When (at least) that amount of time has passed,
343invoke your callback. 429AnyEvent will invoke your callback.
344 430
345Unlike I/O watchers, which will call your callback as many times as there 431Unlike I/O watchers, which will call your callback as many times as there
346is data available, timers are one-shot: after they have "fired" once and 432is data available, timers are normally one-shot: after they have "fired"
347invoked your callback, they are dead and no longer do anything. 433once and invoked your callback, they are dead and no longer do anything.
348 434
349To get a repeating timer, such as a timer firing roughly once per second, 435To get a repeating timer, such as a timer firing roughly once per second,
350you have to recreate it: 436you can specify an C<interval> parameter:
351 437
352 use AnyEvent; 438 my $once_per_second = AnyEvent->timer (
353 439 after => 0, # first invoke ASAP
354 my $time_watcher; 440 interval => 1, # then invoke every second
355 441 cb => sub { # the callback to invoke
356 sub once_per_second { 442 $cv->send;
357 print "tick\n";
358 443 },
359 # (re-)create the watcher
360 $time_watcher = AnyEvent->timer (
361 after => 1,
362 cb => \&once_per_second,
363 ); 444 );
364 }
365
366 # now start the timer
367 once_per_second;
368
369Having to recreate your timer is a restriction put on AnyEvent that is
370present in most event libraries it uses. It is so annoying that some
371future version might work around this limitation, but right now, it's the
372only way to do repeating timers.
373
374Fortunately most timers aren't really repeating but specify timeouts of
375some sort.
376 445
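To make the C<interval> behaviour concrete, here is a small self-contained sketch that lets a repeating timer fire three times and then stops it by forgetting the watcher. The short interval of 0.1 seconds and the tick count of three are arbitrary choices for the example.

```perl
use strict;
use warnings;
use AnyEvent;

my $cv    = AnyEvent->condvar;
my $ticks = 0;

my $ticker; $ticker = AnyEvent->timer (
   after    => 0,   # first invocation as soon as possible
   interval => 0.1, # then repeat every 0.1 seconds
   cb       => sub {
      print "tick ", ++$ticks, "\n";

      if ($ticks == 3) {
         undef $ticker;      # forgetting the watcher stops the timer
         $cv->send ($ticks); # wake up the code waiting in recv
      }
   },
);

# block until the third tick has been seen
my $total = $cv->recv;
print "stopped after $total ticks\n";
```

The watcher variable is declared before it is assigned so that the callback can refer to it and cancel the timer from the inside.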
377=head3 More esoteric sources 446=head3 More esoteric sources
378 447
379AnyEvent also has some other, more esoteric event sources you can tap 448AnyEvent also has some other, more esoteric event sources you can tap
380into: signal and child watchers. 449into: signal, child and idle watchers.
381 450
382Signal watchers can be used to wait for "signal events", which simply 451Signal watchers can be used to wait for "signal events", which simply
383means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>). 452means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>).
384 453
385Process watchers wait for a child process to exit. They are useful when 454Child-process watchers wait for a child process to exit. They are useful
386you fork a separate process and need to know when it exits, but you do not 455when you fork a separate process and need to know when it exits, but you
387wait for that by blocking. 456do not wait for that by blocking.
388 457
458Idle watchers invoke their callback when the event loop has handled all
459outstanding events, polled for new events and didn't find any, i.e., when
460your process is otherwise idle. They are useful if you want to do some
461non-trivial data processing that can be done when your program doesn't
462have anything better to do.
463
389Both watcher types are described in detail in the main L<AnyEvent> manual 464All these watcher types are described in detail in the main L<AnyEvent>
390page. 465manual page.
391 466
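As an illustration of a child watcher, the following sketch forks a child that exits immediately with a made-up exit code of 7, and lets the parent learn about it through an event instead of a blocking C<waitpid>. It assumes a Unix-like system where C<fork> is available; calling C<AnyEvent::detect> before forking is a precaution so that the event loop is already chosen when the child is created.

```perl
use strict;
use warnings;
use POSIX ();
use AnyEvent;

AnyEvent::detect; # choose the event loop before forking

my $done = AnyEvent->condvar;

my $pid = fork;
defined $pid or die "fork failed: $!";

# child: leave immediately with an arbitrary exit code
POSIX::_exit (7) if $pid == 0;

# parent: get notified when that specific child exits
my $child_watcher = AnyEvent->child (
   pid => $pid,
   cb  => sub {
      my ($pid, $status) = @_;
      $done->send ($status >> 8); # upper byte holds the exit code
   },
);

my $exit_code = $done->recv;
print "child $pid exited with code $exit_code\n";
```

Signal watchers are created the same way with C<< AnyEvent->signal (signal => "TERM", cb => sub { ... }) >>, and idle watchers with C<< AnyEvent->idle (cb => sub { ... }) >>.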
467Sometimes you also need to know what the current time is: C<<
468AnyEvent->now >> returns the time the event toolkit uses to schedule
469relative timers, and is usually what you want. It is often cached (which
470means it can be a bit outdated). In that case, you can use the more costly
471C<< AnyEvent->time >> method which will ask your operating system for the
472current time, which is slower, but also more up to date.
392 473
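A quick way to see the difference between the two is to print both values side by side. This is only a sketch: how far the two drift apart depends on the event loop in use and on how long ago it last updated its idea of the current time.

```perl
use strict;
use warnings;
use AnyEvent;

# time as seen by the event loop, possibly cached
my $loop_time = AnyEvent->now;

# time as reported by the operating system right now
my $wall_time = AnyEvent->time;

printf "now:  %.3f\ntime: %.3f\ndiff: %.3f\n",
   $loop_time, $wall_time, $wall_time - $loop_time;
```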
393=head1 Network programming and AnyEvent 474=head1 Network programming and AnyEvent
394 475
395So far you have seen how to register event watchers and handle events. 476So far you have seen how to register event watchers and handle events.
396 477
397This is a great foundation to write network clients and servers, and might be 478This is a great foundation to write network clients and servers, and might
398all that your module (or program) ever requires, but writing your own I/O 479be all that your module (or program) ever requires, but writing your own
399buffering again and again becomes tedious, not to mention that it attracts 480I/O buffering again and again becomes tedious, not to mention that it
400errors. 481attracts errors.
401 482
402While the core L<AnyEvent> module is still small and self-contained, 483While the core L<AnyEvent> module is still small and self-contained,
403the distribution comes with some very useful utility modules such as 484the distribution comes with some very useful utility modules such as
404L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can 485L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can
405make your life as a non-blocking network programmer a lot easier. 486make your life as a non-blocking network programmer a lot easier.
413a great way to do other DNS resolution tasks, such as reverse lookups of 494a great way to do other DNS resolution tasks, such as reverse lookups of
414IP addresses for log files. 495IP addresses for log files.
415 496
416=head2 L<AnyEvent::Handle> 497=head2 L<AnyEvent::Handle>
417 498
418This module handles non-blocking IO on file handles in an event based 499This module handles non-blocking IO on (socket-, pipe- etc.) file handles
419manner. It provides a wrapper object around your file handle that provides 500in an event based manner. It provides a wrapper object around your file
420queueing and buffering of incoming and outgoing data for you. 501handle that provides queueing and buffering of incoming and outgoing data
502for you.
421 503
422It also implements the most common data formats, such as text lines, or 504It also implements the most common data formats, such as text lines, or
423fixed and variable-width data blocks. 505fixed and variable-width data blocks.
424 506
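To give a feeling for what this buffering buys you, here is a minimal sketch that sends two text lines through a socket pair and reads them back line by line. The socket pair (created with C<portable_socketpair> from L<AnyEvent::Util>) merely stands in for a real network connection, and the error handling is deliberately crude.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util qw(portable_socketpair);

# two connected handles standing in for a real network connection
my ($fh1, $fh2) = portable_socketpair
   or die "unable to create socket pair: $!";

my $done = AnyEvent->condvar;

my $writer = AnyEvent::Handle->new (
   fh       => $fh1,
   on_error => sub { $done->croak ("writer error: $_[2]") },
);

my $reader = AnyEvent::Handle->new (
   fh       => $fh2,
   on_error => sub { $done->croak ("reader error: $_[2]") },
);

# queue the data, the handle writes it out when the socket is ready
$writer->push_write ("first line\nsecond line\n");

my @lines;

# each "line" read request delivers one line without its line ending
$reader->push_read (line => sub {
   my ($handle, $line) = @_;
   push @lines, $line;

   $handle->push_read (line => sub {
      my ($handle, $line) = @_;
      push @lines, $line;
      $done->send;
   });
});

$done->recv;
print "got: $_\n" for @lines;
```

Neither side ever calls C<sysread> or C<syswrite> directly: the handle objects take care of partial reads and writes.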
425=head2 L<AnyEvent::Socket> 507=head2 L<AnyEvent::Socket>
442successful? That unsuccessful TCP connects might never be reported back 524successful? That unsuccessful TCP connects might never be reported back
443to your program? That C<WSAEINPROGRESS> means your C<connect> call was 525to your program? That C<WSAEINPROGRESS> means your C<connect> call was
444ignored instead of being in progress? AnyEvent::Socket works around all of 526ignored instead of being in progress? AnyEvent::Socket works around all of
445these Windows/Perl bugs for you). 527these Windows/Perl bugs for you).
446 528
447=head2 First experiments with non-blocking connects: a parallel finger 529=head2 Implementing a parallel finger client with non-blocking connects
448client. 530and AnyEvent::Socket
449 531
450The finger protocol is one of the simplest protocols in use on the 532The finger protocol is one of the simplest protocols in use on the
451internet. Or in use in the past, as almost nobody uses it anymore. 533internet. Or in use in the past, as almost nobody uses it anymore.
452 534
453It works by connecting to the finger port on another host, writing a 535It works by connecting to the finger port on another host, writing a
454single line with a user name and then reading the finger response, as 536single line with a user name and then reading the finger response, as
455specified by that user. OK, RFC 1288 specifies a vastly more complex 537specified by that user. OK, RFC 1288 specifies a vastly more complex
456protocol, but it basically boils down to this: 538protocol, but it basically boils down to this:
457 539
458 # telnet idsoftware.com finger 540 # telnet kernel.org finger
459 Trying 192.246.40.37... 541 Trying 204.152.191.37...
460 Connected to idsoftware.com (192.246.40.37). 542 Connected to kernel.org (204.152.191.37).
461 Escape character is '^]'. 543 Escape character is '^]'.
462 johnc 544
463 Welcome to id Software's Finger Service V1.5! 545 The latest stable version of the Linux kernel is: [...]
464
465 [...]
466 Now on the web:
467 [...]
468
469 Connection closed by foreign host. 546 Connection closed by foreign host.
470 547
471Yeah, I<was> used indeed, but at least the finger daemon still works, so
472let's write a little AnyEvent function that makes a finger request: 548So let's write a little AnyEvent function that makes a finger request:
473 549
474 use AnyEvent; 550 use AnyEvent;
475 use AnyEvent::Socket; 551 use AnyEvent::Socket;
476 552
477 sub finger($$) { 553 sub finger($$) {
509 585
510 # pass $cv to the caller 586 # pass $cv to the caller
511 $cv 587 $cv
512 } 588 }
513 589
514That's a mouthful! Let's dissect this function a bit, first the overall function: 590That's a mouthful! Let's dissect this function a bit, first the overall
591function and execution flow:
515 592
516 sub finger($$) { 593 sub finger($$) {
517 my ($user, $host) = @_; 594 my ($user, $host) = @_;
518 595
519 # use a condvar to return results 596 # use a condvar to return results
525 }; 602 };
526 603
527 $cv 604 $cv
528 } 605 }
529 606
530This isn't too complicated, just a function with two parameters, which 607This isn't too complicated, just a function with two parameters, that
531creates a condition variable, returns it, and while it does that, 608creates a condition variable, returns it, and while it does that,
532initiates a TCP connect to C<$host>. The condition variable 609initiates a TCP connect to C<$host>. The condition variable will be used
533will be used by the caller to receive the finger response. 610by the caller to receive the finger response, but one could equally well
611pass a third argument, a callback, to the function.
534 612
535Since we are event-based programmers, we do not wait for the connect to 613Since we are programming event'ish, we do not wait for the connect to
536finish - it could block your program for a minute or longer! Instead, 614finish - it could block the program for a minute or longer!
615
537we pass the callback it should invoke when the connect is done to 616Instead, we pass the callback it should invoke when the connect is done to
538C<tcp_connect>. If it is successful, our callback gets called with the 617C<tcp_connect>. If it is successful, that callback gets called with the
539socket handle as first argument, otherwise, nothing will be passed to our 618socket handle as first argument, otherwise, nothing will be passed to our
540callback. 619callback. The important point is that it will always be called as soon as
620the outcome of the TCP connect is known.
541 621
622This style of programming is also called "continuation style": the
623"continuation" is simply the way the program continues - normally at the
624next line after some statement (the exception is loops or things like
625C<return>). When we are interested in events, however, we instead specify
626the "continuation" of our program by passing a closure, which makes that
627closure the "continuation" of the program.
628
629The C<tcp_connect> call is like saying "return now, and when the
630connection is established or it failed, continue there".
631
542Let's look at our callback in more detail: 632Now let's look at the callback/closure in more detail:
543 633
544 # the callback gets the socket handle - or nothing 634 # the callback receives the socket handle - or nothing
545 my ($fh) = @_ 635 my ($fh) = @_
546 or return $cv->send; 636 or return $cv->send;
547 637
548The first thing the callback does is indeed save the socket handle in 638The first thing the callback does is indeed save the socket handle in
549C<$fh>. When there was an error (no arguments), then our instinct as 639C<$fh>. When there was an error (no arguments), then our instinct as
550expert Perl programmers would tell us to die: 640expert Perl programmers would tell us to C<die>:
551 641
552 my ($fh) = @_ 642 my ($fh) = @_
553 or die "$host: $!"; 643 or die "$host: $!";
554 644
555While this would give good feedback to the user, our program would 645While this would give good feedback to the user (if he happens to watch
556probably freeze here, as we never report the results to anybody, certainly 646standard error), our program would probably stop working here, as we never
557not the caller of our C<finger> function! 647report the results to anybody, certainly not the caller of our C<finger>
648function, and most event loops continue even after a C<die>!
558 649
559This is why we instead return, but also call C<< $cv->send >> without any 650This is why we instead C<return>, but also call C<< $cv->send >> without
560arguments to signal to our consumer that something bad has happened. The 651any arguments to signal to the condvar consumer that something bad has
561return value of C<< $cv->send >> is irrelevant, as is the return value of 652happened. The return value of C<< $cv->send >> is irrelevant, as is
562our callback. The return statement is simply used for the side effect of, 653the return value of our callback. The C<return> statement is simply
563well, returning immediately from the callback. 654used for the side effect of, well, returning immediately from the
655callback. Checking for errors and handling them this way is very common,
656which is why this compact idiom is so handy.
564 657
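The compact error-checking idiom can be tried outside of any event loop. The following is only a minimal sketch in plain Perl: C<on_connect> and its return strings are made up for illustration and are not part of AnyEvent. A list assignment in scalar context evaluates to the number of elements on its right-hand side, so the C<or> branch is taken only when the callback is called with no arguments at all:

```perl
use strict;
use warnings;

# hypothetical stand-in for a tcp_connect callback
sub on_connect {
    # list assignment in scalar context yields the number of
    # arguments, so "or" only fires for an empty argument list
    my ($fh) = @_
        or return "connect failed";

    "connected via $fh"
}

my @results = (
    on_connect (),             # no arguments: error path
    on_connect ("socket-1"),   # one argument: success path
);

print "$_\n" for @results;
```

Note that the test is about the I<number> of arguments, not their truth value, which is why "nothing passed" is a reliable error signal here.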
565As the next step in the finger protocol, we send the username to the 658As the next step in the finger protocol, we send the username to the
566finger daemon on the other side of our connection: 659finger daemon on the other side of our connection (the kernel.org finger
660service doesn't actually wait for a username, but the net is running out
661of finger servers fast):
567 662
568 syswrite $fh, "$user\015\012"; 663 syswrite $fh, "$user\015\012";
569 664
570Note that this isn't 100% clean - the socket could, for whatever reasons, 665Note that this isn't 100% clean socket programming - the socket could,
571not accept our data. When writing a small amount of data like in this 666for whatever reasons, not accept our data. When writing a small amount
572example it doesn't matter, but for real-world cases you might need to 667of data like in this example it doesn't matter, as a socket buffer is
573implement some kind of write buffering - or use L<AnyEvent::Handle>, which 668almost always big enough for a mere "username", but for real-world
574handles these matters for you. 669cases you might need to implement some kind of write buffering - or use
670L<AnyEvent::Handle>, which handles these matters for you, as shown in the
671next section.
575 672
576What we do have to do is to implement our own read buffer - the response 673What we I<do> have to do is to implement our own read buffer - the response
577data could arrive late or in multiple chunks, and we cannot just wait for 674data could arrive late or in multiple chunks, and we cannot just wait for
578it (event-based programming, you know?). 675it (event-based programming, you know?).
579 676
580To do that, we register a read watcher on the socket which waits for data: 677To do that, we register a read watcher on the socket which waits for data:
581 678
587variable, but in a local one - if the callback returns, it would normally 684variable, but in a local one - if the callback returns, it would normally
588destroy the variable and its contents, which would in turn unregister our 685destroy the variable and its contents, which would in turn unregister our
589watcher. 686watcher.
590 687
591To avoid that, we C<undef>ine the variable in the watcher callback. This 688To avoid that, we C<undef>ine the variable in the watcher callback. This
592means that, when the C<tcp_connect> callback returns, that perl thinks 689means that, when the C<tcp_connect> callback returns, perl thinks (quite
593(quite correctly) that the read watcher is still in use - namely in the 690correctly) that the read watcher is still in use - namely in the callback,
594callback. 691and thus keeps it alive even if nothing else in the program refers to it
692anymore (it is much like Baron Münchhausen keeping himself from dying by
693pulling himself out of a swamp).
694
695The trick, however, is that instead of:
696
697 my $read_watcher = AnyEvent->io (...
698
699The program does:
700
701 my $read_watcher; $read_watcher = AnyEvent->io (...
702
703The reason for this is a quirk in the way Perl works: variable names
704declared with C<my> only become visible in the I<next> statement. If the
705whole C<< AnyEvent->io >> call, including the callback, were done in
706a single statement, the callback could not refer to the C<$read_watcher>
707variable to C<undef>ine it, so it is done in two statements.
708
709Whether you'd want to format it like this is of course a matter of style;
710this way emphasizes that the declaration and assignment really are one
711logical statement.
595 712
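The two-statement form can be tried without any event loop. The sketch below uses a plain hash as a hypothetical stand-in for the watcher object (it is not an AnyEvent API): the callback stored inside it refers to C<$watcher>, which only works because the variable was declared in an earlier statement, and it releases the whole structure by C<undef>ining that variable.

```perl
use strict;
use warnings;

my @log;

# declared first, assigned in a second statement, so that the
# closure below can refer to $watcher
my $watcher; $watcher = {
    cb => sub {
        push @log, "callback ran";
        undef $watcher;   # drop the reference, freeing the structure
    },
};

# play the role of the event loop and invoke the callback once
$watcher->{cb}->();

push @log, defined $watcher ? "still alive" : "released";

print "$_\n" for @log;
```

Under C<use strict>, writing C<< my $watcher = { cb => sub { undef $watcher } } >> in one statement would not even compile, because the name is not yet declared inside the closure.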
596The callback itself calls C<sysread> for as many times as necessary, until 713The callback itself calls C<sysread> for as many times as necessary, until
597C<sysread> returns an error or end-of-file: 714C<sysread> returns either an error or end-of-file:
598 715
599 cb => sub { 716 cb => sub {
600 my $len = sysread $fh, $response, 1024, length $response; 717 my $len = sysread $fh, $response, 1024, length $response;
601 718
602 if ($len <= 0) { 719 if ($len <= 0) {
603 720
604Note that C<sysread> has the ability to append data it reads to a scalar, 721Note that C<sysread> has the ability to append data it reads to a scalar,
605which is what we make good use of in this example. 722by specifying an offset, a feature we make good use of in this
723example.
606 724
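The offset argument is easy to try in isolation. This is a minimal sketch that uses a local pipe instead of a network socket (an assumption made purely so that it runs without a network), and the chunk contents are made up:

```perl
use strict;
use warnings;

pipe (my $reader, my $writer) or die "pipe: $!";

my $response = "";

for my $chunk ("Login: ", "root\n") {
    syswrite $writer, $chunk;

    # 4th argument is the offset: new data is stored at the
    # current end of $response, i.e. it is appended
    my $len = sysread $reader, $response, 1024, length $response;
    defined $len or die "sysread: $!";
}

close $writer;

# at end-of-file sysread returns 0 and leaves the buffer as it is
my $len = sysread $reader, $response, 1024, length $response;

print "len=$len response=$response";
```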
607When C<sysread> indicates we are done, the callback C<undef>ines 725When C<sysread> indicates we are done, the callback C<undef>ines
608the watcher and then C<send>'s the response data to the condition 726the watcher and then C<send>'s the response data to the condition
609variable. All this has the following effects: 727variable. All this has the following effects:
610 728
623But the main advantage is that we can not only run this finger function in 741But the main advantage is that we can not only run this finger function in
624the background, we even can run multiple sessions in parallel, like this: 742the background, we even can run multiple sessions in parallel, like this:
625 743
626 my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets 744 my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets
627 my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736 745 my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736
628 my $f3 = finger "johnc", "idsoftware.com"; # finger john 746 my $f3 = finger "hpa" , "kernel.org"; # finger hpa
629 747
630 print "trouble tickets:\n", $f1->recv, "\n"; 748 print "trouble tickets:\n" , $f1->recv, "\n";
631 print "trouble ticket #1736:\n", $f2->recv, "\n"; 749 print "trouble ticket #1736:\n", $f2->recv, "\n";
632 print "john carmacks finger file: ", $f3->recv, "\n"; 750 print "kernel release info: " , $f3->recv, "\n";
633 751
634It doesn't look like it, but in fact all three requests run in 752It doesn't look like it, but in fact all three requests run in
635parallel. The code waits for the first finger request to finish first, but 753parallel. The code waits for the first finger request to finish first, but
636that doesn't keep it from executing in parallel, because when the first 754that doesn't keep it from executing them in parallel: when the first C<recv>
637C<recv> call sees that the data isn't ready yet, it serves events for all 755call sees that the data isn't ready yet, it serves events for all three
638three requests automatically. 756requests automatically, until the first request has finished.
757
758The second C<recv> call might either find the data is already there, or it
759will continue handling events until that is the case, and so on.
639 760
640By taking advantage of network latencies, which allows us to serve other 761By taking advantage of network latencies, which allows us to serve other
641requests and events while we wait for an event on one socket, the overall 762requests and events while we wait for an event on one socket, the overall
642time to do these three requests will be greatly reduces, typically all 763time to do these three requests will be greatly reduced, typically all
643three are done in the same time as the slowest of them would use. 764three are done in the same time as the slowest of them would need to finish.
644 765
645By the way, you do not actually have to wait in the C<recv> method on an 766By the way, you do not actually have to wait in the C<recv> method on an
646AnyEvent condition variable, you can also register a callback: 767AnyEvent condition variable - after all, waiting is evil - you can also
768register a callback:
647 769
648 $cv->cb (sub { 770 $cv->cb (sub {
649 my $response = shift->recv; 771 my $response = shift->recv;
650 # ... 772 # ...
651 }); 773 });
656response: 778response:
657 779
658 sub finger($$$) { 780 sub finger($$$) {
659 my ($user, $host, $cb) = @_; 781 my ($user, $host, $cb) = @_;
660 782
661What you use is a matter of taste - if you expect your function to be 783How you implement it is a matter of taste - if you expect your function to
662used mainly in an event-based program you would normally prefer to pass a 784be used mainly in an event-based program you would normally prefer to pass
663callback directly. 785a callback directly. If you write a module and expect your users to use
786it "synchronously" often (for example, a simple http-get script would not
787really care much for events), then you would use a condition variable and
788tell them "simply C<< ->recv >> the data".
664 789
665=head3 Criticism and fix 790=head3 Problems with the implementation and how to fix them
666 791
667To make this example more real-world-ready, we would not only implement 792To make this example more real-world-ready, we would not only implement
668some write buffering (for the paranoid), but we would also have to handle 793some write buffering (for the paranoid, or maybe denial-of-service aware
669timeouts and maybe protocol errors. 794security expert), but we would also have to handle timeouts and maybe
795protocol errors.
670 796
671This quickly gets unwieldy, which is why we introduce L<AnyEvent::Handle> 797Doing this quickly gets unwieldy, which is why we introduce
672in the next section, which takes care of all these details for us. 798L<AnyEvent::Handle> in the next section, which takes care of all these
799details for you and lets you concentrate on the actual protocol.
673 800
674 801
675=head2 First experiments with AnyEvent::Handle 802=head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle
676 803
677Now let's start with something simple: a program that reads from standard 804The L<AnyEvent::Handle> module has been hyped quite a bit in this document
678input in a non-blocking way, that is, in a way that lets your program do 805so far, so let's see what it really offers.
679other things while it is waiting for input.
680 806
681First, the full program listing: 807As finger is such a simple protocol, let's try something slightly more
808complicated: HTTP/1.0.
682 809
683 #!/usr/bin/perl 810An HTTP GET request works by sending a single request line that indicates
811what you want the server to do and the URI you want it to act on, followed
812by as many "header" lines (C<Header: data>, same as e-mail headers) as
813required for the request, ended by an empty line.
684 814
685 use AnyEvent; 815The response is formatted very similarly, first a line with the response
686 use AnyEvent::Handle; 816status, then again as many header lines as required, then an empty line,
817followed by any data that the server might send.
687 818
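The message layout described above can be sketched in a few lines of plain Perl. The host name, the header and the canned response below are made-up sample data, and the helper names C<build_request> and C<split_response> are hypothetical, not part of any module:

```perl
use strict;
use warnings;

my $CRLF = "\015\012";

# request line, then header lines, then an empty line
sub build_request {
    my ($uri, %header) = @_;

    join "",
        "GET $uri HTTP/1.0$CRLF",
        (map "$_: $header{$_}$CRLF", sort keys %header),
        $CRLF
}

# status line, header lines, empty line, then the body
sub split_response {
    my ($raw) = @_;

    my ($head, $body)      = split /\015\012\015\012/, $raw, 2;
    my ($status, @headers) = split /\015\012/, $head;

    ($status, \@headers, $body)
}

my $request = build_request "/test", Host => "www.example.com";

my ($status, $headers, $body) = split_response
    "HTTP/1.0 404 Not Found${CRLF}Content-Type: text/html$CRLF$CRLF<html>";

print "$status\n@$headers\n$body\n";
```

This only handles the simplest case, a complete response held in one string; a real client has to cope with data arriving in pieces.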
688 my $end_prog = AnyEvent->condvar; 819Again, let's try it out with C<telnet> (I condensed the output a bit - if
820you want to see the full response, do it yourself).
689 821
690 my $handle = 822 # telnet www.google.com 80
691 AnyEvent::Handle->new ( 823 Trying 209.85.135.99...
692 fh => \*STDIN, 824 Connected to www.google.com (209.85.135.99).
825 Escape character is '^]'.
826 GET /test HTTP/1.0
827
828 HTTP/1.0 404 Not Found
829 Date: Mon, 02 Jun 2008 07:05:54 GMT
830 Content-Type: text/html; charset=UTF-8
831
832 <html><head>
833 [...]
834 Connection closed by foreign host.
835
836The C<GET ...> and the empty line were entered manually, the rest of the
837telnet output is Google's response, in this case a C<404 Not Found> one.
838
839So, here is how you would do it with C<AnyEvent::Handle>:
840
841 sub http_get {
842 my ($host, $uri, $cb) = @_;
843
844 # store results here
845 my ($response, $header, $body);
846
847 my $handle; $handle = new AnyEvent::Handle
848 connect => [$host => 'http'],
693 on_eof => sub { 849 on_error => sub {
694 print "received EOF, exiting...\n"; 850 $cb->("HTTP/1.0 500 $!");
695 $end_prog->broadcast; 851 $handle->destroy; # explicitly destroy handle
696 }, 852 },
853 on_eof => sub {
854 $cb->($response, $header, $body);
855 $handle->destroy; # explicitly destroy handle
856 };
857
858 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
859
860 # now fetch response status line
861 $handle->push_read (line => sub {
862 my ($handle, $line) = @_;
863 $response = $line;
864 });
865
866 # then the headers
867 $handle->push_read (line => "\015\012\015\012", sub {
868 my ($handle, $line) = @_;
869 $header = $line;
870 });
871
872 # and finally handle any remaining data as body
873 $handle->on_read (sub {
874 $body .= $_[0]->rbuf;
875 $_[0]->rbuf = "";
876 });
877 }
878
879And now let's go through it step by step. First, as usual, the overall
880C<http_get> function structure:
881
882 sub http_get {
883 my ($host, $uri, $cb) = @_;
884
885 # store results here
886 my ($response, $header, $body);
887
888 my $handle; $handle = new AnyEvent::Handle
889 ... create handle object
890
891 ... push data to write
892
893 ... push what to expect to read queue
894 }
895
896Unlike in the finger example, this time the caller has to pass a callback
897to C<http_get>. Also, instead of passing a URL as one would expect, the
898caller has to provide the hostname and URI - normally you would use the
899C<URI> module to parse a URL and separate it into those parts, but that is
900left to the inspired reader :)
901
902Since everything else is left to the caller, all C<http_get> does is
903initiate the connection by creating the AnyEvent::Handle object (which
904calls C<tcp_connect> for us) and leave everything else to its callbacks.
905
906The handle object is created, unsurprisingly, by calling the C<new>
907method of L<AnyEvent::Handle>:
908
909 my $handle; $handle = new AnyEvent::Handle
910 connect => [$host => 'http'],
697 on_error => sub { 911 on_error => sub {
698 print "error while reading from STDIN: $!\n"; 912 $cb->("HTTP/1.0 500 $!");
699 $end_prog->broadcast; 913 $handle->destroy; # explicitly destroy handle
700 } 914 },
915 on_eof => sub {
916 $cb->($response, $header, $body);
917 $handle->destroy; # explicitly destroy handle
918 };
919
920The C<connect> argument tells AnyEvent::Handle to call C<tcp_connect> for
921the specified host and service/port.
922
923The C<on_error> callback will be called on any unexpected error, such as a
924refused connection, or an unexpected end-of-file while reading the header.
925
926Instead of having an extra mechanism to signal errors, connection errors
927are signalled by crafting a special "response status line", like this:
928
929 HTTP/1.0 500 Connection refused
930
931This means the caller cannot distinguish (easily) between
932locally-generated errors and server errors, but it simplifies error
933handling for the caller a lot.
934
935The error callback also destroys the handle explicitly, because we are not
936interested in continuing after any errors. In AnyEvent::Handle callbacks
937you have to call C<destroy> explicitly to destroy a handle. Outside of
938those callbacks you can just forget the object reference and it will be
939automatically cleaned up.
940
941Last but not least, we set an C<on_eof> callback that is called when the
942other side indicates it has stopped writing data, which we will use to
943gracefully shut down the handle and report the results. This callback is
944only called when the read queue is empty - if the read queue expects some
945data and the handle gets an EOF from the other side this will be an error
946- after all, you did expect more to come.
947
948If you wanted to write a server using AnyEvent::Handle, you would use
949C<tcp_accept> and then create the AnyEvent::Handle with the C<fh>
950argument.
951
952=head3 The write queue
953
954The next line sends the actual request:
955
956 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
957
958No headers will be sent (this is fine for simple requests), so the whole
959request is just a single line followed by an empty line to signal the end
960of the headers to the server.
961
962The more interesting question is why the method is called C<push_write>
963and not just C<write>. The reason is that you can I<always> add some write
964data without blocking, and to do this, AnyEvent::Handle needs some write
965queue internally - and C<push_write> simply pushes some data onto the end
966of that queue, just like Perl's C<push> pushes data onto the end of an
967array.
968
969The deeper reason is that at some point in the future, there might
970be C<unshift_write> as well, and in any case, we will shortly meet
971C<push_read> and C<unshift_read>, and it's usually easiest to remember if
972all those functions have some symmetry in their name. So C<push> is used
973as the opposite of C<unshift> in AnyEvent::Handle, not as the opposite of
974C<pull> - just like in Perl.
975
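The naming follows Perl's own array functions, which a few lines of plain Perl illustrate. This is only an analogy using an ordinary array with made-up entries; it does not show how AnyEvent::Handle stores its queues internally.

```perl
use strict;
use warnings;

my @queue;

push    @queue, "status line";   # appended at the end
push    @queue, "headers";       # appended after it
unshift @queue, "urgent";        # prepended at the front

# entries are served from the front of the queue
my @served;
push @served, shift @queue while @queue;

print join (", ", @served), "\n";
```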
976Note that we call C<push_write> right after creating the AnyEvent::Handle
977object, before it has had time to actually connect to the server. This is
978fine, pushing the read and write requests will simply queue them in the
979handle object until the connection has been established. Alternatively, we
980could do this "on demand" in the C<on_connect> callback.
981
982If C<push_write> is called with more than one argument, then you can even
983do I<formatted> I/O, which simply means your data will be transformed in
984some ways. For example, this would JSON-encode your data before pushing it
985to the write queue:
986
987 $handle->push_write (json => [1, 2, 3]);
988
989Apart from that, this pretty much summarises the write queue, there is
990little else to it.
991
992Reading the response is far more interesting, because it involves the more
993powerful and complex I<read queue>:
994
995=head3 The read queue
996
997The response consists of three parts: a single line with the response
998status, a single paragraph of headers ended by an empty line, and the
999response body, which is simply the remaining data on that connection.
1000
1001For the first two, we push two read requests onto the read queue:
1002
1003 # now fetch response status line
1004 $handle->push_read (line => sub {
1005 my ($handle, $line) = @_;
1006 $response = $line;
1007 });
1008
1009 # then the headers
1010 $handle->push_read (line => "\015\012\015\012", sub {
1011 my ($handle, $line) = @_;
1012 $header = $line;
1013 });
1014
1015While one can simply push a single callback onto the queue to parse the
1016data, I<formatted> I/O really comes to our advantage here, as there
1017is a ready-made "read line" read type. The first read expects a single
1018line, ended by C<\015\012> (the standard end-of-line marker in internet
1019protocols).
1020
1021The second "line" is actually a single paragraph - instead of reading it
1022line by line we tell C<push_read> that the end-of-line marker is really
1023C<\015\012\015\012>, which is an empty line. The result is that the whole
1024header paragraph will be treated as a single line and read. The word
1025"line" is interpreted very freely, much like Perl itself does it.
1026
1027Note that the read requests are pushed immediately after creating the
1028handle object - since AnyEvent::Handle provides a queue we can push as
1029many requests as we want, and AnyEvent::Handle will handle them in order.
1030
1031There is, however, no read type for "the remaining data". For that, we
1032install our own C<on_read> callback:
1033
1034 # and finally handle any remaining data as body
1035 $handle->on_read (sub {
1036 $body .= $_[0]->rbuf;
1037 $_[0]->rbuf = "";
1038 });
1039
1040This callback is invoked every time data arrives and the read queue is
1041empty - which in this example will only be the case when both response and
1042header have been read. The C<on_read> callback could actually have been
1043specified when constructing the object, but doing it this way preserves
1044logical ordering.
1045
1046The read callback simply adds the current read buffer to its C<$body>
1047variable and, most importantly, I<empties> the buffer by assigning the
1048empty string to it.
1049
1050After AnyEvent::Handle has been so instructed, it will handle incoming
1051data according to these instructions - if all goes well, the callback will
1052be invoked with the response data, if not, it will get an error.
1053
1054In general, you can implement pipelining (a semi-advanced feature of many
1055protocols) very easily with AnyEvent::Handle: If you have a protocol with a
1056request/response structure, your request methods/functions will all look
1057like this (simplified):
1058
1059 sub request {
1060
1061 # send the request to the server
1062 $handle->push_write (...);
1063
1064 # push some response handlers
1065 $handle->push_read (...);
1066 }
1067
1068This means you can queue as many requests as you want, and while
1069AnyEvent::Handle goes through its read queue to handle the response data,
1070the other side can work on the next request - queueing the request just
1071appends some data to the write queue and installs a handler to be called
1072later.
1073
1074You might ask yourself how to handle decisions you can only make I<after>
1075you have received some data (such as handling a short error response or a
1076long and differently-formatted response). The answer to this problem is
1077C<unshift_read>, which we will introduce together with an example in the
1078coming sections.
1079
1080=head3 Using C<http_get>
1081
1082Finally, here is how you would use C<http_get>:
1083
1084 http_get "www.google.com", "/", sub {
1085 my ($response, $header, $body) = @_;
1086
1087 print
1088 $response, "\n",
1089 $body;
1090 };
1091
1092And of course, you can run as many of these requests in parallel as you
1093want (and your memory supports).
1094
1095=head3 HTTPS
1096
1097Now, as promised, let's implement the same thing for HTTPS, or more
1098correctly, let's change our C<http_get> function into a function that
1099speaks HTTPS instead.
1100
1101HTTPS is, quite simply, a standard TLS connection (B<T>ransport B<L>ayer
1102B<S>ecurity is the official name for what most people refer to as C<SSL>)
1103that contains standard HTTP protocol exchanges. The only other difference
1104to HTTP is that by default it uses port C<443> instead of port C<80>.
1105
1106To implement these two differences we need two tiny changes, first, in the
1107C<connect> parameter, we replace C<http> by C<https> to connect to the
1108https port:
1109
1110 connect => [$host => 'https'],
1111
1112The other change deals with TLS, which is something L<AnyEvent::Handle>
1113does for us, as long as I<you> made sure that the L<Net::SSLeay> module
1114is around. To enable TLS with L<AnyEvent::Handle>, we simply pass an
1115additional C<tls> parameter to the call to C<AnyEvent::Handle::new>:
1116
1117 tls => "connect",
1118
1119Specifying C<tls> enables TLS, and the argument specifies whether
1120AnyEvent::Handle is the server side ("accept") or the client side
1121("connect") for the TLS connection, as unlike TCP, there is a clear
1122server/client relationship in TLS.
1123
1124That's all.
1125
1126Of course, all this should be handled transparently by C<http_get>
1127after parsing the URL. If you need this, see the part about exercising
1128your inspiration earlier in this document. You could also use the
1129L<AnyEvent::HTTP> module from CPAN, which implements all this and works
1130around a lot of quirks for you, too.
1131
1132=head3 The read queue - revisited
1133
1134HTTP always uses the same structure in its responses, but many protocols
1135require parsing responses differently depending on the response itself.
1136
1137For example, in SMTP, you normally get a single response line:
1138
1139 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1140
1141But SMTP also supports multi-line responses:
1142
1143 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1144 220-hey guys
1145 220 my response is longer than yours
1146
1147To handle this, we need C<unshift_read>. As the name (hopefully) implies,
1148C<unshift_read> will not append your read request to the end of the read
1149queue, but instead it will prepend it to the queue.
1150
1151This is useful in the situation above: Just push your response-line read
1152request when sending the SMTP command, and when handling it, you look at
1153the line to see if more is to come, and C<unshift_read> another reader
1154callback if required, like this:
1155
1156 my $response; # response lines end up in here
1157
1158 my $read_response; $read_response = sub {
1159 my ($handle, $line) = @_;
1160
1161 $response .= "$line\n";
1162
1163 # check for continuation lines ("-" as 4th character)
1164 if ($line =~ /^...-/) {
1165 # if yes, then unshift another line read
1166 $handle->unshift_read (line => $read_response);
1167
1168 } else {
1169 # otherwise we are done
1170
1171 # free callback
1172 undef $read_response;
1173
1174 print "we are done reading: $response\n";
701 ); 1175 }
1176 };
1177
1178 $handle->push_read (line => $read_response);
1179
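The decision logic of that callback, "keep reading while the 4th character is a dash", can be checked without a connection. In this sketch an ordinary array of made-up lines stands in for the data that AnyEvent::Handle would deliver line by line:

```perl
use strict;
use warnings;

# made-up server output: one three-line response, then another response
my @lines = (
    "220-mail.example.net ready",
    "220-hey guys",
    "220 my response is longer than yours",
    "250 next response",
);

my $response = "";

while (defined (my $line = shift @lines)) {
    $response .= "$line\n";

    # "-" as 4th character means more lines follow
    last unless $line =~ /^...-/;
}

print $response;
print scalar @lines, " line(s) left for the next reader\n";
```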
1180This recipe can be used for all similar parsing problems, for example in
1181NNTP, the response code to some commands indicates that more data will be
1182sent:
1183
1184 $handle->push_write ("article 42\015\012");
1185
1186 # read response line
1187 $handle->push_read (line => sub {
1188 my ($handle, $status) = @_;
1189
1190 # article data following?
1191 if ($status =~ /^2/) {
1192 # yes, read article body
1193
1194 $handle->unshift_read (line => "\012.\015\012", sub {
1195 my ($handle, $body) = @_;
1196
1197 $finish->($status, $body);
1198 });
1199
1200 } else {
1201 # some error occurred, no article data
1202
1203 $finish->($status);
1204 }
1205 });
1206
1207=head3 Your own read queue handler
1208
1209Sometimes, your protocol doesn't play nice and uses lines or chunks of
1210data not formatted in a way handled by AnyEvent::Handle out of the box. In
1211this case you have to implement your own read parser.
1212
1213To make up a contorted example, imagine you are looking for an even
1214number of characters followed by a colon (":"). Also imagine that
1215AnyEvent::Handle had no C<regex> read type which could be used, so you'd
1216have to do it manually.
1217
1218To implement a read handler for this, you would C<push_read> (or
1219C<unshift_read>) just a single code reference.
1220
1221This code reference will then be called each time there is (new) data
1222available in the read buffer, and is expected to either successfully
1223eat/consume some of that data (and return true) or to return false to
1224indicate that it wants to be called again.
1225
1226If the code reference returns true, then it will be removed from the
1227read queue (because it has parsed/consumed whatever it was supposed to
1228consume), otherwise it stays in the front of it.
1229
1230The example above could be coded like this:
702 1231
703 $handle->push_read (sub { 1232 $handle->push_read (sub {
704 my ($handle) = @_; 1233 my ($handle) = @_;
705 1234
706 if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) { 1235 # check for even number of characters + ":"
707 print "got 'end', existing...\n"; 1236 # and remove the data if a match is found.
708 $end_prog->broadcast; 1237 # if not, return false (actually nothing)
1238
1239 $handle->{rbuf} =~ s/^( (?:..)* ) ://x
709 return 1 1240 or return;
1241
1242 # we got some data in $1, pass it to whoever wants it
1243 $finish->($1);
1244
1245 # and return true to indicate we are done
710 } 1246 1
711
712 0
713 }); 1247 });
714 1248
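The substitution at the heart of this handler can be tried on plain strings, without a handle. In this sketch the function C<take_even_prefix> and the sample buffers are made up for illustration; the regex is the one used in the read handler.

```perl
use strict;
use warnings;

# try to consume an even number of characters followed by ":"
# from the buffer; returns the consumed data, or nothing
sub take_even_prefix {
    my ($rbuf) = @_;   # reference to the buffer string

    $$rbuf =~ s/^( (?:..)* ) ://x
        or return;

    my $data = $1;   # copy the capture before leaving the sub
    $data
}

my $complete   = "abcd:rest";
my $incomplete = "abc";

my $got  = take_even_prefix \$complete;     # consumes "abcd:"
my $none = take_even_prefix \$incomplete;   # no match, buffer untouched

print "got=$got left=$complete\n";
print "incomplete buffer still holds: $incomplete\n" unless defined $none;
```

The second call corresponds to the "return false and wait for more data" case: nothing is removed from the buffer, so the handler can simply be called again later.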
715 $end_prog->recv; 1249This concludes our little tutorial.
716 1250
717That's a mouthful, so let's go through it step by step: 1251=head1 Where to go from here?
718 1252
719 #!/usr/bin/perl 1253This introduction should have explained the key concepts of L<AnyEvent>
1254- event watchers and condition variables, L<AnyEvent::Socket> - basic
1255networking utilities, and L<AnyEvent::Handle> - a nice wrapper around
1256handles.
720 1257
721 use AnyEvent; 1258You could either start coding stuff right away, look at those manual
722 use AnyEvent::Handle; 1259pages for the gory details, or roam CPAN for other AnyEvent modules (such
1260as L<AnyEvent::IRC> or L<AnyEvent::HTTP>) to see more code examples (or
1261simply to use them).
723 1262
724Nothing unexpected here, just load AnyEvent for the event functionality 1263If you need a protocol that doesn't have an implementation using AnyEvent,
725and AnyEvent::Handle for your file handling needs. 1264remember that you can mix AnyEvent with one other event framework, such as
1265L<POE>, so you can always use AnyEvent for your own tasks plus modules of
1266one other event framework to fill any gaps.
726 1267
727 my $end_prog = AnyEvent->condvar; 1268And last but not least, you could also look at L<Coro>, especially
1269L<Coro::AnyEvent>, to see how you can turn event-based programming from
1270callback style back to the usual imperative style (also called "inversion
1271of control" - AnyEvent calls I<you>, but Coro lets I<you> call AnyEvent).
728 1272
729Here the program creates a so-called 'condition variable': Condition 1273=head1 Authors
730variables are a great way to signal the completion of some event, or to
731state that some condition became true (thus the name).
732
733This condition variable represents the condition that the program wants to
734terminate. Later in the program, we will 'recv' that condition (call the
735C<recv> method on it), which will wait until the condition gets signalled
736(which is done by calling the C<send> method on it).
737
738The next step is to create the handle object:
739
740 my $handle =
741 AnyEvent::Handle->new (
742 fh => \*STDIN,
743 on_eof => sub {
744 print "received EOF, exiting...\n";
745 $end_prog->broadcast;
746 },
747
748This handle object will read from standard input. Setting the C<on_eof>
749callback should be done for every file handle, as that is a condition that
750we always need to check for when working with file handles, to prevent
751reading or writing to a closed file handle, or getting stuck indefinitely
752in case of an error.
753
754Speaking of errors:
755
756 on_error => sub {
757 print "error while reading from STDIN: $!\n";
758 $end_prog->broadcast;
759 }
760 );
761
762The C<on_error> callback is also not required, but we set it here in case
763any error happens when we read from the file handle. It is usually a good
764idea to set this callback and at least print some diagnostic message: Even
765in our small example an error can happen. More on this later...
766
767 $handle->push_read (sub {
768
769Next we push a general read callback on the read queue, which
770will wait until we have received all the data we wanted to
771receive. L<AnyEvent::Handle> has two queues per file handle, a read and a
772write queue. The write queue queues pending data that waits to be written
773to the file handle. And the read queue queues reading callbacks. For more
774details see the L<AnyEvent::Handle> documentation on the READ QUEUE and
775WRITE QUEUE.
776
777 my ($handle) = @_;
778
779 if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) {
780 print "got 'end', exiting...\n";
781 $end_prog->broadcast;
782 return 1
783 }
784
785 0
786 });
787
788The actual callback waits until the word 'end' has been seen in the data
789received on standard input. Once we encounter the stop word 'end', we
790remove everything from the read buffer and signal the condition variable
791we set up earlier, which represents our 'end of program' condition. The
792callback then returns a true value, which signals that we are done reading
793all the data we were interested in (all data until the word 'end' has been
794seen).
795
796In all other cases, when the stop word has not been seen yet, we just
797return a false value, to indicate that we are not finished yet.
798
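The behaviour of the stop-word check can be tried on an ordinary string. This is only a sketch: C<saw_end> is a hypothetical helper, and the string C<$buf> stands in for the handle's read buffer.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the stop-word check from the callback to
# a plain string that stands in for the read buffer.
sub saw_end {
    my ($buf_ref) = @_;

    # On a match the whole buffer is emptied; the /s flag lets "."
    # match newlines as well, so multi-line input is covered.
    return $$buf_ref =~ s/^.*?\bend\b.*$//s ? 1 : 0;
}

my $buf = "hello\nbend it\n";
my $first = saw_end(\$buf);     # "bend" does not contain the word "end"

$buf .= "the end\nmore\n";
my $second = saw_end(\$buf);    # now the stop word is present

print "first=$first second=$second remaining=[$buf]\n";
```

Because of the C<\b> word boundaries, "bend" does not count as the stop word, so the first call leaves the buffer untouched. The second call finds "end" and removes everything from the buffer.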
799The C<rbuf> method returns our read buffer, which we can modify directly as
800an lvalue. Alternatively, we could also have written:
801
802 if ($handle->{rbuf} =~ s/^.*?\bend\b.*$//s) {
803
804The last line will wait for the condition that our program wants to exit:
805
806 $end_prog->recv;
807
808The call to C<recv> will set up an event loop for us and wait for IO, timer
809or signal events and will handle them until the condition gets sent (by
810calling its C<send> method).
811
812The key points to learn from this example are:
813
814=over 4
815
816=item * Condition variables are used to start an event loop.
817
818=item * How to register some basic callbacks on AnyEvent::Handle objects.
819
820=item * How to process data in the read buffer.
821
822=back
823
824=head1 AUTHORS
825 1274
826Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>. 1275Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>.
827 1276
