Comparing AnyEvent/lib/AnyEvent/Intro.pod (file contents):
Revision 1.11 by root, Mon Jun 2 09:10:38 2008 UTC vs.
Revision 1.26 by root, Thu Dec 24 10:07:06 2009 UTC

1=head1 NAME
2
3AnyEvent::Intro - an introductory tutorial to AnyEvent
4
1=head1 Introduction to AnyEvent 5=head1 Introduction to AnyEvent
2 6
3This is a tutorial that will introduce you to the features of AnyEvent. 7This is a tutorial that will introduce you to the features of AnyEvent.
4 8
5The first part introduces the core AnyEvent module (after swamping you a 9The first part introduces the core AnyEvent module (after swamping you a
6bit in evangelism), which might already provide all you ever need. 10bit in evangelism), which might already provide all you ever need: If you
11are only interested in AnyEvent's event handling capabilities, read no
12further.
7 13
8The second part focuses on network programming using sockets, for which 14The second part focuses on network programming using sockets, for which
9AnyEvent offers a lot of support you can use. 15AnyEvent offers a lot of support you can use, and a lot of workarounds
16for portability quirks.
10 17
11 18
12=head1 What is AnyEvent? 19=head1 What is AnyEvent?
13 20
14If you don't care for the whys and want to see code, skip this section! 21If you don't care for the whys and want to see code, skip this section!
16AnyEvent is first of all just a framework to do event-based 23AnyEvent is first of all just a framework to do event-based
17programming. Typically such frameworks are an all-or-nothing thing: If you 24programming. Typically such frameworks are an all-or-nothing thing: If you
18use one such framework, you can't (easily, or even at all) use another in 25use one such framework, you can't (easily, or even at all) use another in
19the same program. 26the same program.
20 27
21AnyEvent is different - it is a thin abstraction layer above all kinds 28AnyEvent is different - it is a thin abstraction layer on top of other
29event loops, just like DBI is an abstraction of many different database
22of event loops. Its main purpose is to move the choice of the underlying 30APIs. Its main purpose is to move the choice of the underlying framework
23framework (the event loop) from the module author to the program author 31(the event loop) from the module author to the program author using the
24using the module. 32module.
25 33
26That means you can write code that uses events to control what it 34That means you can write code that uses events to control what it
27does, without forcing other code in the same program to use the same 35does, without forcing other code in the same program to use the same
28underlying framework as you do - i.e. you can create a Perl module 36underlying framework as you do - i.e. you can create a Perl module
29that is event-based using AnyEvent, and users of that module can still 37that is event-based using AnyEvent, and users of that module can still
30choose between using L<Gtk2>, L<Tk>, L<Event> or no event loop at 38choose between using L<Gtk2>, L<Tk>, L<Event> (or run inside Irssi or
31all: AnyEvent comes with its own event loop implementation, so your 39rxvt-unicode) or any other supported event loop. AnyEvent even comes with
32code works regardless of other modules that might or might not be 40its own pure-perl event loop implementation, so your code works regardless
33installed. The latter is important, as AnyEvent does not have any 41of other modules that might or might not be installed. The latter is
34dependencies to other modules, which makes it easy to install, for 42important, as AnyEvent does not have any hard dependencies to other
35example, when you lack a C compiler. 43modules, which makes it easy to install, for example, when you lack a C
44compiler. No matter what environment, AnyEvent will just cope with it.
36 45
37A typical problem with Perl modules such as L<Net::IRC> is that they 46A typical limitation of existing Perl modules such as L<Net::IRC> is that
38come with their own event loop: In L<Net::IRC>, the program that uses it 47they come with their own event loop: In L<Net::IRC>, the program that uses
39needs to start the event loop of L<Net::IRC>. That means that one cannot 48it needs to start the event loop of L<Net::IRC>. That means that one
40integrate this module into a L<Gtk2> GUI for instance, as that module, 49cannot integrate this module into a L<Gtk2> GUI for instance, as that
41too, enforces the use of its own event loop (namely L<Glib>). 50module, too, enforces the use of its own event loop (namely L<Glib>).
42 51
43Another example is L<LWP>: it provides no event interface at all. It's a 52Another example is L<LWP>: it provides no event interface at all. It's
44pure blocking HTTP (and FTP etc.) client library, which usually means that 53a pure blocking HTTP (and FTP etc.) client library, which usually means
45you either have to start a thread or have to fork for a HTTP request, or 54that you either have to start another process or have to fork for a HTTP
46use L<Coro::LWP>, if you want to do something else while waiting for the 55request, or use threads (e.g. L<Coro::LWP>), if you want to do something
47request to finish. 56else while waiting for the request to finish.
48 57
49The motivation behind these designs is often that a module doesn't want to 58The motivation behind these designs is often that a module doesn't want
50depend on some complicated XS-module (Net::IRC), or that it doesn't want 59to depend on some complicated XS-module (Net::IRC), or that it doesn't
51to force the user to use some specific event loop at all (LWP). 60want to force the user to use some specific event loop at all (LWP), out
61of fear of severely limiting the usefulness of the module: If your module
62requires Glib, it will not run in a Tk program.
52 63
53L<AnyEvent> solves this dilemma, by B<not> forcing module authors to either 64L<AnyEvent> solves this dilemma, by B<not> forcing module authors to
65either:
54 66
55=over 4 67=over 4
56 68
57=item - write their own event loop (because guarantees to offer one 69=item - write their own event loop (because it guarantees the availability
58everywhere - even on windows). 70of an event loop everywhere - even on windows with no extra modules
71installed).
59 72
60=item - choose one fixed event loop (because AnyEvent works with all 73=item - choose one specific event loop (because AnyEvent works with most
61important event loops available for Perl, and adding others is trivial). 74event loops available for Perl).
62 75
63=back 76=back
64 77
65If the module author uses L<AnyEvent> for all his event needs (IO events, 78If the module author uses L<AnyEvent> for all his (or her) event needs
66timers, signals, ...) then all other modules can just use his module and 79(IO events, timers, signals, ...) then all other modules can just use
67don't have to choose an event loop or adapt to his event loop. The choice 80his module and don't have to choose an event loop or adapt to his event
68of the event loop is ultimately made by the program author who uses all 81loop. The choice of the event loop is ultimately made by the program
69the modules and writes the main program. And even there he doesn't have to 82author who uses all the modules and writes the main program. And even
70choose, he can just let L<AnyEvent> choose the best available event loop 83there he doesn't have to choose, he can just let L<AnyEvent> choose the
71for him. 84most efficient event loop available on the system.
72 85
73Read more about this in the main documentation of the L<AnyEvent> module. 86Read more about this in the main documentation of the L<AnyEvent> module.
74 87
75 88
76=head1 Introduction to Event-Based Programming 89=head1 Introduction to Event-Based Programming
101 ); 114 );
102 115
103 # do something else here 116 # do something else here
104 117
105Looks more complicated, and surely is, but the advantage of using events 118Looks more complicated, and surely is, but the advantage of using events
106is that your program can do something else instead of waiting for 119is that your program can do something else instead of waiting for input
120(side note: combining AnyEvent with a thread package such as Coro can
121recoup much of the simplicity, effectively getting the best of two
122worlds).
123
107input. Waiting as in the first example is also called "blocking" because 124Waiting as done in the first example is also called "blocking" the process
108you "block" your process from executing anything else while you do so. 125because you "block"/keep your process from executing anything else while
126you do so.
109 127
110The second example avoids blocking, by only registering interest in a read 128The second example avoids blocking by only registering interest in a read
111event, which is fast and doesn't block your process. Only when read data 129event, which is fast and doesn't block your process. Only when read data
112is available will the callback be called, which can then proceed to read 130is available will the callback be called, which can then proceed to read
113the data. 131the data.
114 132
115The "interest" is represented by an object returned by C<< AnyEvent->io 133The "interest" is represented by an object returned by C<< AnyEvent->io
116>> called a "watcher" object - called like that because it "watches" your 134>> called a "watcher" object - called like that because it "watches" your
117file handle (or other event sources) for the event you are interested in. 135file handle (or other event sources) for the event you are interested in.
118 136
119In the example above, we create an I/O watcher by calling the C<< 137In the example above, we create an I/O watcher by calling the C<<
120AnyEvent->io >> method. Disinterest in some event is simply expressed by 138AnyEvent->io >> method. Disinterest in some event is simply expressed
121forgetting about the watcher, for example, by C<undef>'ing the variable it 139by forgetting about the watcher, for example, by C<undef>'ing the only
122is stored in. AnyEvent will automatically clean up the watcher if it is no 140variable it is stored in. AnyEvent will automatically clean up the watcher
123longer used, much like Perl closes your file handles if you no longer use 141if it is no longer used, much like Perl closes your file handles if you no
124them anywhere. 142longer use them anywhere.
143
144=head3 A short note on callbacks
145
146A common issue that hits people is the problem of passing parameters
147to callbacks. Programmers used to languages such as C or C++ are often
148used to a style where one passes the address of a function (a function
149reference) and some data value, e.g.:
150
151 sub callback {
152 my ($arg) = @_;
153
154 $arg->method;
155 }
156
157 my $arg = ...;
158
159 call_me_back_later \&callback, $arg;
160
161This is clumsy, as the place where behaviour is specified (when the
162callback is registered) is often far away from the place where behaviour
163is implemented. It also doesn't use Perl syntax to invoke the code. There
164is also an abstraction penalty to pay as one has to I<name> the callback,
165which often is unnecessary and leads to nonsensical or duplicated names.
166
167In Perl, one can specify behaviour much more directly by using
168I<closures>. Closures are code blocks that take a reference to the
169enclosing scope(s) when they are created. This means lexical variables in
170scope at the time of creating the closure can simply be used inside the
171closure:
172
173 my $arg = ...;
174
175 call_me_back_later sub { $arg->method };
176
177Under most circumstances, closures are faster, use fewer resources and
178result in much clearer code than the traditional approach. Faster,
179because parameter passing and storing them in local variables in Perl
180is relatively slow. Fewer resources, because closures take references
181to existing variables without having to create new ones, and clearer
182code because it is immediately obvious that the second example calls the
183C<method> method when the callback is invoked.
184
185Apart from these, the strongest argument for using closures with AnyEvent
186is that AnyEvent does not allow passing parameters to the callback, so
187closures are the only way to achieve that in most cases :->
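
The closure style described above can be tried without any event loop at all. The following is a minimal sketch, not AnyEvent code: C<call_me_back_later> and C<run_pending> are hypothetical stand-ins for an event library that stores callbacks and invokes them later without arguments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an event library: it only stores
# callbacks and runs them later. Not part of AnyEvent.
my @pending;

sub call_me_back_later {
   my ($cb) = @_;
   push @pending, $cb;
}

sub run_pending {
   # empty the queue and invoke each callback without arguments
   $_->() for splice @pending;
}

my @log;

for my $name (qw(alice bob)) {
   # each closure captures its own copy of $name, so no
   # parameter has to be passed to the callback
   call_me_back_later sub { push @log, "hello, $name" };
}

run_pending;

print "$_\n" for @log;
```

Each callback carries the data it needs with it, which is why the registering code never has to pass (or even know about) any callback arguments.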
188
189
190=head3 A hint on debugging
191
192AnyEvent does, by default, not do any argument checking. This can lead to
193strange and unexpected results, especially if you are still finding your
194way around AnyEvent.
195
196AnyEvent supports a special "strict" mode - off by default - which does very
197strict argument checking, at the expense of being somewhat slower. During
198development, however, this mode is very useful.
199
200You can enable this strict mode either by having an environment variable
201C<PERL_ANYEVENT_STRICT> with a true value in your environment:
202
203 PERL_ANYEVENT_STRICT=1 perl test.pl
204
205Or you can write C<use AnyEvent::Strict> in your program, which has the
206same effect (do not do this in production, however).
207
125 208
126=head2 Condition Variables 209=head2 Condition Variables
127 210
128However, the above is not a fully working program, and will not work 211Back to the I/O watcher example: The code is not yet a fully working
129as-is. The reason is that your callback will not be invoked out of the 212program, and will not work as-is. The reason is that your callback will
130blue, you have to run the event loop. Also, event-based programs sometimes 213not be invoked out of the blue, you have to run the event loop. Also,
131have to block, too, as when there simply is nothing else to do and 214event-based programs sometimes have to block, too, as when there simply is
132everything waits for some events, it needs to block the process as well. 215nothing else to do and everything waits for some events, it needs to block
216the process as well until new events arrive.
133 217
134In AnyEvent, this is done using condition variables. Condition variables 218In AnyEvent, this is done using condition variables. Condition variables
135are named "condition variables" because they represent a condition that is 219are named "condition variables" because they represent a condition that is
136initially false and needs to be fulfilled. 220initially false and needs to be fulfilled.
137 221
139or even callbacks and many other things (and they are often called like 223or even callbacks and many other things (and they are often called like
140this in other frameworks). The important point is that you can create them 224this in other frameworks). The important point is that you can create them
141freely and later wait for them to become true. 225freely and later wait for them to become true.
142 226
143Condition variables have two sides - one side is the "producer" of the 227Condition variables have two sides - one side is the "producer" of the
144condition (whatever code detects the condition), the other side is the 228condition (whatever code detects and flags the condition), the other side
145"consumer" (the code that waits for that condition). 229is the "consumer" (the code that waits for that condition).
146 230
147In our example in the previous section, the producer is the event callback 231In our example in the previous section, the producer is the event callback
148and there is no consumer yet - let's change that now: 232and there is no consumer yet - let's change that right now:
149 233
150 use AnyEvent; 234 use AnyEvent;
151 235
152 $| = 1; print "enter your name> "; 236 $| = 1; print "enter your name> ";
153 237
174 print "your name is $name\n"; 258 print "your name is $name\n";
175 259
176This program creates an AnyEvent condvar by calling the C<< 260This program creates an AnyEvent condvar by calling the C<<
177AnyEvent->condvar >> method. It then creates a watcher as usual, but 261AnyEvent->condvar >> method. It then creates a watcher as usual, but
178inside the callback it C<send>'s the C<$name_ready> condition variable, 262inside the callback it C<send>'s the C<$name_ready> condition variable,
179which causes anybody waiting on it to continue. 263which causes whoever is waiting on it to continue.
180 264
181The "anybody" in this case is the code that follows, which calls C<< 265The "whoever" in this case is the code that follows, which calls C<<
182$name_ready->recv >>: The producer calls C<send>, the consumer calls 266$name_ready->recv >>: The producer calls C<send>, the consumer calls
183C<recv>. 267C<recv>.
184 268
185If there is no C<$name> available yet, then the call to C<< 269If there is no C<$name> available yet, then the call to C<<
186$name_ready->recv >> will halt your program until the condition becomes 270$name_ready->recv >> will halt your program until the condition becomes
196 280
197 my $name_ready = AnyEvent->condvar; 281 my $name_ready = AnyEvent->condvar;
198 282
199 my $wait_for_input = AnyEvent->io ( 283 my $wait_for_input = AnyEvent->io (
200 fh => \*STDIN, poll => "r", 284 fh => \*STDIN, poll => "r",
201 cb => sub { $name_ready->send (scalar = <STDIN>) } 285 cb => sub { $name_ready->send (scalar <STDIN>) }
202 ); 286 );
203 287
204 # do something else here 288 # do something else here
205 289
206 # now wait and fetch the name 290 # now wait and fetch the name
268This also shows that AnyEvent is quite flexible - you didn't have anything 352This also shows that AnyEvent is quite flexible - you didn't have anything
269to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just 353to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just
270worked. 354worked.
271 355
272Admittedly, the example is a bit silly - who would want to read names 356Admittedly, the example is a bit silly - who would want to read names
273form standard input in a Gtk+ application. But imagine that instead of 357from standard input in a Gtk+ application. But imagine that instead of
274doing that, you would make a HTTP request in the background and display 358doing that, you would make a HTTP request in the background and display
275its results. In fact, with event-based programming you can make many 359its results. In fact, with event-based programming you can make many
276http-requests in parallel in your program and still provide feedback to 360http-requests in parallel in your program and still provide feedback to
277the user and stay interactive. 361the user and stay interactive.
278 362
279In the next part you will see how to do just that - by implementing an 363And in the next part you will see how to do just that - by implementing an
280HTTP request, on our own, with the utility modules AnyEvent comes with. 364HTTP request, on our own, with the utility modules AnyEvent comes with.
281 365
282Before that, however, let's briefly look at how you would write your 366Before that, however, let's briefly look at how you would write your
283program using only AnyEvent, without ever calling some other event 367program using only AnyEvent, without ever calling some other event
284loop's run function. 368loop's run function.
285 369
286In the example using condition variables, we used that, and in fact, this 370In the example using condition variables, we used those to start waiting
287is the solution: 371for events, and in fact, condition variables are the solution:
288 372
289 my $quit_program = AnyEvent->condvar; 373 my $quit_program = AnyEvent->condvar;
290 374
291 # create AnyEvent watchers (or not) here 375 # create AnyEvent watchers (or not) here
292 376
293 $quit_program->recv; 377 $quit_program->recv;
294 378
295If any of your watcher callbacks decide to quit, they can simply call 379If any of your watcher callbacks decide to quit (this is often
380called an "unloop" in other frameworks), they can simply call C<<
296C<< $quit_program->send >>. Of course, they could also decide not to and 381$quit_program->send >>. Of course, they could also decide not to and
297simply call C<exit> instead, or they could decide not to quit, ever (e.g. 382simply call C<exit> instead, or they could decide not to quit, ever (e.g.
298in a long-running daemon program). 383in a long-running daemon program).
299 384
300In that case, you can simply use: 385If you don't need some clean quit functionality and just want to run the
386event loop, you can simply do this:
301 387
302 AnyEvent->condvar->recv; 388 AnyEvent->condvar->recv;
303 389
304And this is, in fact, closest to the idea of a main loop run function that 390And this is, in fact, closest to the idea of a main loop run function that
305AnyEvent offers. 391AnyEvent offers.
306 392
307=head2 Timers and other event sources 393=head2 Timers and other event sources
308 394
309So far, we have only used I/O watchers. These are useful mainly to find 395So far, we have only used I/O watchers. These are useful mainly to find
310out whether a Socket has data to read, or space to write more data. On sane 396out whether a socket has data to read, or space to write more data. On sane
311operating systems this also works for console windows/terminals (typically 397operating systems this also works for console windows/terminals (typically
312on standard input), serial lines, all sorts of other devices, basically 398on standard input), serial lines, all sorts of other devices, basically
313almost everything that has a file descriptor but isn't a file itself. (As 399almost everything that has a file descriptor but isn't a file itself. (As
314usual, "sane" excludes windows - on that platform you would need different 400usual, "sane" excludes windows - on that platform you would need different
315functions for all of these, complicating code immensely - think "socket 401functions for all of these, complicating code immensely - think "socket
337 423
338 # now wait till our time has come 424 # now wait till our time has come
339 $cv->recv; 425 $cv->recv;
340 426
341Unlike I/O watchers, timers are only interested in the amount of seconds 427Unlike I/O watchers, timers are only interested in the amount of seconds
342they have to wait. When that amount of time has passed, AnyEvent will 428they have to wait. When (at least) that amount of time has passed,
343invoke your callback. 429AnyEvent will invoke your callback.
344 430
345Unlike I/O watchers, which will call your callback as many times as there 431Unlike I/O watchers, which will call your callback as many times as there
346is data available, timers are one-shot: after they have "fired" once and 432is data available, timers are normally one-shot: after they have "fired"
347invoked your callback, they are dead and no longer do anything. 433once and invoked your callback, they are dead and no longer do anything.
348 434
349To get a repeating timer, such as a timer firing roughly once per second, 435To get a repeating timer, such as a timer firing roughly once per second,
350you have to recreate it: 436you can specify an C<interval> parameter:
351 437
352 use AnyEvent; 438 my $once_per_second = AnyEvent->timer (
353 439 after => 0, # first invoke ASAP
354 my $time_watcher; 440 interval => 1, # then invoke every second
355 441 cb => sub { # the callback to invoke
356 sub once_per_second { 442 $cv->send;
357 print "tick\n";
358 443 },
359 # (re-)create the watcher
360 $time_watcher = AnyEvent->timer (
361 after => 1,
362 cb => \&once_per_second,
363 ); 444 );
364 }
365
366 # now start the timer
367 once_per_second;
368
369Having to recreate your timer is a restriction put on AnyEvent that is
370present in most event libraries it uses. It is so annoying that some
371future version might work around this limitation, but right now, it's the
372only way to do repeating timers.
373
374Fortunately most timers aren't really repeating but specify timeouts of
375some sort.
376 445
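
As a complete program, a repeating timer might look like the following sketch. It assumes an AnyEvent version that supports the C<interval> parameter; the variable names, the 0.1 second interval and the limit of three ticks are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent->condvar;
my $ticks = 0;

# declare the variable first so the callback can refer to it
my $timer; $timer = AnyEvent->timer (
   after    => 0,     # first invoke ASAP
   interval => 0.1,   # then invoke every 0.1 seconds
   cb       => sub {
      print "tick\n";

      if (++$ticks == 3) {
         undef $timer;          # forgetting the watcher stops the timer
         $done->send ($ticks);  # wake up the consumer below
      }
   },
);

# block until the callback has run three times
my $count = $done->recv;
print "stopped after $count ticks\n";
```

Stopping the timer by C<undef>'ing the variable is the same "forget the watcher" idiom used for I/O watchers.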
377=head3 More esoteric sources 446=head3 More esoteric sources
378 447
379AnyEvent also has some other, more esoteric event sources you can tap 448AnyEvent also has some other, more esoteric event sources you can tap
380into: signal and child watchers. 449into: signal, child and idle watchers.
381 450
382Signal watchers can be used to wait for "signal events", which simply 451Signal watchers can be used to wait for "signal events", which simply
383means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>). 452means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>).
384 453
385Process watchers wait for a child process to exit. They are useful when 454Child-process watchers wait for a child process to exit. They are useful
386you fork a separate process and need to know when it exits, but you do not 455when you fork a separate process and need to know when it exits, but you
387wait for that by blocking. 456do not wait for that by blocking.
388 457
458Idle watchers invoke their callback when the event loop has handled all
459outstanding events, polled for new events and didn't find any, i.e., when
460your process is otherwise idle. They are useful if you want to do some
461non-trivial data processing that can be done when your program doesn't
462have anything better to do.
463
389Both watcher types are described in detail in the main L<AnyEvent> manual 464All these watcher types are described in detail in the main L<AnyEvent>
390page. 465manual page.
391 466
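
To give an idea of how signal and child watchers are used, here is a small sketch. It assumes a platform with a working C<fork> (so not plain Windows); the exit code 7 and the result strings are made up for the example. The signal watcher is deliberately created before forking, so that AnyEvent has already selected its event loop by the time the child exists.

```perl
use strict;
use warnings;
use AnyEvent;
use POSIX ();

my $done = AnyEvent->condvar;

# signal watcher: finish cleanly when somebody sends us SIGTERM
my $term_watcher = AnyEvent->signal (
   signal => "TERM",
   cb     => sub { $done->send ("terminated") },
);

# fork a child that exits immediately with exit code 7
my $pid = fork;
defined $pid or die "fork failed: $!";
POSIX::_exit (7) if $pid == 0;

# child watcher: invoked once the child process has exited
my $child_watcher = AnyEvent->child (
   pid => $pid,
   cb  => sub {
      my ($pid, $status) = @_;
      # $status is in the same format as $?, the exit code is in the high byte
      $done->send ("child exited with code " . ($status >> 8));
   },
);

# wait for whichever event arrives first
my $result = $done->recv;
print "$result\n";
```

Unless the process receives a C<SIGTERM> in the meantime, the child watcher fires first and the program reports the child's exit code.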
467Sometimes you also need to know what the current time is: C<<
468AnyEvent->now >> returns the time the event toolkit uses to schedule
469relative timers, and is usually what you want. It is often cached (which
470means it can be a bit outdated). In that case, you can use the more costly
471C<< AnyEvent->time >> method which will ask your operating system for the
472current time, which is slower, but also more up to date.
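
The difference between the two methods can be observed with a sketch like the following. How far apart the two values are depends on the event loop in use and on when it last updated its notion of time, so no particular output should be expected.

```perl
use strict;
use warnings;
use AnyEvent;

# the (possibly cached) time the event loop uses for relative timers
my $now = AnyEvent->now;

# the current time as reported by the operating system
my $time = AnyEvent->time;

printf "event loop time: %.3f\n", $now;
printf "system time:     %.3f\n", $time;
printf "difference:      %.3f\n", $time - $now;
```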
392 473
393=head1 Network programming and AnyEvent 474=head1 Network programming and AnyEvent
394 475
395So far you have seen how to register event watchers and handle events. 476So far you have seen how to register event watchers and handle events.
396 477
397This is a great foundation to write network clients and servers, and might be 478This is a great foundation to write network clients and servers, and might
398all that your module (or program) ever requires, but writing your own I/O 479be all that your module (or program) ever requires, but writing your own
399buffering again and again becomes tedious, not to mention that it attracts 480I/O buffering again and again becomes tedious, not to mention that it
400errors. 481attracts errors.
401 482
402While the core L<AnyEvent> module is still small and self-contained, 483While the core L<AnyEvent> module is still small and self-contained,
403the distribution comes with some very useful utility modules such as 484the distribution comes with some very useful utility modules such as
404L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can 485L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can
405make your life as a non-blocking network programmer a lot easier. 486make your life as a non-blocking network programmer a lot easier.
413a great way to do other DNS resolution tasks, such as reverse lookups of 494a great way to do other DNS resolution tasks, such as reverse lookups of
414IP addresses for log files. 495IP addresses for log files.
415 496
416=head2 L<AnyEvent::Handle> 497=head2 L<AnyEvent::Handle>
417 498
418This module handles non-blocking IO on file handles in an event based 499This module handles non-blocking IO on (socket-, pipe- etc.) file handles
419manner. It provides a wrapper object around your file handle that provides 500in an event based manner. It provides a wrapper object around your file
420queueing and buffering of incoming and outgoing data for you. 501handle that provides queueing and buffering of incoming and outgoing data
502for you.
421 503
422It also implements the most common data formats, such as text lines, or 504It also implements the most common data formats, such as text lines, or
423fixed and variable-width data blocks. 505fixed and variable-width data blocks.
424 506
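
As a rough illustration of the buffering and line parsing described above, the following sketch connects two handle objects with a local socket pair, so no network is needed. It assumes that C<AnyEvent::Util::portable_socketpair> is available in your AnyEvent version; the message text and variable names are arbitrary.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util ();

# two connected sockets standing in for a real network connection
my ($fh_a, $fh_b) = AnyEvent::Util::portable_socketpair
   or die "unable to create socket pair: $!";

my $done = AnyEvent->condvar;

my $writer = AnyEvent::Handle->new (
   fh       => $fh_a,
   on_error => sub { $done->send },   # give up on any error
);

my $reader = AnyEvent::Handle->new (
   fh       => $fh_b,
   on_error => sub { $done->send },
);

# queue some outgoing data, the handle writes it when possible
$writer->push_write ("hello, world\015\012");

# ask for exactly one line, the callback gets it without the line ending
$reader->push_read (line => sub {
   my ($handle, $line) = @_;
   $done->send ($line);
});

my $line = $done->recv;
print "received: $line\n";
```

Neither C<push_write> nor C<push_read> blocks: both only queue a request, and the actual I/O happens while the program waits in C<recv>.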
425=head2 L<AnyEvent::Socket> 507=head2 L<AnyEvent::Socket>
443to your program? That C<WSAEINPROGRESS> means your C<connect> call was 525to your program? That C<WSAEINPROGRESS> means your C<connect> call was
444ignored instead of being in progress? AnyEvent::Socket works around all of 526ignored instead of being in progress? AnyEvent::Socket works around all of
445these Windows/Perl bugs for you). 527these Windows/Perl bugs for you).
446 528
447=head2 Implementing a parallel finger client with non-blocking connects 529=head2 Implementing a parallel finger client with non-blocking connects
530and AnyEvent::Socket
448 531
449The finger protocol is one of the simplest protocols in use on the 532The finger protocol is one of the simplest protocols in use on the
450internet. Or in use in the past, as almost nobody uses it anymore. 533internet. Or in use in the past, as almost nobody uses it anymore.
451 534
452It works by connecting to the finger port on another host, writing a 535It works by connecting to the finger port on another host, writing a
453single line with a user name and then reading the finger response, as 536single line with a user name and then reading the finger response, as
454specified by that user. OK, RFC 1288 specifies a vastly more complex 537specified by that user. OK, RFC 1288 specifies a vastly more complex
455protocol, but it basically boils down to this: 538protocol, but it basically boils down to this:
456 539
457 # telnet idsoftware.com finger 540 # telnet kernel.org finger
458 Trying 192.246.40.37... 541 Trying 204.152.191.37...
459 Connected to idsoftware.com (192.246.40.37). 542 Connected to kernel.org (204.152.191.37).
460 Escape character is '^]'. 543 Escape character is '^]'.
461 johnc 544
462 Welcome to id Software's Finger Service V1.5! 545 The latest stable version of the Linux kernel is: [...]
463
464 [...]
465 Now on the web:
466 [...]
467
468 Connection closed by foreign host. 546 Connection closed by foreign host.
469 547
470"Now on the web..." yeah, I<was> used indeed, but at least the finger 548So let's write a little AnyEvent function that makes a finger request:
471daemon still works, so let's write a little AnyEvent function that makes a
472finger request:
473 549
474 use AnyEvent; 550 use AnyEvent;
475 use AnyEvent::Socket; 551 use AnyEvent::Socket;
476 552
477 sub finger($$) { 553 sub finger($$) {
542socket handle as first argument, otherwise, nothing will be passed to our 618socket handle as first argument, otherwise, nothing will be passed to our
543callback. The important point is that it will always be called as soon as 619callback. The important point is that it will always be called as soon as
544the outcome of the TCP connect is known. 620the outcome of the TCP connect is known.
545 621
546This style of programming is also called "continuation style": the 622This style of programming is also called "continuation style": the
547"continuation" is simply the way the program continues - normally, a 623"continuation" is simply the way the program continues - normally at the
548program continues at the next line after some statement (the exception 624next line after some statement (the exception is loops or things like
549is loops or things like C<return>). When we are interested in events, 625C<return>). When we are interested in events, however, we instead specify
550however, we instead specify the "continuation" of our program by passing a 626the "continuation" of our program by passing a closure, which makes that
551closure, which makes that closure the "continuation" of the program. The 627closure the "continuation" of the program.
628
552C<tcp_connect> call is like saying "return now, and when the connection is 629The C<tcp_connect> call is like saying "return now, and when the
553established or it failed, continue there". 630connection is established or it failed, continue there".
554 631
555Now let's look at the callback/closure in more detail: 632Now let's look at the callback/closure in more detail:
556 633
557 # the callback receives the socket handle - or nothing 634 # the callback receives the socket handle - or nothing
558 my ($fh) = @_ 635 my ($fh) = @_
570report the results to anybody, certainly not the caller of our C<finger> 647report the results to anybody, certainly not the caller of our C<finger>
571function, and most event loops continue even after a C<die>! 648function, and most event loops continue even after a C<die>!
572 649
573This is why we instead C<return>, but also call C<< $cv->send >> without 650This is why we instead C<return>, but also call C<< $cv->send >> without
574any arguments to signal to the condvar consumer that something bad has 651any arguments to signal to the condvar consumer that something bad has
575happened. The return value of C<< $cv->send >> is irrelevant, as is the 652happened. The return value of C<< $cv->send >> is irrelevant, as is
576return value of our callback. The return statement is simply used for the 653the return value of our callback. The C<return> statement is simply
577side effect of, well, returning immediately from the callback. Checking 654used for the side effect of, well, returning immediately from the
578for errors and handling them this way is very common, which is why this 655callback. Checking for errors and handling them this way is very common,
579compact idiom is so handy. 656which is why this compact idiom is so handy.
580 657
581As the next step in the finger protocol, we send the username to the 658As the next step in the finger protocol, we send the username to the
582finger daemon on the other side of our connection: 659finger daemon on the other side of our connection (the kernel.org finger
660service doesn't actually wait for a username, but the net is running out
661of finger servers fast):
583 662
584 syswrite $fh, "$user\015\012"; 663 syswrite $fh, "$user\015\012";
585 664
586Note that this isn't 100% clean socket programming - the socket could, 665Note that this isn't 100% clean socket programming - the socket could,
587for whatever reasons, not accept our data. When writing a small amount 666for whatever reasons, not accept our data. When writing a small amount
605variable, but in a local one - if the callback returns, it would normally 684variable, but in a local one - if the callback returns, it would normally
606destroy the variable and its contents, which would in turn unregister our 685destroy the variable and its contents, which would in turn unregister our
607watcher. 686watcher.
608 687
609To avoid that, we C<undef>ine the variable in the watcher callback. This 688To avoid that, we C<undef>ine the variable in the watcher callback. This
610means that, when the C<tcp_connect> callback returns, that perl thinks 689means that, when the C<tcp_connect> callback returns, perl thinks (quite
611(quite correctly) that the read watcher is still in use - namely in the 690correctly) that the read watcher is still in use - namely in the callback,
612callback. 691and thus keeps it alive even if nothing else in the program refers to it
692anymore (it is much like Baron Münchhausen keeping himself from dying by
693pulling himself out of a swamp).
613 694
614The trick, however, is that instead of: 695The trick, however, is that instead of:
615 696
616 my $read_watcher = AnyEvent->io (... 697 my $read_watcher = AnyEvent->io (...
617 698
636 my $len = sysread $fh, $response, 1024, length $response; 717 my $len = sysread $fh, $response, 1024, length $response;
637 718
638 if ($len <= 0) { 719 if ($len <= 0) {
639 720
640Note that C<sysread> has the ability to append data it reads to a scalar, 721Note that C<sysread> has the ability to append data it reads to a scalar,
641by specifying an offset, which is what we make good use of in this 722by specifying an offset, a feature we make good use of in this
642example. 723example.
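To see the offset argument at work outside of an event loop, here is a minimal sketch using only core Perl. The pipe and the tiny 4-byte read size are assumptions made purely for illustration, so that several reads are needed to collect the whole message:

```perl
use strict;
use warnings;

# A pipe gives us something to read from without needing a network.
pipe (my $reader, my $writer) or die "pipe: $!";

# Write two chunks, then close the write end so the reader sees EOF.
syswrite $writer, "Hello, ";
syswrite $writer, "finger!\n";
close $writer;

my $response = "";

while (1) {
    # The fourth argument is the offset: new data is stored at the
    # current end of $response instead of overwriting it.
    my $len = sysread $reader, $response, 4, length $response;

    die "read error: $!" unless defined $len;
    last if $len == 0;    # EOF, nothing more to append
}

print $response;
```

Without the offset, each call would replace the contents of C<$response>, and the caller would have to concatenate the pieces by hand.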
643 724
644When C<sysread> indicates we are done, the callback C<undef>ines 725When C<sysread> indicates we are done, the callback C<undef>ines
645the watcher and then C<send>s the response data to the condition 726the watcher and then C<send>s the response data to the condition
646variable. All this has the following effects: 727variable. All this has the following effects:
660But the main advantage is that we can not only run this finger function in 741But the main advantage is that we can not only run this finger function in
661the background, we even can run multiple sessions in parallel, like this: 742the background, we even can run multiple sessions in parallel, like this:
662 743
663 my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets 744 my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets
664 my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736 745 my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736
665 my $f3 = finger "johnc", "idsoftware.com"; # finger john 746 my $f3 = finger "hpa" , "kernel.org"; # finger hpa
666 747
667 print "trouble tickets:\n", $f1->recv, "\n"; 748 print "trouble tickets:\n" , $f1->recv, "\n";
668 print "trouble ticket #1736:\n", $f2->recv, "\n"; 749 print "trouble ticket #1736:\n", $f2->recv, "\n";
669 print "john carmacks finger file: ", $f3->recv, "\n"; 750 print "kernel release info: " , $f3->recv, "\n";
670 751
671It doesn't look like it, but in fact all three requests run in 752It doesn't look like it, but in fact all three requests run in
672parallel. The code waits for the first finger request to finish first, but 753parallel. The code waits for the first finger request to finish first, but
673that doesn't keep it from executing them in parallel: when the first C<recv> 754that doesn't keep it from executing them in parallel: when the first C<recv>
674call sees that the data isn't ready yet, it serves events for all three 755call sees that the data isn't ready yet, it serves events for all three
702How you implement it is a matter of taste - if you expect your function to 783How you implement it is a matter of taste - if you expect your function to
703be used mainly in an event-based program you would normally prefer to pass 784be used mainly in an event-based program you would normally prefer to pass
704a callback directly. If you write a module and expect your users to use 785a callback directly. If you write a module and expect your users to use
705it "synchronously" often (for example, a simple http-get script would not 786it "synchronously" often (for example, a simple http-get script would not
706really care much for events), then you would use a condition variable and 787really care much for events), then you would use a condition variable and
707tell them "simply ->recv the data". 788tell them "simply C<< ->recv >> the data".
708 789
709=head3 Problems with the implementation and how to fix them 790=head3 Problems with the implementation and how to fix them
710 791
711To make this example more real-world-ready, we would not only implement 792To make this example more real-world-ready, we would not only implement
712some write buffering (for the paranoid), but we would also have to handle 793some write buffering (for the paranoid, or maybe denial-of-service aware
713timeouts and maybe protocol errors. 794security expert), but we would also have to handle timeouts and maybe
795protocol errors.
714 796
715Doing this quickly gets unwieldy, which is why we introduce 797Doing this quickly gets unwieldy, which is why we introduce
716L<AnyEvent::Handle> in the next section, which takes care of all these 798L<AnyEvent::Handle> in the next section, which takes care of all these
717details for you and lets you concentrate on the actual protocol. 799details for you and lets you concentrate on the actual protocol.
718 800
719 801
720=head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle 802=head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle
721 803
722The L<AnyEvent::Handle> module has been hyped quite a bit so far, so let's 804The L<AnyEvent::Handle> module has been hyped quite a bit in this document
723see what it really offers. 805so far, so let's see what it really offers.
724 806
725As finger is such a simple protocol, let's try something slightly more 807As finger is such a simple protocol, let's try something slightly more
726complicated: HTTP/1.0. 808complicated: HTTP/1.0.
727 809
728An HTTP GET request works by sending a single request line that indicates 810An HTTP GET request works by sending a single request line that indicates
754The C<GET ...> and the empty line were entered manually, the rest of the 836The C<GET ...> and the empty line were entered manually, the rest of the
755telnet output is google's response, in this case a C<404 not found> one. 837telnet output is google's response, in this case a C<404 not found> one.
756 838
757So, here is how you would do it with C<AnyEvent::Handle>: 839So, here is how you would do it with C<AnyEvent::Handle>:
758 840
759###TODO 841 sub http_get {
842 my ($host, $uri, $cb) = @_;
843
844 # store results here
845 my ($response, $header, $body);
846
847 my $handle; $handle = new AnyEvent::Handle
848 connect => [$host => 'http'],
849 on_error => sub {
850 $cb->("HTTP/1.0 500 $!");
851 $handle->destroy; # explicitly destroy handle
852 },
853 on_eof => sub {
854 $cb->($response, $header, $body);
855 $handle->destroy; # explicitly destroy handle
856 };
857
858 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
859
860 # now fetch response status line
861 $handle->push_read (line => sub {
862 my ($handle, $line) = @_;
863 $response = $line;
864 });
865
866 # then the headers
867 $handle->push_read (line => "\015\012\015\012", sub {
868 my ($handle, $line) = @_;
869 $header = $line;
870 });
871
872 # and finally handle any remaining data as body
873 $handle->on_read (sub {
874 $body .= $_[0]->rbuf;
875 $_[0]->rbuf = "";
876 });
877 }
760 878
761And now let's go through it step by step. First, as usual, the overall 879And now let's go through it step by step. First, as usual, the overall
762C<http_get> function structure: 880C<http_get> function structure:
763 881
764 sub http_get { 882 sub http_get {
765 my ($host, $uri, $cb) = @_; 883 my ($host, $uri, $cb) = @_;
766 884
767 tcp_connect $host, "http", sub { 885 # store results here
768 ... 886 my ($response, $header, $body);
769 }; 887
888 my $handle; $handle = new AnyEvent::Handle
889 ... create handle object
890
891 ... push data to write
892
893 ... push what to expect to read queue
770 } 894 }
771 895
772Unlike in the finger example, this time the caller has to pass a callback 896Unlike in the finger example, this time the caller has to pass a callback
773to C<http_get>. Also, instead of passing a URL as one would expect, the 897to C<http_get>. Also, instead of passing a URL as one would expect, the
774caller has to provide the hostname and URI - normally you would use the 898caller has to provide the hostname and URI - normally you would use the
775C<URI> module to parse a URL and separate it into those parts, but that is 899C<URI> module to parse a URL and separate it into those parts, but that is
776left to the inspired reader :) 900left to the inspired reader :)
777 901
778Since everything else is left to the caller, all C<http_get> does is to 902Since everything else is left to the caller, all C<http_get> does is to
779initiate the connection with C<tcp_connect> and leave everything else to 903initiate the connection by creating the AnyEvent::Handle object (which
780its callback. 904calls C<tcp_connect> for us) and leave everything else to its callback.
781 905
782The first thing the callback does is check for connection errors and 906The handle object is created, unsurprisingly, by calling the C<new>
783declare some variables: 907method of L<AnyEvent::Handle>:
784 908
785 my ($fh) = @_ 909 my $handle; $handle = new AnyEvent::Handle
910 connect => [$host => 'http'],
911 on_error => sub {
786 or $cb->("HTTP/1.0 500 $!"); 912 $cb->("HTTP/1.0 500 $!");
787 913 $handle->destroy; # explicitly destroy handle
914 },
915 on_eof => sub {
788 my ($response, $header, $body); 916 $cb->($response, $header, $body);
917 $handle->destroy; # explicitly destroy handle
918 };
919
920The C<connect> argument tells AnyEvent::Handle to call C<tcp_connect> for
921the specified host and service/port.
922
923The C<on_error> callback will be called on any unexpected error, such as a
924refused connection, or an unexpected connection close while reading the header.
789 925
790Instead of having an extra mechanism to signal errors, connection errors 926Instead of having an extra mechanism to signal errors, connection errors
791are signalled by crafting a special "response status line", like this: 927are signalled by crafting a special "response status line", like this:
792 928
793 HTTP/1.0 500 Connection refused 929 HTTP/1.0 500 Connection refused
794 930
795This means the caller cannot distinguish (easily) between 931This means the caller cannot distinguish (easily) between
796locally-generated errors and server errors, but it simplifies error 932locally-generated errors and server errors, but it simplifies error
797handling for the caller a lot. 933handling for the caller a lot.
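On the caller's side, this convention means a single check of the status line covers both kinds of failure. The following is only a sketch: C<parse_status> is a hypothetical helper, not part of AnyEvent, and the pattern handles just the simple status-line shape shown above:

```perl
use strict;
use warnings;

# Hypothetical helper: split a status line into version, code and message.
# Returns the empty list if the line does not look like a status line.
sub parse_status {
    my ($line) = @_;

    $line =~ m{^HTTP/(\d+\.\d+) \s+ (\d{3}) \s* (.*)$}x
        or return;

    return ($1, $2, $3);
}

my ($version, $code, $message) = parse_status ("HTTP/1.0 500 Connection refused");

# locally generated errors and real server errors look the same here
print "request failed: $message\n" if $code >= 500;
```

The price, as noted, is that the caller cannot tell whether the 500 came from the server or was made up locally.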
798 934
799The next step finally involves L<AnyEvent::Handle>, namely it creates the 935The error callback also destroys the handle explicitly, because we are not
800handle object: 936interested in continuing after any errors. In AnyEvent::Handle callbacks
937you have to call C<destroy> explicitly to destroy a handle. Outside of
938those callbacks you can just forget the object reference and it will be
939automatically cleaned up.
801 940
802 my $handle; $handle = new AnyEvent::Handle 941Last but not least, we set an C<on_eof> callback that is called when the
803 fh => $fh, 942other side indicates it has stopped writing data, which we will use to
804 on_error => sub { 943gracefully shut down the handle and report the results. This callback is
805 undef $handle; 944only called when the read queue is empty - if the read queue expects some
806 $cb->("HTTP/1.0 500 $!"); 945data and the handle gets an EOF from the other side this will be an error
807 }, 946- after all, you did expect more to come.
808 on_eof => sub {
809 undef $handle; # keep it alive till eof
810 $cb->($response, $header, $body);
811 };
812 947
813The constructor expects a file handle, which gets passed via the C<fh> 948If you wanted to write a server using AnyEvent::Handle, you would use
949C<tcp_accept> and then create the AnyEvent::Handle with the C<fh>
814argument. 950argument.
815 951
816The remaining two argument pairs specify two callbacks to be called on
817any errors (C<on_error>) and in the case of a normal connection close
818(C<on_eof>).
819
820In the first case, we C<undef>ine the handle object and pass the error to
821the callback provided by the caller - done.
822
823In the second case we assume everything went fine and pass the results
824gobbled up so far to the caller-provided callback. This is not quite
825perfect, as when the server "cleanly" closes the connection in the middle
826of sending headers we might wrongly report this as an "OK" to the caller,
827but then, HTTP doesn't support a perfect mechanism that would detect such
828problems in all cases, so we don't bother either.
829
830=head3 The write queue 952=head3 The write queue
831 953
832The next line sends the actual request: 954The next line sends the actual request:
833 955
834 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012"); 956 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
835 957
836No headers will be sent (this is fine for simple requests), so the whole 958No headers will be sent (this is fine for simple requests), so the whole
837request is just a single line followed by an empty line to signal the end 959request is just a single line followed by an empty line to signal the end
838of the headers to the server. 960of the headers to the server.
839 961
840The more interesting question is why the method is called C<push_write> 962The more interesting question is why the method is called C<push_write>
841and not just write. The reason is that you can I<always> add some write 963and not just write. The reason is that you can I<always> add some write
842data without blocking, and to do this, AnyEvent::Handle needs some write 964data without blocking, and to do this, AnyEvent::Handle needs some write
843queue internally - and C<push_write> simply pushes some data at the end of 965queue internally - and C<push_write> simply pushes some data onto the end
844that queue, just like Perl's C<push> pushes data at the end of an array. 966of that queue, just like Perl's C<push> pushes data onto the end of an
967array.
845 968
846The deeper reason is that at some point in the future, there might 969The deeper reason is that at some point in the future, there might
847be C<unshift_write> as well, and in any case, we will shortly meet 970be C<unshift_write> as well, and in any case, we will shortly meet
848C<push_read> and C<unshift_read>, and it's usually easiest if all those 971C<push_read> and C<unshift_read>, and it's usually easiest to remember if
849functions have some symmetry in their name. 972all those functions have some symmetry in their name. So C<push> is used
973as the opposite of C<unshift> in AnyEvent::Handle, not as the opposite of
974C<pull> - just like in Perl.
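The naming analogy can be shown with a plain Perl array. This sketch has nothing to do with AnyEvent::Handle internals; the queue entries are made-up labels that only illustrate the ordering:

```perl
use strict;
use warnings;

my @queue;

push    @queue, "status line";   # appended at the end
push    @queue, "headers";       # appended after the status line
unshift @queue, "urgent";        # jumps to the front of the queue

# Entries are then handled from the front of the queue.
my @order;
push @order, shift @queue while @queue;

print join (", ", @order), "\n";
```

So C<push_*> means "handle this after everything already queued", and C<unshift_*> means "handle this next".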
975
976Note that we call C<push_write> right after creating the AnyEvent::Handle
977object, before it has had time to actually connect to the server. This is
978fine, pushing the read and write requests will simply queue them in the
979handle object until the connection has been established. Alternatively, we
980could do this "on demand" in the C<on_connect> callback.
850 981
851If C<push_write> is called with more than one argument, then you can even 982If C<push_write> is called with more than one argument, then you can even
852do I<formatted> I/O, which simply means your data will be transformed in 983do I<formatted> I/O, which simply means your data will be transformed in
853some ways. For example, this would JSON-encode your data before pushing it 984some ways. For example, this would JSON-encode your data before pushing it
854to the write queue: 985to the write queue:
856 $handle->push_write (json => [1, 2, 3]); 987 $handle->push_write (json => [1, 2, 3]);
857 988
858Apart from that, this pretty much summarises the write queue, there is 989Apart from that, this pretty much summarises the write queue, there is
859little else to it. 990little else to it.
860 991
861Reading the response if far more interesting: 992Reading the response is far more interesting, because it involves the more
993powerful and complex I<read queue>:
862 994
863=head3 The read queue 995=head3 The read queue
864 996
865the response consists of three parts: a single line of response status, a 997The response consists of three parts: a single line with the response
866single paragraph of headers ended by an empty line, and the response body, 998status, a single paragraph of headers ended by an empty line, and the
867which is simply the remaining data on that connection. 999response body, which is simply the remaining data on that connection.
868 1000
869For the first two, we push two read requests onto the read queue: 1001For the first two, we push two read requests onto the read queue:
870 1002
871 # now fetch response status line 1003 # now fetch response status line
872 $handle->push_read (line => sub { 1004 $handle->push_read (line => sub {
873 my ($handle, $line) = @_; 1005 my ($handle, $line) = @_;
874 $response = $line; 1006 $response = $line;
875 }); 1007 });
876 1008
877 # then the headers 1009 # then the headers
878 $handle->push_read (line => "\015\012\015\012", sub { 1010 $handle->push_read (line => "\015\012\015\012", sub {
879 my ($handle, $line) = @_; 1011 my ($handle, $line) = @_;
880 $header = $line; 1012 $header = $line;
881 }); 1013 });
882 1014
883While one can simply push a single callback to the queue, I<formatted> I/O 1015While one can simply push a single callback onto the queue to parse the
884really comes to out advantage here, as there is a ready-made "read line" 1016data, I<formatted> I/O really comes to our advantage here, as there
885read type. The first read expects a single line, ended by C<\015\012> (the 1017is a ready-made "read line" read type. The first read expects a single
886standard end-of-line marker in internet protocols). 1018line, ended by C<\015\012> (the standard end-of-line marker in internet
1019protocols).
887 1020
888The second "line" is actually a single paragraph - instead of reading it 1021The second "line" is actually a single paragraph - instead of reading it
889line by line we tell C<push_read> that the end-of-line marker is really 1022line by line we tell C<push_read> that the end-of-line marker is really
890C<\015\012\015\012>, which is an empty line. The result is that the whole 1023C<\015\012\015\012>, which is an empty line. The result is that the whole
891header paragraph will be treated as a single line and read. The word 1024header paragraph will be treated as a single line and read. The word
896many requests as we want, and AnyEvent::Handle will handle them in order. 1029many requests as we want, and AnyEvent::Handle will handle them in order.
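To make the effect of the two "line" reads concrete, here is a rough emulation on a plain string, using only core Perl. It assumes the complete response is already sitting in the buffer, which a real connection does not guarantee, and the sample response is invented for the example:

```perl
use strict;
use warnings;

# An invented response, with lines joined by the internet EOL marker.
my $rbuf = join "\015\012",
    "HTTP/1.0 200 OK",
    "Content-Type: text/plain",
    "Content-Length: 5",
    "",
    "hello";

# first "line": everything up to the first \015\012
$rbuf =~ s/^(.*?)\015\012//s or die "no status line yet";
my $response = $1;

# second "line": the header paragraph, ended by an empty line
$rbuf =~ s/^(.*?)\015\012\015\012//s or die "headers incomplete";
my $header = $1;

# whatever remains is the body
my $body = $rbuf;

print "$response\n$body\n";
```

Each substitution removes what it matched from the front of the buffer, which mirrors how each queued read request consumes its part of the data before the next one gets a turn.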
897 1030
898There is, however, no read type for "the remaining data". For that, we 1031There is, however, no read type for "the remaining data". For that, we
899install our own C<on_read> callback: 1032install our own C<on_read> callback:
900 1033
901 # and finally handle any remaining data as body 1034 # and finally handle any remaining data as body
902 $handle->on_read (sub { 1035 $handle->on_read (sub {
903 $body .= $_[0]->rbuf; 1036 $body .= $_[0]->rbuf;
904 $_[0]->rbuf = ""; 1037 $_[0]->rbuf = "";
905 }); 1038 });
906 1039
907This callback is invoked every time data arrives and the read queue is 1040This callback is invoked every time data arrives and the read queue is
908empty - which in this example will only be the case when both response and 1041empty - which in this example will only be the case when both response and
909header have been read. 1042header have been read. The C<on_read> callback could actually have been
1043specified when constructing the object, but doing it this way preserves
1044logical ordering.
910 1045
1046The read callback simply adds the current read buffer to its C<$body>
1047variable and, most importantly, I<empties> the buffer by assigning the
1048empty string to it.
911 1049
912############################################################################# 1050After AnyEvent::Handle has been so instructed, it will handle incoming
1051data according to these instructions - if all goes well, the callback will
1052be invoked with the response data, if not, it will get an error.
913 1053
914Now let's start with something simple: a program that reads from standard 1054In general, you can implement pipelining (a semi-advanced feature of many
915input in a non-blocking way, that is, in a way that lets your program do 1055protocols) very easily with AnyEvent::Handle: If you have a protocol with a
916other things while it is waiting for input. 1056request/response structure, your request methods/functions will all look
1057like this (simplified):
917 1058
918First, the full program listing: 1059 sub request {
919 1060
920 #!/usr/bin/perl 1061 # send the request to the server
1062 $handle->push_write (...);
921 1063
922 use AnyEvent; 1064 # push some response handlers
923 use AnyEvent::Handle; 1065 $handle->push_read (...);
1066 }
924 1067
925 my $end_prog = AnyEvent->condvar; 1068This means you can queue as many requests as you want, and while
1069AnyEvent::Handle goes through its read queue to handle the response data,
1070the other side can work on the next request - queueing the request just
1071appends some data to the write queue and installs a handler to be called
1072later.
926 1073
927 my $handle = 1074You might ask yourself how to handle decisions you can only make I<after>
928 AnyEvent::Handle->new ( 1075you have received some data (such as handling a short error response or a
929 fh => \*STDIN, 1076long and differently-formatted response). The answer to this problem is
930 on_eof => sub { 1077C<unshift_read>, which we will introduce together with an example in the
931 print "received EOF, exiting...\n"; 1078coming sections.
932 $end_prog->broadcast; 1079
933 }, 1080=head3 Using C<http_get>
934 on_error => sub { 1081
935 print "error while reading from STDIN: $!\n"; 1082Finally, here is how you would use C<http_get>:
936 $end_prog->broadcast; 1083
1084 http_get "www.google.com", "/", sub {
1085 my ($response, $header, $body) = @_;
1086
1087 print
1088 $response, "\n",
1089 $body;
1090 };
1091
1092And of course, you can run as many of these requests in parallel as you
1093want (and your memory supports).
1094
1095=head3 HTTPS
1096
1097Now, as promised, let's implement the same thing for HTTPS, or more
1098correctly, let's change our C<http_get> function into a function that
1099speaks HTTPS instead.
1100
1101HTTPS is, quite simply, a standard TLS connection (B<T>ransport B<L>ayer
1102B<S>ecurity is the official name for what most people refer to as C<SSL>)
1103that contains standard HTTP protocol exchanges. The only other difference
1104to HTTP is that by default it uses port C<443> instead of port C<80>.
1105
1106To implement these two differences we need two tiny changes, first, in the
1107C<connect> parameter, we replace C<http> by C<https> to connect to the
1108https port:
1109
1110 connect => [$host => 'https'],
1111
1112The other change deals with TLS, which is something L<AnyEvent::Handle>
1113does for us, as long as I<you> made sure that the L<Net::SSLeay> module
1114is around. To enable TLS with L<AnyEvent::Handle>, we simply pass an
1115additional C<tls> parameter to the call to C<AnyEvent::Handle::new>:
1116
1117 tls => "connect",
1118
1119Specifying C<tls> enables TLS, and the argument specifies whether
1120AnyEvent::Handle is the server side ("accept") or the client side
1121("connect") for the TLS connection, as unlike TCP, there is a clear
1122server/client relationship in TLS.
1123
1124That's all.
1125
1126Of course, all this should be handled transparently by C<http_get>
1127after parsing the URL. If you need this, see the part about exercising
1128your inspiration earlier in this document. You could also use the
1129L<AnyEvent::HTTP> module from CPAN, which implements all this and works
1130around a lot of quirks for you, too.
1131
1132=head3 The read queue - revisited
1133
1134HTTP always uses the same structure in its responses, but many protocols
1135require parsing responses differently depending on the response itself.
1136
1137For example, in SMTP, you normally get a single response line:
1138
1139 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1140
1141But SMTP also supports multi-line responses:
1142
1143 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1144 220-hey guys
1145 220 my response is longer than yours
1146
1147To handle this, we need C<unshift_read>. As the name (hopefully) implies,
1148C<unshift_read> will not append your read request to the end of the read
1149queue, but instead it will prepend it to the queue.
1150
1151This is useful in the situation above: Just push your response-line read
1152request when sending the SMTP command, and when handling it, you look at
1153the line to see if more is to come, and C<unshift_read> another reader
1154callback if required, like this:
1155
1156 my $response; # response lines end up in here
1157
1158 my $read_response; $read_response = sub {
1159 my ($handle, $line) = @_;
1160
1161 $response .= "$line\n";
1162
1163 # check for continuation lines ("-" as 4th character)
1164 if ($line =~ /^...-/) {
1165 # if yes, then unshift another line read
1166 $handle->unshift_read (line => $read_response);
1167
1168 } else {
1169 # otherwise we are done
1170
1171 # free callback
1172 undef $read_response;
937 } 1173
1174 print "we are done reading: $response\n";
938 ); 1175 }
1176 };
1177
1178 $handle->push_read (line => $read_response);
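The continuation test itself can be tried without any event loop. In this sketch the array of lines stands in for what the handle would deliver one line at a time, and the fourth line is an invented follow-up response that must not be consumed:

```perl
use strict;
use warnings;

# Stand-in for lines arriving from the server, one per read request.
my @lines = (
    '220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>',
    '220-hey guys',
    '220 my response is longer than yours',
    '250 this line belongs to the next response',
);

my $response = "";

while (defined (my $line = shift @lines)) {
    $response .= "$line\n";

    # "-" as 4th character means another line of this response follows
    last unless $line =~ /^...-/;
}

print $response;
```

The loop stops after the first line without a dash, which corresponds to the callback no longer calling C<unshift_read>.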
1179
1180This recipe can be used for all similar parsing problems, for example in
1181NNTP, the response code to some commands indicates that more data will be
1182sent:
1183
1184 $handle->push_write ("article 42");
1185
1186 # read response line
1187 $handle->push_read (line => sub {
1188 my ($handle, $status) = @_;
1189
1190 # article data following?
1191 if ($status =~ /^2/) {
1192 # yes, read article body
1193
1194 $handle->unshift_read (line => "\012.\015\012", sub {
1195 my ($handle, $body) = @_;
1196
1197 $finish->($status, $body);
1198 });
1199
1200 } else {
1201 # some error occurred, no article data
1202
1203 $finish->($status);
1204 }
1205 });
1206
1207=head3 Your own read queue handler
1208
1209Sometimes, your protocol doesn't play nice and uses lines or chunks of
1210data not formatted in a way handled by AnyEvent::Handle out of the box. In
1211this case you have to implement your own read parser.
1212
1213To make up a contorted example, imagine you are looking for an even
1214number of characters followed by a colon (":"). Also imagine that
1215AnyEvent::Handle had no C<regex> read type which could be used, so you'd
1216have to do it manually.
1217
1218To implement a read handler for this, you would C<push_read> (or
1219C<unshift_read>) just a single code reference.
1220
1221This code reference will then be called each time there is (new) data
1222available in the read buffer, and is expected to either successfully
1223eat/consume some of that data (and return true) or to return false to
1224indicate that it wants to be called again.
1225
1226If the code reference returns true, then it will be removed from the
1227read queue (because it has parsed/consumed whatever it was supposed to
1228consume), otherwise it stays in the front of it.
1229
1230The example above could be coded like this:
939 1231
940 $handle->push_read (sub { 1232 $handle->push_read (sub {
941 my ($handle) = @_; 1233 my ($handle) = @_;
942 1234
943 if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) { 1235 # check for even number of characters + ":"
944 print "got 'end', existing...\n"; 1236 # and remove the data if a match is found.
945 $end_prog->broadcast; 1237 # if not, return false (actually nothing)
1238
1239 $handle->{rbuf} =~ s/^( (?:..)* ) ://x
946 return 1 1240 or return;
1241
1242 # we got some data in $1, pass it to whoever wants it
1243 $finish->($1);
1244
1245 # and return true to indicate we are done
947 } 1246 1
948
949 0
950 }); 1247 });
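The matching logic can be exercised on a plain scalar before wiring it into a handle. In this sketch C<eat_even> is a hypothetical stand-in for the read handler: it takes a reference to a buffer instead of a handle object, and returns the consumed data instead of calling C<$finish>:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the read handler: returns the consumed data
# on success, or the empty list if the buffer does not match (yet).
sub eat_even {
    my ($rbuf_ref) = @_;

    # even number of characters followed by ":", removed on a match
    $$rbuf_ref =~ s/^( (?:..)* ) ://x
        or return;

    return $1;
}

my $rbuf = "abcd";                  # no colon yet
my @first = eat_even (\$rbuf);      # no match, buffer is left untouched

$rbuf .= ":rest";                   # more data arrives
my @second = eat_even (\$rbuf);     # matches, "rest" stays in the buffer
```

The first call corresponds to the handler returning false and staying in the queue, the second to it returning true and being removed.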
951 1248
952 $end_prog->recv; 1249This concludes our little tutorial.
953 1250
954That's a mouthful, so let's go through it step by step: 1251=head1 Where to go from here?
955 1252
956 #!/usr/bin/perl 1253This introduction should have explained the key concepts of L<AnyEvent>
1254- event watchers and condition variables, L<AnyEvent::Socket> - basic
1255networking utilities, and L<AnyEvent::Handle> - a nice wrapper around
1256handles.
957 1257
958 use AnyEvent; 1258You could either start coding stuff right away, look at those manual
959 use AnyEvent::Handle; 1259pages for the gory details, or roam CPAN for other AnyEvent modules (such
1260as L<AnyEvent::IRC> or L<AnyEvent::HTTP>) to see more code examples (or
1261simply to use them).
960 1262
961Nothing unexpected here, just load AnyEvent for the event functionality 1263If you need a protocol that doesn't have an implementation using AnyEvent,
962and AnyEvent::Handle for your file handling needs. 1264remember that you can mix AnyEvent with one other event framework, such as
1265L<POE>, so you can always use AnyEvent for your own tasks plus modules of
1266one other event framework to fill any gaps.
963 1267
964 my $end_prog = AnyEvent->condvar; 1268And last not least, you could also look at L<Coro>, especially
1269L<Coro::AnyEvent>, to see how you can turn event-based programming from
1270callback style back to the usual imperative style (also called "inversion
1271of control" - AnyEvent calls I<you>, but Coro lets I<you> call AnyEvent).
965 1272
966Here the program creates a so-called 'condition variable': Condition 1273=head1 Authors
967variables are a great way to signal the completion of some event, or to
968state that some condition became true (thus the name).
969
970This condition variable represents the condition that the program wants to
971terminate. Later in the program, we will 'recv' that condition (call the
972C<recv> method on it), which will wait until the condition gets signalled
973(which is done by calling the C<send> method on it).
974
975The next step is to create the handle object:
976
977 my $handle =
978 AnyEvent::Handle->new (
979 fh => \*STDIN,
980 on_eof => sub {
981 print "received EOF, exiting...\n";
982 $end_prog->broadcast;
983 },
984
985This handle object will read from standard input. Setting the C<on_eof>
986callback should be done for every file handle, as that is a condition that
987we always need to check for when working with file handles, to prevent
988reading or writing to a closed file handle, or getting stuck indefinitely
989in case of an error.
990
991Speaking of errors:
992
993 on_error => sub {
994 print "error while reading from STDIN: $!\n";
995 $end_prog->broadcast;
996 }
997 );
998
999The C<on_error> callback is also not required, but we set it here in case
1000any error happens when we read from the file handle. It is usually a good
1001idea to set this callback and at least print some diagnostic message: Even
1002in our small example an error can happen. More on this later...
1003
1004 $handle->push_read (sub {
1005
1006Next we push a general read callback on the read queue, which
1007will wait until we have received all the data we wanted to
1008receive. L<AnyEvent::Handle> has two queues per file handle, a read and a
1009write queue. The write queue queues pending data that waits to be written
1010to the file handle. And the read queue queues reading callbacks. For more
1011details see the documentation L<AnyEvent::Handle> about the READ QUEUE and
1012WRITE QUEUE.
1013
1014 my ($handle) = @_;
1015
1016 if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) {
1017 print "got 'end', existing...\n";
1018 $end_prog->broadcast;
1019 return 1
1020 }
1021
1022 0
1023 });
1024
1025The actual callback waits until the word 'end' has been seen in the data
1026received on standard input. Once we encounter the stop word 'end' we
1027remove everything from the read buffer and call the condition variable
1028we setup earlier, that signals our 'end of program' condition. And the
1029callback returns with a true value, that signals we are done with reading
1030all the data we were interested in (all data until the word 'end' has been
1031seen).
1032
1033In all other cases, when the stop word has not been seen yet, we just
1034return a false value, to indicate that we are not finished yet.
1035
1036The C<rbuf> method returns our read buffer, that we can directly modify as
1037lvalue. Alternatively we also could have written:
1038
1039 if ($handle->{rbuf} =~ s/^.*?\bend\b.*$//s) {
1040
1041The last line will wait for the condition that our program wants to exit:
1042
1043 $end_prog->recv;
1044
1045The call to C<recv> will setup an event loop for us and wait for IO, timer
1046or signal events and will handle them until the condition gets sent (by
1047calling its C<send> method).
1048
1049The key points to learn from this example are:
1050
1051=over 4
1052
1053=item * Condition variables are used to start an event loop.
1054
1055=item * How to registering some basic callbacks on AnyEvent::Handle's.
1056
1057=item * How to process data in the read buffer.
1058
1059=back
1060
1061=head1 AUTHORS
1062 1274
1063Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>. 1275Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>.
1064 1276
