… | |
… | |
108 | ); |
108 | ); |
109 | |
109 | |
110 | # do something else here |
110 | # do something else here |
111 | |
111 | |
112 | Looks more complicated, and surely is, but the advantage of using events |
112 | Looks more complicated, and surely is, but the advantage of using events |
113 | is that your program can do something else instead of waiting for |
113 | is that your program can do something else instead of waiting for input |
|
|
114 | (side note: combining AnyEvent with a thread package such as Coro can |
|
|
115 | recoup much of the simplicity, effectively getting the best of two |
|
|
116 | worlds). |
|
|
117 | |
114 | input. Waiting as in the first example is also called "blocking" because |
118 | Waiting as done in the first example is also called "blocking" the process |
115 | you "block" your process from executing anything else while you do so. |
119 | because you "block"/keep your process from executing anything else while |
|
|
120 | you do so. |
116 | |
121 | |
117 | The second example avoids blocking, by only registering interest in a read |
122 | The second example avoids blocking by only registering interest in a read |
118 | event, which is fast and doesn't block your process. Only when read data |
123 | event, which is fast and doesn't block your process. Only when read data |
119 | is available will the callback be called, which can then proceed to read |
124 | is available will the callback be called, which can then proceed to read |
120 | the data. |
125 | the data. |
121 | |
126 | |
122 | The "interest" is represented by an object returned by C<< AnyEvent->io |
127 | The "interest" is represented by an object returned by C<< AnyEvent->io |
123 | >> called a "watcher" object - called like that because it "watches" your |
128 | >> called a "watcher" object - called like that because it "watches" your |
124 | file handle (or other event sources) for the event you are interested in. |
129 | file handle (or other event sources) for the event you are interested in. |
125 | |
130 | |
126 | In the example above, we create an I/O watcher by calling the C<< |
131 | In the example above, we create an I/O watcher by calling the C<< |
127 | AnyEvent->io >> method. Disinterest in some event is simply expressed by |
132 | AnyEvent->io >> method. Disinterest in some event is simply expressed |
128 | forgetting about the watcher, for example, by C<undef>'ing the variable it |
133 | by forgetting about the watcher, for example, by C<undef>'ing the only |
129 | is stored in. AnyEvent will automatically clean up the watcher if it is no |
134 | variable it is stored in. AnyEvent will automatically clean up the watcher |
130 | longer used, much like Perl closes your file handles if you no longer use |
135 | if it is no longer used, much like Perl closes your file handles if you no |
131 | them anywhere. |
136 | longer use them anywhere. |
132 | |
137 | |
133 | =head3 A short note on callbacks |
138 | =head3 A short note on callbacks |
134 | |
139 | |
135 | A common issue that hits people is the problem of passing parameters |
140 | A common issue that hits people is the problem of passing parameters |
136 | to callbacks. Programmers used to languages such as C or C++ are often |
141 | to callbacks. Programmers used to languages such as C or C++ are often |
… | |
… | |
153 | is also an abstraction penalty to pay as one has to I<name> the callback, |
158 | is also an abstraction penalty to pay as one has to I<name> the callback, |
154 | which often is unnecessary and leads to nonsensical or duplicated names. |
159 | which often is unnecessary and leads to nonsensical or duplicated names. |
155 | |
160 | |
156 | In Perl, one can specify behaviour much more directly by using |
161 | In Perl, one can specify behaviour much more directly by using |
157 | I<closures>. Closures are code blocks that take a reference to the |
162 | I<closures>. Closures are code blocks that take a reference to the |
158 | enclosing scope(s) when they are created. This means lexical variables in scope at the time |
163 | enclosing scope(s) when they are created. This means lexical variables in |
159 | of creating the closure can simply be used inside the closure: |
164 | scope at the time of creating the closure can simply be used inside the |
|
|
165 | closure: |
160 | |
166 | |
161 | my $arg = ...; |
167 | my $arg = ...; |
162 | |
168 | |
163 | call_me_back_later sub { $arg->method }; |
169 | call_me_back_later sub { $arg->method }; |
164 | |
170 | |
165 | Under most circumstances, closures are faster, use less resources and |
171 | Under most circumstances, closures are faster, use fewer resources and |
166 | result in much clearer code than the traditional approach. Faster, |
172 | result in much clearer code than the traditional approach. Faster, |
167 | because parameter passing and storing them in local variables in Perl |
173 | because parameter passing and storing them in local variables in Perl |
168 | is relatively slow. Less resources, because closures take references to |
174 | is relatively slow. Fewer resources, because closures take references |
169 | existing variables without having to create new ones, and clearer code |
175 | to existing variables without having to create new ones, and clearer |
170 | because it is immediately obvious that the second example calls the |
176 | code because it is immediately obvious that the second example calls the |
171 | C<method> method when the callback is invoked. |
177 | C<method> method when the callback is invoked. |
172 | |
178 | |
173 | Apart from these, the strongest argument for using closures with AnyEvent |
179 | Apart from these, the strongest argument for using closures with AnyEvent |
174 | is that AnyEvent does not allow passing parameters to the callback, so |
180 | is that AnyEvent does not allow passing parameters to the callback, so |
175 | closures are the only way to achieve that in most cases :-> |
181 | closures are the only way to achieve that in most cases :-> |
176 | |
182 | |
177 | |
183 | |
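As a minimal sketch of the closure idiom described above: C<call_me_back_later> and C<run_pending> are hypothetical stand-ins for an event library (they are not part of AnyEvent) that store callbacks and invoke them later without passing any arguments. Each closure still sees the C<$name> that was in scope when it was created.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an event library (NOT an AnyEvent API):
# it only remembers callbacks and never passes arguments to them.
my @pending;

sub call_me_back_later {
   my ($cb) = @_;
   push @pending, $cb;
}

# Invoke (and forget) every callback registered so far.
sub run_pending {
   $_->() for splice @pending;
}

my @log;

for my $name (qw(alice bob)) {
   # the closure captures this iteration's $name, no parameter needed
   call_me_back_later sub { push @log, "hello, $name" };
}

run_pending;

print "$_\n" for @log;
```

Nothing had to be packed into a separate data structure to reach the callback, which is the point the text makes about parameter passing.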
178 | =head3 A hint on debugging |
184 | =head3 A hint on debugging |
179 | |
185 | |
180 | AnyEvent does, by default, not do any argument checking. This can lead to |
186 | AnyEvent does, by default, not do any argument checking. This can lead to |
181 | strange and unexpected results especially if you are trying to learn yur |
187 | strange and unexpected results especially if you are trying to learn your |
182 | ways with AnyEvent. |
188 | ways with AnyEvent. |
183 | |
189 | |
184 | AnyEvent supports a special "strict" mode, off by default, which does very |
190 | AnyEvent supports a special "strict" mode, off by default, which does very |
185 | strict argument checking, at the expense of being somewhat slower. When |
191 | strict argument checking, at the expense of being somewhat slower. During |
186 | developing, however, this mode is very useful. |
192 | development, however, this mode is very useful. |
187 | |
193 | |
188 | You can enable this strict mode either by having an environment variable |
194 | You can enable this strict mode either by having an environment variable |
189 | C<PERL_ANYEVENT_STRICT> with a true value in your environment: |
195 | C<PERL_ANYEVENT_STRICT> with a true value in your environment: |
190 | |
196 | |
191 | PERL_ANYEVENT_STRICT=1 perl test.pl |
197 | PERL_ANYEVENT_STRICT=1 perl test.pl |
… | |
… | |
194 | same effect (do not do this in production, however). |
200 | same effect (do not do this in production, however). |
195 | |
201 | |
196 | |
202 | |
197 | =head2 Condition Variables |
203 | =head2 Condition Variables |
198 | |
204 | |
199 | Back to the I/O watcher example: The code not yet a fully working program, |
205 | Back to the I/O watcher example: The code is not yet a fully working |
200 | and will not work as-is. The reason is that your callback will not be |
206 | program, and will not work as-is. The reason is that your callback will |
201 | invoked out of the blue, you have to run the event loop. Also, event-based |
207 | not be invoked out of the blue, you have to run the event loop. Also, |
202 | programs sometimes have to block, too, as when there simply is nothing |
208 | event-based programs sometimes have to block, too, as when there simply is |
203 | else to do and everything waits for some events, it needs to block the |
209 | nothing else to do and everything waits for some events, it needs to block |
204 | process as well. |
210 | the process as well until new events arrive. |
205 | |
211 | |
206 | In AnyEvent, this is done using condition variables. Condition variables |
212 | In AnyEvent, this is done using condition variables. Condition variables |
207 | are named "condition variables" because they represent a condition that is |
213 | are named "condition variables" because they represent a condition that is |
208 | initially false and needs to be fulfilled. |
214 | initially false and needs to be fulfilled. |
209 | |
215 | |
… | |
… | |
246 | print "your name is $name\n"; |
252 | print "your name is $name\n"; |
247 | |
253 | |
248 | This program creates an AnyEvent condvar by calling the C<< |
254 | This program creates an AnyEvent condvar by calling the C<< |
249 | AnyEvent->condvar >> method. It then creates a watcher as usual, but |
255 | AnyEvent->condvar >> method. It then creates a watcher as usual, but |
250 | inside the callback it C<send>'s the C<$name_ready> condition variable, |
256 | inside the callback it C<send>'s the C<$name_ready> condition variable, |
251 | which causes anybody waiting on it to continue. |
257 | which causes whoever is waiting on it to continue. |
252 | |
258 | |
253 | The "anybody" in this case is the code that follows, which calls C<< |
259 | The "whoever" in this case is the code that follows, which calls C<< |
254 | $name_ready->recv >>: The producer calls C<send>, the consumer calls |
260 | $name_ready->recv >>: The producer calls C<send>, the consumer calls |
255 | C<recv>. |
261 | C<recv>. |
256 | |
262 | |
257 | If there is no C<$name> available yet, then the call to C<< |
263 | If there is no C<$name> available yet, then the call to C<< |
258 | $name_ready->recv >> will halt your program until the condition becomes |
264 | $name_ready->recv >> will halt your program until the condition becomes |
… | |
… | |
340 | This also shows that AnyEvent is quite flexible - you didn't have anything |
346 | This also shows that AnyEvent is quite flexible - you didn't have anything |
341 | to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just |
347 | to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just |
342 | worked. |
348 | worked. |
343 | |
349 | |
344 | Admittedly, the example is a bit silly - who would want to read names |
350 | Admittedly, the example is a bit silly - who would want to read names |
345 | form standard input in a Gtk+ application. But imagine that instead of |
351 | from standard input in a Gtk+ application. But imagine that instead of |
346 | doing that, you would make a HTTP request in the background and display |
352 | doing that, you would make a HTTP request in the background and display |
347 | its results. In fact, with event-based programming you can make many |
353 | its results. In fact, with event-based programming you can make many |
348 | http-requests in parallel in your program and still provide feedback to |
354 | http-requests in parallel in your program and still provide feedback to |
349 | the user and stay interactive. |
355 | the user and stay interactive. |
350 | |
356 | |
351 | In the next part you will see how to do just that - by implementing an |
357 | And in the next part you will see how to do just that - by implementing an |
352 | HTTP request, on our own, with the utility modules AnyEvent comes with. |
358 | HTTP request, on our own, with the utility modules AnyEvent comes with. |
353 | |
359 | |
354 | Before that, however, let's briefly look at how you would write your |
360 | Before that, however, let's briefly look at how you would write your |
355 | program using only AnyEvent, without ever calling some other event |
361 | program using only AnyEvent, without ever calling some other event |
356 | loop's run function. |
362 | loop's run function. |
357 | |
363 | |
358 | In the example using condition variables, we used that, and in fact, this |
364 | In the example using condition variables, we used those to start waiting |
359 | is the solution: |
365 | for events, and in fact, condition variables are the solution: |
360 | |
366 | |
361 | my $quit_program = AnyEvent->condvar; |
367 | my $quit_program = AnyEvent->condvar; |
362 | |
368 | |
363 | # create AnyEvent watchers (or not) here |
369 | # create AnyEvent watchers (or not) here |
364 | |
370 | |
365 | $quit_program->recv; |
371 | $quit_program->recv; |
366 | |
372 | |
367 | If any of your watcher callbacks decide to quit, they can simply call |
373 | If any of your watcher callbacks decide to quit (this is often |
|
|
374 | called an "unloop" in other frameworks), they can simply call C<< |
368 | C<< $quit_program->send >>. Of course, they could also decide not to and |
375 | $quit_program->send >>. Of course, they could also decide not to and |
369 | simply call C<exit> instead, or they could decide not to quit, ever (e.g. |
376 | simply call C<exit> instead, or they could decide not to quit, ever (e.g. |
370 | in a long-running daemon program). |
377 | in a long-running daemon program). |
371 | |
378 | |
372 | In that case, you can simply use: |
379 | If you don't need some clean quit functionality and just want to run the |
|
|
380 | event loop, you can simply do this: |
373 | |
381 | |
374 | AnyEvent->condvar->recv; |
382 | AnyEvent->condvar->recv; |
375 | |
383 | |
376 | And this is, in fact, closest to the idea of a main loop run function that |
384 | And this is, in fact, closest to the idea of a main loop run function that |
377 | AnyEvent offers. |
385 | AnyEvent offers. |
… | |
… | |
409 | |
417 | |
410 | # now wait till our time has come |
418 | # now wait till our time has come |
411 | $cv->recv; |
419 | $cv->recv; |
412 | |
420 | |
413 | Unlike I/O watchers, timers are only interested in the amount of seconds |
421 | Unlike I/O watchers, timers are only interested in the amount of seconds |
414 | they have to wait. When that amount of time has passed, AnyEvent will |
422 | they have to wait. When (at least) that amount of time has passed, |
415 | invoke your callback. |
423 | AnyEvent will invoke your callback. |
416 | |
424 | |
417 | Unlike I/O watchers, which will call your callback as many times as there |
425 | Unlike I/O watchers, which will call your callback as many times as there |
418 | is data available, timers are one-shot: after they have "fired" once and |
426 | is data available, timers are normally one-shot: after they have "fired" |
419 | invoked your callback, they are dead and no longer do anything. |
427 | once and invoked your callback, they are dead and no longer do anything. |
420 | |
428 | |
421 | To get a repeating timer, such as a timer firing roughly once per second, |
429 | To get a repeating timer, such as a timer firing roughly once per second, |
422 | you have to recreate it: |
430 | you can specify an C<interval> parameter: |
423 | |
431 | |
424 | use AnyEvent; |
432 | my $once_per_second = AnyEvent->timer ( |
425 | |
433 | after => 0, # first invoke ASAP |
426 | my $time_watcher; |
434 | interval => 1, # then invoke every second |
427 | |
435 | cb => sub { # the callback to invoke |
428 | sub once_per_second { |
436 | $cv->send; |
429 | print "tick\n"; |
|
|
430 | |
437 | }, |
431 | # (re-)create the watcher |
|
|
432 | $time_watcher = AnyEvent->timer ( |
|
|
433 | after => 1, |
|
|
434 | cb => \&once_per_second, |
|
|
435 | ); |
438 | ); |
436 | } |
|
|
437 | |
|
|
438 | # now start the timer |
|
|
439 | once_per_second; |
|
|
440 | |
|
|
441 | Having to recreate your timer is a restriction put on AnyEvent that is |
|
|
442 | present in most event libraries it uses. It is so annoying that some |
|
|
443 | future version might work around this limitation, but right now, it's the |
|
|
444 | only way to do repeating timers. |
|
|
445 | |
|
|
446 | Fortunately most timers aren't really repeating but specify timeouts of |
|
|
447 | some sort. |
|
|
448 | |
439 | |
449 | =head3 More esoteric sources |
440 | =head3 More esoteric sources |
450 | |
441 | |
451 | AnyEvent also has some other, more esoteric event sources you can tap |
442 | AnyEvent also has some other, more esoteric event sources you can tap |
452 | into: signal and child watchers. |
443 | into: signal, child and idle watchers. |
453 | |
444 | |
454 | Signal watchers can be used to wait for "signal events", which simply |
445 | Signal watchers can be used to wait for "signal events", which simply |
455 | means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>). |
446 | means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>). |
456 | |
447 | |
457 | Process watchers wait for a child process to exit. They are useful when |
448 | Child-process watchers wait for a child process to exit. They are useful |
458 | you fork a separate process and need to know when it exits, but you do not |
449 | when you fork a separate process and need to know when it exits, but you |
459 | wait for that by blocking. |
450 | do not wait for that by blocking. |
460 | |
451 | |
|
|
452 | Idle watchers invoke their callback when the event loop has handled all |
|
|
453 | outstanding events, polled for new events and didn't find any, i.e., when |
|
|
454 | your process is otherwise idle. They are useful if you want to do some |
|
|
455 | non-trivial data processing that can be done when your program doesn't |
|
|
456 | have anything better to do. |
|
|
457 | |
461 | Both watcher types are described in detail in the main L<AnyEvent> manual |
458 | All these watcher types are described in detail in the main L<AnyEvent> |
462 | page. |
459 | manual page. |
463 | |
460 | |
|
|
461 | Sometimes you also need to know what the current time is: C<< |
|
|
462 | AnyEvent->now >> returns the time the event toolkit uses to schedule |
|
|
463 | relative timers, and is usually what you want. It is often cached (which |
|
|
464 | means it can be a bit outdated). In that case, you can use the more costly |
|
|
465 | C<< AnyEvent->time >> method which will ask your operating system for the |
|
|
466 | current time, which is slower, but also more up to date. |
464 | |
467 | |
465 | =head1 Network programming and AnyEvent |
468 | =head1 Network programming and AnyEvent |
466 | |
469 | |
467 | So far you have seen how to register event watchers and handle events. |
470 | So far you have seen how to register event watchers and handle events. |
468 | |
471 | |
469 | This is a great foundation to write network clients and servers, and might be |
472 | This is a great foundation to write network clients and servers, and might |
470 | all that your module (or program) ever requires, but writing your own I/O |
473 | be all that your module (or program) ever requires, but writing your own |
471 | buffering again and again becomes tedious, not to mention that it attracts |
474 | I/O buffering again and again becomes tedious, not to mention that it |
472 | errors. |
475 | attracts errors. |
473 | |
476 | |
474 | While the core L<AnyEvent> module is still small and self-contained, |
477 | While the core L<AnyEvent> module is still small and self-contained, |
475 | the distribution comes with some very useful utility modules such as |
478 | the distribution comes with some very useful utility modules such as |
476 | L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can |
479 | L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can |
477 | make your life as a non-blocking network programmer a lot easier. |
480 | make your life as a non-blocking network programmer a lot easier. |
… | |
… | |
485 | a great way to do other DNS resolution tasks, such as reverse lookups of |
488 | a great way to do other DNS resolution tasks, such as reverse lookups of |
486 | IP addresses for log files. |
489 | IP addresses for log files. |
487 | |
490 | |
488 | =head2 L<AnyEvent::Handle> |
491 | =head2 L<AnyEvent::Handle> |
489 | |
492 | |
490 | This module handles non-blocking IO on file handles in an event based |
493 | This module handles non-blocking IO on (socket-, pipe- etc.) file handles |
491 | manner. It provides a wrapper object around your file handle that provides |
494 | in an event based manner. It provides a wrapper object around your file |
492 | queueing and buffering of incoming and outgoing data for you. |
495 | handle that provides queueing and buffering of incoming and outgoing data |
|
|
496 | for you. |
493 | |
497 | |
494 | It also implements the most common data formats, such as text lines, or |
498 | It also implements the most common data formats, such as text lines, or |
495 | fixed and variable-width data blocks. |
499 | fixed and variable-width data blocks. |
496 | |
500 | |
497 | =head2 L<AnyEvent::Socket> |
501 | =head2 L<AnyEvent::Socket> |
… | |
… | |
525 | It works by connecting to the finger port on another host, writing a |
529 | It works by connecting to the finger port on another host, writing a |
526 | single line with a user name and then reading the finger response, as |
530 | single line with a user name and then reading the finger response, as |
527 | specified by that user. OK, RFC 1288 specifies a vastly more complex |
531 | specified by that user. OK, RFC 1288 specifies a vastly more complex |
528 | protocol, but it basically boils down to this: |
532 | protocol, but it basically boils down to this: |
529 | |
533 | |
530 | # telnet idsoftware.com finger |
534 | # telnet kernel.org finger |
531 | Trying 192.246.40.37... |
535 | Trying 204.152.191.37... |
532 | Connected to idsoftware.com (192.246.40.37). |
536 | Connected to kernel.org (204.152.191.37). |
533 | Escape character is '^]'. |
537 | Escape character is '^]'. |
534 | johnc |
538 | |
535 | Welcome to id Software's Finger Service V1.5! |
539 | The latest stable version of the Linux kernel is: [...] |
536 | |
|
|
537 | [...] |
|
|
538 | Now on the web: |
|
|
539 | [...] |
|
|
540 | |
|
|
541 | Connection closed by foreign host. |
540 | Connection closed by foreign host. |
542 | |
541 | |
543 | "Now on the web..." yeah, I<was> used indeed, but at least the finger |
542 | So let's write a little AnyEvent function that makes a finger request: |
544 | daemon still works, so let's write a little AnyEvent function that makes a |
|
|
545 | finger request: |
|
|
546 | |
543 | |
547 | use AnyEvent; |
544 | use AnyEvent; |
548 | use AnyEvent::Socket; |
545 | use AnyEvent::Socket; |
549 | |
546 | |
550 | sub finger($$) { |
547 | sub finger($$) { |
… | |
… | |
615 | socket handle as first argument, otherwise, nothing will be passed to our |
612 | socket handle as first argument, otherwise, nothing will be passed to our |
616 | callback. The important point is that it will always be called as soon as |
613 | callback. The important point is that it will always be called as soon as |
617 | the outcome of the TCP connect is known. |
614 | the outcome of the TCP connect is known. |
618 | |
615 | |
619 | This style of programming is also called "continuation style": the |
616 | This style of programming is also called "continuation style": the |
620 | "continuation" is simply the way the program continues - normally, a |
617 | "continuation" is simply the way the program continues - normally at the |
621 | program continues at the next line after some statement (the exception |
618 | next line after some statement (the exception is loops or things like |
622 | is loops or things like C<return>). When we are interested in events, |
619 | C<return>). When we are interested in events, however, we instead specify |
623 | however, we instead specify the "continuation" of our program by passing a |
620 | the "continuation" of our program by passing a closure, which makes that |
624 | closure, which makes that closure the "continuation" of the program. The |
621 | closure the "continuation" of the program. |
|
|
622 | |
625 | C<tcp_connect> call is like saying "return now, and when the connection is |
623 | The C<tcp_connect> call is like saying "return now, and when the |
626 | established or it failed, continue there". |
624 | connection is established or it failed, continue there". |
627 | |
625 | |
628 | Now let's look at the callback/closure in more detail: |
626 | Now let's look at the callback/closure in more detail: |
629 | |
627 | |
630 | # the callback receives the socket handle - or nothing |
628 | # the callback receives the socket handle - or nothing |
631 | my ($fh) = @_ |
629 | my ($fh) = @_ |
… | |
… | |
643 | report the results to anybody, certainly not the caller of our C<finger> |
641 | report the results to anybody, certainly not the caller of our C<finger> |
644 | function, and most event loops continue even after a C<die>! |
642 | function, and most event loops continue even after a C<die>! |
645 | |
643 | |
646 | This is why we instead C<return>, but also call C<< $cv->send >> without |
644 | This is why we instead C<return>, but also call C<< $cv->send >> without |
647 | any arguments to signal to the condvar consumer that something bad has |
645 | any arguments to signal to the condvar consumer that something bad has |
648 | happened. The return value of C<< $cv->send >> is irrelevant, as is the |
646 | happened. The return value of C<< $cv->send >> is irrelevant, as is |
649 | return value of our callback. The return statement is simply used for the |
647 | the return value of our callback. The C<return> statement is simply |
650 | side effect of, well, returning immediately from the callback. Checking |
648 | used for the side effect of, well, returning immediately from the |
651 | for errors and handling them this way is very common, which is why this |
649 | callback. Checking for errors and handling them this way is very common, |
652 | compact idiom is so handy. |
650 | which is why this compact idiom is so handy. |
653 | |
651 | |
654 | As the next step in the finger protocol, we send the username to the |
652 | As the next step in the finger protocol, we send the username to the |
655 | finger daemon on the other side of our connection: |
653 | finger daemon on the other side of our connection (the kernel.org finger |
|
|
654 | service doesn't actually wait for a username, but the net is running out |
|
|
655 | of finger servers fast): |
656 | |
656 | |
657 | syswrite $fh, "$user\015\012"; |
657 | syswrite $fh, "$user\015\012"; |
658 | |
658 | |
659 | Note that this isn't 100% clean socket programming - the socket could, |
659 | Note that this isn't 100% clean socket programming - the socket could, |
660 | for whatever reasons, not accept our data. When writing a small amount |
660 | for whatever reasons, not accept our data. When writing a small amount |
… | |
… | |
678 | variable, but in a local one - if the callback returns, it would normally |
678 | variable, but in a local one - if the callback returns, it would normally |
679 | destroy the variable and its contents, which would in turn unregister our |
679 | destroy the variable and its contents, which would in turn unregister our |
680 | watcher. |
680 | watcher. |
681 | |
681 | |
682 | To avoid that, we C<undef>ine the variable in the watcher callback. This |
682 | To avoid that, we C<undef>ine the variable in the watcher callback. This |
683 | means that, when the C<tcp_connect> callback returns, that perl thinks |
683 | means that, when the C<tcp_connect> callback returns, perl thinks (quite |
684 | (quite correctly) that the read watcher is still in use - namely in the |
684 | correctly) that the read watcher is still in use - namely in the callback, |
685 | callback. |
685 | and thus keeps it alive even if nothing else in the program refers to it |
|
|
686 | anymore (it is much like Baron Münchhausen keeping himself from dying by |
|
|
687 | pulling himself out of a swamp). |
686 | |
688 | |
687 | The trick, however, is that instead of: |
689 | The trick, however, is that instead of: |
688 | |
690 | |
689 | my $read_watcher = AnyEvent->io (... |
691 | my $read_watcher = AnyEvent->io (... |
690 | |
692 | |
… | |
… | |
709 | my $len = sysread $fh, $response, 1024, length $response; |
711 | my $len = sysread $fh, $response, 1024, length $response; |
710 | |
712 | |
711 | if ($len <= 0) { |
713 | if ($len <= 0) { |
712 | |
714 | |
713 | Note that C<sysread> has the ability to append data it reads to a scalar, |
715 | Note that C<sysread> has the ability to append data it reads to a scalar, |
714 | by specifying an offset, which is what we make good use of in this |
716 | by specifying an offset, a feature we make good use of in this |
715 | example. |
717 | example. |
716 | |
718 | |
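The appending behaviour of C<sysread> can be tried in isolation. The following is only a sketch under simplifying assumptions: it uses a local pipe instead of a socket, made-up payload text, a deliberately tiny chunk size of 8 bytes, and a plain blocking loop instead of an I/O watcher.

```perl
use strict;
use warnings;

# A local pipe stands in for the socket (assumption for this sketch).
pipe(my $reader, my $writer) or die "pipe failed: $!";

# Made-up payload, written in one go, then signal end-of-file.
syswrite $writer, "first line\015\012second line\015\012";
close $writer;

my $response = "";

while (1) {
   # the fourth argument is the offset: new data is stored at the
   # current end of $response, i.e. it is appended
   my $len = sysread $reader, $response, 8, length $response;

   die "read error: $!" unless defined $len;
   last if $len == 0; # end of file
}

print $response;
```

In the event-based version, the body of the loop is what runs inside the watcher callback each time the handle becomes readable.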
717 | When C<sysread> indicates we are done, the callback C<undef>ines |
719 | When C<sysread> indicates we are done, the callback C<undef>ines |
718 | the watcher and then C<send>'s the response data to the condition |
720 | the watcher and then C<send>'s the response data to the condition |
719 | variable. All this has the following effects: |
721 | variable. All this has the following effects: |
… | |
… | |
733 | But the main advantage is that we can not only run this finger function in |
735 | But the main advantage is that we can not only run this finger function in |
734 | the background, we even can run multiple sessions in parallel, like this: |
736 | the background, we even can run multiple sessions in parallel, like this: |
735 | |
737 | |
736 | my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets |
738 | my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets |
737 | my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736 |
739 | my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736 |
738 | my $f3 = finger "johnc", "idsoftware.com"; # finger john |
740 | my $f3 = finger "hpa" , "kernel.org"; # finger hpa |
739 | |
741 | |
740 | print "trouble tickets:\n", $f1->recv, "\n"; |
742 | print "trouble tickets:\n" , $f1->recv, "\n"; |
741 | print "trouble ticket #1736:\n", $f2->recv, "\n"; |
743 | print "trouble ticket #1736:\n", $f2->recv, "\n"; |
742 | print "john carmacks finger file: ", $f3->recv, "\n"; |
744 | print "kernel release info: " , $f3->recv, "\n"; |
743 | |
745 | |
744 | It doesn't look like it, but in fact all three requests run in |
746 | It doesn't look like it, but in fact all three requests run in |
745 | parallel. The code waits for the first finger request to finish first, but |
747 | parallel. The code waits for the first finger request to finish first, but |
746 | that doesn't keep it from executing them in parallel: when the first C<recv> |
748 | that doesn't keep it from executing them in parallel: when the first C<recv> |
747 | call sees that the data isn't ready yet, it serves events for all three |
749 | call sees that the data isn't ready yet, it serves events for all three |
… | |
… | |
775 | How you implement it is a matter of taste - if you expect your function to |
777 | How you implement it is a matter of taste - if you expect your function to |
776 | be used mainly in an event-based program you would normally prefer to pass |
778 | be used mainly in an event-based program you would normally prefer to pass |
777 | a callback directly. If you write a module and expect your users to use |
779 | a callback directly. If you write a module and expect your users to use |
778 | it "synchronously" often (for example, a simple http-get script would not |
780 | it "synchronously" often (for example, a simple http-get script would not |
779 | really care much for events), then you would use a condition variable and |
781 | really care much for events), then you would use a condition variable and |
780 | tell them "simply ->recv the data". |
782 | tell them "simply C<< ->recv >> the data". |
781 | |
783 | |
782 | =head3 Problems with the implementation and how to fix them |
784 | =head3 Problems with the implementation and how to fix them |
783 | |
785 | |
784 | To make this example more real-world-ready, we would not only implement |
786 | To make this example more real-world-ready, we would not only implement |
785 | some write buffering (for the paranoid), but we would also have to handle |
787 | some write buffering (for the paranoid, or maybe denial-of-service aware |
786 | timeouts and maybe protocol errors. |
788 | security expert), but we would also have to handle timeouts and maybe |
|
|
789 | protocol errors. |
787 | |
790 | |
788 | Doing this quickly gets unwieldy, which is why we introduce |
791 | Doing this quickly gets unwieldy, which is why we introduce |
789 | L<AnyEvent::Handle> in the next section, which takes care of all these |
792 | L<AnyEvent::Handle> in the next section, which takes care of all these |
790 | details for you and lets you concentrate on the actual protocol. |
793 | details for you and lets you concentrate on the actual protocol. |
791 | |
794 | |
792 | |
795 | |
793 | =head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle |
796 | =head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle |
794 | |
797 | |
795 | The L<AnyEvent::Handle> module has been hyped quite a bit so far, so let's |
798 | The L<AnyEvent::Handle> module has been hyped quite a bit in this document |
796 | see what it really offers. |
799 | so far, so let's see what it really offers. |
797 | |
800 | |
798 | As finger is such a simple protocol, let's try something slightly more |
801 | As finger is such a simple protocol, let's try something slightly more |
799 | complicated: HTTP/1.0. |
802 | complicated: HTTP/1.0. |
800 | |
803 | |
801 | An HTTP GET request works by sending a single request line that indicates |
804 | An HTTP GET request works by sending a single request line that indicates |
… | |
… | |
952 | of the headers to the server. |
955 | of the headers to the server. |
953 | |
956 | |
954 | The more interesting question is why the method is called C<push_write> |
957 | The more interesting question is why the method is called C<push_write> |
955 | and not just write. The reason is that you can I<always> add some write |
958 | and not just write. The reason is that you can I<always> add some write |
956 | data without blocking, and to do this, AnyEvent::Handle needs some write |
959 | data without blocking, and to do this, AnyEvent::Handle needs some write |
957 | queue internally - and C<push_write> simply pushes some data at the end of |
960 | queue internally - and C<push_write> simply pushes some data onto the end |
958 | that queue, just like Perl's C<push> pushes data at the end of an array. |
961 | of that queue, just like Perl's C<push> pushes data onto the end of an |
|
|
962 | array. |
959 | |
963 | |
960 | The deeper reason is that at some point in the future, there might |
964 | The deeper reason is that at some point in the future, there might |
961 | be C<unshift_write> as well, and in any case, we will shortly meet |
965 | be C<unshift_write> as well, and in any case, we will shortly meet |
962 | C<push_read> and C<unshift_read>, and it's usually easiest if all those |
966 | C<push_read> and C<unshift_read>, and it's usually easiest to remember if |
963 | functions have some symmetry in their name. |
967 | all those functions have some symmetry in their name. |
964 | |
968 | |
965 | If C<push_write> is called with more than one argument, then you can even |
969 | If C<push_write> is called with more than one argument, then you can even |
966 | do I<formatted> I/O, which simply means your data will be transformed in |
970 | do I<formatted> I/O, which simply means your data will be transformed in |
967 | some ways. For example, this would JSON-encode your data before pushing it |
971 | some ways. For example, this would JSON-encode your data before pushing it |
968 | to the write queue: |
972 | to the write queue: |
… | |
… | |
970 | $handle->push_write (json => [1, 2, 3]); |
974 | $handle->push_write (json => [1, 2, 3]); |
971 | |
975 | |
972 | Apart from that, this pretty much summarises the write queue; there is |
976 | Apart from that, this pretty much summarises the write queue; there is |
973 | little else to it. |
977 | little else to it. |
974 | |
978 | |
975 | Reading the response if far more interesting: |
979 | Reading the response is far more interesting, because it involves the more |
|
|
980 | powerful and complex I<read queue>: |
976 | |
981 | |
977 | =head3 The read queue |
982 | =head3 The read queue |
978 | |
983 | |
979 | The response consists of three parts: a single line of response status, a |
984 | The response consists of three parts: a single line with the response |
980 | single paragraph of headers ended by an empty line, and the request body, |
985 | status, a single paragraph of headers ended by an empty line, and the |
981 | which is simply the remaining data on that connection. |
986 | response body, which is simply the remaining data on that connection. |
982 | |
987 | |
983 | For the first two, we push two read requests onto the read queue: |
988 | For the first two, we push two read requests onto the read queue: |
984 | |
989 | |
985 | # now fetch response status line |
990 | # now fetch response status line |
986 | $handle->push_read (line => sub { |
991 | $handle->push_read (line => sub { |
… | |
… | |
992 | $handle->push_read (line => "\015\012\015\012", sub { |
997 | $handle->push_read (line => "\015\012\015\012", sub { |
993 | my ($handle, $line) = @_; |
998 | my ($handle, $line) = @_; |
994 | $header = $line; |
999 | $header = $line; |
995 | }); |
1000 | }); |
996 | |
1001 | |
997 | While one can simply push a single callback to the queue, I<formatted> I/O |
1002 | While one can simply push a single callback to parse the data in the |
998 | really comes to out advantage here, as there is a ready-made "read line" |
1003 | queue, I<formatted> I/O really comes to our advantage here, as there |
999 | read type. The first read expects a single line, ended by C<\015\012> (the |
1004 | is a ready-made "read line" read type. The first read expects a single |
1000 | standard end-of-line marker in internet protocols). |
1005 | line, ended by C<\015\012> (the standard end-of-line marker in internet |
|
|
1006 | protocols). |
1001 | |
1007 | |
1002 | The second "line" is actually a single paragraph - instead of reading it |
1008 | The second "line" is actually a single paragraph - instead of reading it |
1003 | line by line we tell C<push_read> that the end-of-line marker is really |
1009 | line by line we tell C<push_read> that the end-of-line marker is really |
1004 | C<\015\012\015\012>, which is an empty line. The result is that the whole |
1010 | C<\015\012\015\012>, which is an empty line. The result is that the whole |
1005 | header paragraph will be treated as a single line and read. The word |
1011 | header paragraph will be treated as a single line and read. The word |
… | |
… | |
1023 | header have been read. The C<on_read> callback could actually have been |
1029 | header have been read. The C<on_read> callback could actually have been |
1024 | specified when constructing the object, but doing it this way preserves |
1030 | specified when constructing the object, but doing it this way preserves |
1025 | logical ordering. |
1031 | logical ordering. |
1026 | |
1032 | |
1027 | The read callback simply adds the current read buffer to its C<$body> |
1033 | The read callback simply adds the current read buffer to its C<$body> |
1028 | variable and, most importantly, I<empties> it by assign the empty string |
1034 | variable and, most importantly, I<empties> the buffer by assigning the |
1029 | to it. |
1035 | empty string to it. |
1030 | |
1036 | |
1031 | After AnyEvent::Handle has been so instructed, it will now handle incoming |
1037 | After AnyEvent::Handle has been so instructed, it will handle incoming |
1032 | data according to these instructions - if all goes well, the callback will |
1038 | data according to these instructions - if all goes well, the callback will |
1033 | be invoked with the response data, if not, it will get an error. |
1039 | be invoked with the response data, if not, it will get an error. |
1034 | |
1040 | |
1035 | In general, you get pipelining very easy with AnyEvent::Handle: If |
1041 | In general, you can implement pipelining (a semi-advanced feature of many |
1036 | you have a protocol with a request/response structure, your request |
1042 | protocols) very easily with AnyEvent::Handle: If you have a protocol with a |
1037 | methods/functions will all look like this (simplified): |
1043 | request/response structure, your request methods/functions will all look |
|
|
1044 | like this (simplified): |
1038 | |
1045 | |
1039 | sub request { |
1046 | sub request { |
1040 | |
1047 | |
1041 | # send the request to the server |
1048 | # send the request to the server |
1042 | $handle->push_write (...); |
1049 | $handle->push_write (...); |
1043 | |
1050 | |
1044 | # push some response handlers |
1051 | # push some response handlers |
1045 | $handle->push_read (...); |
1052 | $handle->push_read (...); |
1046 | } |
1053 | } |
1047 | |
1054 | |
1048 | =head3 Using it |
1055 | This means you can queue as many requests as you want, and while |
|
|
1056 | AnyEvent::Handle goes through its read queue to handle the response data, |
|
|
1057 | the other side can work on the next request - queueing the request just |
|
|
1058 | appends some data to the write queue and installs a handler to be called |
|
|
1059 | later. |
1049 | |
1060 | |
|
|
1061 | You might ask yourself how to handle decisions you can only make I<after> |
|
|
1062 | you have received some data (such as handling a short error response or a |
|
|
1063 | long and differently-formatted response). The answer to this problem is |
|
|
1064 | C<unshift_read>, which we will introduce together with an example in the |
|
|
1065 | coming sections. |
|
|
1066 | |
|
|
1067 | =head3 Using C<http_get> |
|
|
1068 | |
1050 | And here is how you would use it: |
1069 | Finally, here is how you would use C<http_get>: |
1051 | |
1070 | |
1052 | http_get "www.google.com", "/", sub { |
1071 | http_get "www.google.com", "/", sub { |
1053 | my ($response, $header, $body) = @_; |
1072 | my ($response, $header, $body) = @_; |
1054 | |
1073 | |
1055 | print |
1074 | print |
… | |
… | |
1066 | correctly, let's change our C<http_get> function into a function that |
1085 | correctly, let's change our C<http_get> function into a function that |
1067 | speaks HTTPS instead. |
1086 | speaks HTTPS instead. |
1068 | |
1087 | |
1069 | HTTPS is, quite simply, a standard TLS connection (B<T>ransport B<L>ayer |
1088 | HTTPS is, quite simply, a standard TLS connection (B<T>ransport B<L>ayer |
1070 | B<S>ecurity is the official name for what most people refer to as C<SSL>) |
1089 | B<S>ecurity is the official name for what most people refer to as C<SSL>) |
1071 | that contains standard HTTP protocol exchanges. The other difference to |
1090 | that contains standard HTTP protocol exchanges. The only other difference |
1072 | HTTP is that it uses port C<443> instead of port C<80>. |
1091 | to HTTP is that by default it uses port C<443> instead of port C<80>. |
1073 | |
1092 | |
1074 | To implement these two differences we need two tiny changes, first, in the C<tcp_connect> call |
1093 | To implement these two differences we need two tiny changes, first, in the |
1075 | we replace C<http> by C<https>): |
1094 | C<tcp_connect> call we replace C<http> by C<https>: |
1076 | |
1095 | |
1077 | tcp_connect $host, "https", sub { ... |
1096 | tcp_connect $host, "https", sub { ... |
1078 | |
1097 | |
1079 | The other change deals with TLS, which is something L<AnyEvent::Handle> |
1098 | The other change deals with TLS, which is something L<AnyEvent::Handle> |
1080 | does for us, as long as I<you> made sure that the L<Net::SSLeay> module is |
1099 | does for us, as long as I<you> made sure that the L<Net::SSLeay> module |
1081 | around. To enable TLS with L<AnyEvent::Handle>, we simply pass an addition |
1100 | is around. To enable TLS with L<AnyEvent::Handle>, we simply pass an |
1082 | C<tls> parameter to the call to C<AnyEvent::Handle::new>: |
1101 | additional C<tls> parameter to the call to C<AnyEvent::Handle::new>: |
1083 | |
1102 | |
1084 | tls => "connect", |
1103 | tls => "connect", |
1085 | |
1104 | |
1086 | Specifying C<tls> enables TLS, and the argument specifies whether |
1105 | Specifying C<tls> enables TLS, and the argument specifies whether |
1087 | AnyEvent::Handle is the server side ("accept") or the client side |
1106 | AnyEvent::Handle is the server side ("accept") or the client side |
1088 | ("connect") for the TLS connection, as unlike TCP, there is a clear |
1107 | ("connect") for the TLS connection, as unlike TCP, there is a clear |
1089 | server/client relationship in TLS. |
1108 | server/client relationship in TLS. |
1090 | |
1109 | |
1091 | That's all. |
1110 | That's all. |
1092 | |
1111 | |
1093 | Of course, all this should be handled transparently by C<http_get> after |
1112 | Of course, all this should be handled transparently by C<http_get> |
1094 | parsing the URL. See the part about exercising your inspiration earlier in |
1113 | after parsing the URL. If you need this, see the part about exercising |
1095 | this document. |
1114 | your inspiration earlier in this document. You could also use the |
|
|
1115 | L<AnyEvent::HTTP> module from CPAN, which implements all this and works |
|
|
1116 | around a lot of quirks for you, too. |
1096 | |
1117 | |
1097 | =head3 The read queue - revisited |
1118 | =head3 The read queue - revisited |
1098 | |
1119 | |
1099 | HTTP always uses the same structure in its responses, but many protocols |
1120 | HTTP always uses the same structure in its responses, but many protocols |
1100 | require parsing responses different depending on the response itself. |
1121 | require parsing responses differently depending on the response itself. |
1101 | |
1122 | |
1102 | For example, in SMTP, you normally get a single response line: |
1123 | For example, in SMTP, you normally get a single response line: |
1103 | |
1124 | |
1104 | 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net> |
1125 | 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net> |
1105 | |
1126 | |
… | |
… | |
1108 | 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net> |
1129 | 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net> |
1109 | 220-hey guys |
1130 | 220-hey guys |
1110 | 220 my response is longer than yours |
1131 | 220 my response is longer than yours |
1111 | |
1132 | |
1112 | To handle this, we need C<unshift_read>. As the name (hopefully) implies, |
1133 | To handle this, we need C<unshift_read>. As the name (hopefully) implies, |
1113 | C<unshift_read> will not append your read request tot he end of the read |
1134 | C<unshift_read> will not append your read request to the end of the read |
1114 | queue, but instead it will prepend it to the queue. |
1135 | queue, but instead it will prepend it to the queue. |
1115 | |
1136 | |
1116 | This is useful for this this situation: You push your response-line read |
1137 | This is useful in the situation above: Just push your response-line read |
1117 | request when sending the SMTP command, and when handling it, you look at |
1138 | request when sending the SMTP command, and when handling it, you look at |
1118 | the line to see if more is to come, and C<unshift_read> another reader, |
1139 | the line to see if more is to come, and C<unshift_read> another reader |
1119 | like this: |
1140 | callback if required, like this: |
1120 | |
1141 | |
1121 | my $response; # response lines end up in here |
1142 | my $response; # response lines end up in here |
1122 | |
1143 | |
1123 | my $read_response; $read_response = sub { |
1144 | my $read_response; $read_response = sub { |
1124 | my ($handle, $line) = @_; |
1145 | my ($handle, $line) = @_; |
… | |
… | |
1170 | } |
1191 | } |
1171 | |
1192 | |
1172 | =head3 Your own read queue handler |
1193 | =head3 Your own read queue handler |
1173 | |
1194 | |
1174 | Sometimes, your protocol doesn't play nice and uses lines or chunks of |
1195 | Sometimes, your protocol doesn't play nice and uses lines or chunks of |
|
|
1196 | data not formatted in a way handled by AnyEvent::Handle out of the box. In |
1175 | data, in which case you have to implement your own read parser. |
1197 | this case you have to implement your own read parser. |
1176 | |
1198 | |
1177 | To make up a contorted example, imagine you are looking for an even |
1199 | To make up a contorted example, imagine you are looking for an even |
1178 | number of characters followed by a colon (":"). Also imagine that |
1200 | number of characters followed by a colon (":"). Also imagine that |
1179 | AnyEvent::Handle had no C<regex> read type which could be used, so you'd |
1201 | AnyEvent::Handle had no C<regex> read type which could be used, so you'd |
1180 | have to do it manually. |
1202 | have to do it manually. |
1181 | |
1203 | |
1182 | To implement this, you would C<push_read> (or C<unshift_read>) just a |
1204 | To implement a read handler for this, you would C<push_read> (or |
1183 | single code reference. |
1205 | C<unshift_read>) just a single code reference. |
1184 | |
1206 | |
1185 | This code reference will then be called each time there is (new) data |
1207 | This code reference will then be called each time there is (new) data |
1186 | available in the read buffer, and is expected to either eat/consume some |
1208 | available in the read buffer, and is expected to either successfully |
1187 | of that data (and return true) or to return false to indicate that it |
1209 | eat/consume some of that data (and return true) or to return false to |
1188 | wants to be called again. |
1210 | indicate that it wants to be called again. |
1189 | |
1211 | |
1190 | If the code reference returns true, then it will be removed from the read |
1212 | If the code reference returns true, then it will be removed from the |
|
|
1213 | read queue (because it has parsed/consumed whatever it was supposed to |
1191 | queue, otherwise it stays in front of it. |
1214 | consume), otherwise it stays in the front of it. |
1192 | |
1215 | |
1193 | The example above could be coded like this: |
1216 | The example above could be coded like this: |
1194 | |
1217 | |
1195 | $handle->push_read (sub { |
1218 | $handle->push_read (sub { |
1196 | my ($handle) = @_; |
1219 | my ($handle) = @_; |
… | |
… | |
1207 | |
1230 | |
1208 | # and return true to indicate we are done |
1231 | # and return true to indicate we are done |
1209 | 1 |
1232 | 1 |
1210 | }); |
1233 | }); |
1211 | |
1234 | |
|
|
1235 | This concludes our little tutorial. |
|
|
1236 | |
|
|
1237 | =head1 Where to go from here? |
|
|
1238 | |
|
|
1239 | This introduction should have explained the key concepts behind |
|
|
1240 | L<AnyEvent>, namely event watchers and condition variables, |
|
|
1241 | L<AnyEvent::Socket>, for your basic networking needs, and |
|
|
1242 | L<AnyEvent::Handle>, a nice wrapper around handles. |
|
|
1243 | |
|
|
1244 | You could either start coding stuff right away, look at those manual |
|
|
1245 | pages for the gory details, or roam CPAN for other AnyEvent modules (such |
|
|
1246 | as L<AnyEvent::IRC> or L<AnyEvent::HTTP>) to see more code examples (or |
|
|
1247 | simply to use them). |
|
|
1248 | |
|
|
1249 | If you need a protocol that doesn't have an implementation using AnyEvent, |
|
|
1250 | remember that you can mix AnyEvent with one other event framework, such as |
|
|
1251 | L<POE>, so you can always use AnyEvent for your own tasks plus modules of |
|
|
1252 | one other event framework to fill any gaps. |
|
|
1253 | |
|
|
1254 | And last but not least, you could also look at L<Coro>, especially |
|
|
1255 | L<Coro::AnyEvent>, to see how you can turn event-based programming from |
|
|
1256 | callback style back to the usual imperative style (also called "inversion |
|
|
1257 | of control" - AnyEvent calls I<you>, but Coro lets I<you> call AnyEvent). |
1212 | |
1258 | |
1213 | =head1 Authors |
1259 | =head1 Authors |
1214 | |
1260 | |
1215 | Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>. |
1261 | Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>. |
1216 | |
1262 | |