|
|
1 | =head1 NAME |
|
|
2 | |
|
|
3 | AnyEvent::Intro - an introductory tutorial to AnyEvent |
|
|
4 | |
1 | =head1 Introduction to AnyEvent |
5 | =head1 Introduction to AnyEvent |
2 | |
6 | |
3 | This is a tutorial that will introduce you to the features of AnyEvent. |
7 | This is a tutorial that will introduce you to the features of AnyEvent. |
4 | |
8 | |
5 | The first part introduces the core AnyEvent module (after swamping you a |
9 | The first part introduces the core AnyEvent module (after swamping you a |
6 | bit in evangelism), which might already provide all you ever need. |
10 | bit in evangelism), which might already provide all you ever need. If you |
|
|
11 | are only interested in AnyEvent's event handling capabilities, read no |
|
|
12 | further. |
7 | |
13 | |
8 | The second part focuses on network programming using sockets, for which |
14 | The second part focuses on network programming using sockets, for which |
9 | AnyEvent offers a lot of support you can use. |
15 | AnyEvent offers a lot of support you can use, and a lot of workarounds |
|
|
16 | around portability quirks. |
10 | |
17 | |
11 | |
18 | |
12 | =head1 What is AnyEvent? |
19 | =head1 What is AnyEvent? |
13 | |
20 | |
14 | If you don't care for the whys and want to see code, skip this section! |
21 | If you don't care for the whys and want to see code, skip this section! |
… | |
… | |
101 | ); |
108 | ); |
102 | |
109 | |
103 | # do something else here |
110 | # do something else here |
104 | |
111 | |
105 | Looks more complicated, and surely is, but the advantage of using events |
112 | Looks more complicated, and surely is, but the advantage of using events |
106 | is that your program can do something else instead of waiting for |
113 | is that your program can do something else instead of waiting for input |
|
|
114 | (side note: combining AnyEvent with a thread package such as Coro can |
|
|
115 | recoup much of the simplicity, effectively getting the best of two |
|
|
116 | worlds). |
|
|
117 | |
107 | input. Waiting as in the first example is also called "blocking" because |
118 | Waiting as done in the first example is also called "blocking" the process |
108 | you "block" your process from executing anything else while you do so. |
119 | because you "block"/keep your process from executing anything else while |
|
|
120 | you do so. |
109 | |
121 | |
110 | The second example avoids blocking, by only registering interest in a read |
122 | The second example avoids blocking by only registering interest in a read |
111 | event, which is fast and doesn't block your process. Only when read data |
123 | event, which is fast and doesn't block your process. Only when read data |
112 | is available will the callback be called, which can then proceed to read |
124 | is available will the callback be called, which can then proceed to read |
113 | the data. |
125 | the data. |
114 | |
126 | |
115 | The "interest" is represented by an object returned by C<< AnyEvent->io |
127 | The "interest" is represented by an object returned by C<< AnyEvent->io |
116 | >> called a "watcher" object - called like that because it "watches" your |
128 | >> called a "watcher" object - called like that because it "watches" your |
117 | file handle (or other event sources) for the event you are interested in. |
129 | file handle (or other event sources) for the event you are interested in. |
118 | |
130 | |
119 | In the example above, we create an I/O watcher by calling the C<< |
131 | In the example above, we create an I/O watcher by calling the C<< |
120 | AnyEvent->io >> method. Disinterest in some event is simply expressed by |
132 | AnyEvent->io >> method. Disinterest in some event is simply expressed |
121 | forgetting about the watcher, for example, by C<undef>'ing the variable it |
133 | by forgetting about the watcher, for example, by C<undef>'ing the only |
122 | is stored in. AnyEvent will automatically clean up the watcher if it is no |
134 | variable it is stored in. AnyEvent will automatically clean up the watcher |
123 | longer used, much like Perl closes your file handles if you no longer use |
135 | if it is no longer used, much like Perl closes your file handles if you no |
124 | them anywhere. |
136 | longer use them anywhere. |
|
|
137 | |
|
|
138 | =head3 A short note on callbacks |
|
|
139 | |
|
|
140 | A common issue that hits people is the problem of passing parameters |
|
|
141 | to callbacks. Programmers used to languages such as C or C++ are often |
|
|
142 | used to a style where one passes the address of a function (a function |
|
|
143 | reference) and some data value, e.g.: |
|
|
144 | |
|
|
145 | sub callback { |
|
|
146 | my ($arg) = @_; |
|
|
147 | |
|
|
148 | $arg->method; |
|
|
149 | } |
|
|
150 | |
|
|
151 | my $arg = ...; |
|
|
152 | |
|
|
153 | call_me_back_later \&callback, $arg; |
|
|
154 | |
|
|
155 | This is clumsy, as the place where behaviour is specified (when the |
|
|
156 | callback is registered) is often far away from the place where behaviour |
|
|
157 | is implemented. It also doesn't use Perl syntax to invoke the code. There |
|
|
158 | is also an abstraction penalty to pay as one has to I<name> the callback, |
|
|
159 | which often is unnecessary and leads to nonsensical or duplicated names. |
|
|
160 | |
|
|
161 | In Perl, one can specify behaviour much more directly by using |
|
|
162 | I<closures>. Closures are code blocks that take a reference to the |
|
|
163 | enclosing scope(s) when they are created. This means lexical variables in |
|
|
164 | scope at the time of creating the closure can simply be used inside the |
|
|
165 | closure: |
|
|
166 | |
|
|
167 | my $arg = ...; |
|
|
168 | |
|
|
169 | call_me_back_later sub { $arg->method }; |
|
|
170 | |
|
|
171 | Under most circumstances, closures are faster, use fewer resources and |
|
|
172 | result in much clearer code than the traditional approach. Faster, |
|
|
173 | because parameter passing and storing them in local variables in Perl |
|
|
174 | is relatively slow. Fewer resources, because closures take references |
|
|
175 | to existing variables without having to create new ones, and clearer |
|
|
176 | code because it is immediately obvious that the second example calls the |
|
|
177 | C<method> method when the callback is invoked. |
|
|
178 | |
|
|
179 | Apart from these, the strongest argument for using closures with AnyEvent |
|
|
180 | is that AnyEvent does not allow passing parameters to the callback, so |
|
|
181 | closures are the only way to achieve that in most cases :-> |
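As a rough, self-contained illustration of the closure style described above, the sketch below provides a stand-in for C<call_me_back_later>. The C<@pending> queue and the C<run_pending> helper are assumptions made up for this example and are not part of AnyEvent. Like AnyEvent, the stand-in passes no arguments to the callback, so each closure has to carry its own data:

```perl
use strict;
use warnings;

# Hypothetical stand-in for an event library: it only stores callbacks
# and runs them later, without passing any arguments to them.
my @pending;

sub call_me_back_later {
   my ($cb) = @_;
   push @pending, $cb;
}

# Hypothetical helper: invoke all stored callbacks and collect their results.
sub run_pending {
   my @results;
   while (my $cb = shift @pending) {
      push @results, $cb->();
   }
   return @results;
}

# Each closure captures its own $name at the time it is created.
for my $name (qw(alice bob)) {
   call_me_back_later (sub { "hello, $name" });
}

print "$_\n" for run_pending ();
```

No named callback function and no separate data argument are needed: the lexical C<$name> simply travels along with the code that uses it.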
|
|
182 | |
|
|
183 | |
|
|
184 | =head3 A hint on debugging |
|
|
185 | |
|
|
186 | AnyEvent does not, by default, do any argument checking. This can lead to |
|
|
187 | strange and unexpected results, especially if you are trying to find your |
|
|
188 | way around AnyEvent. |
|
|
189 | |
|
|
190 | AnyEvent supports a special "strict" mode, off by default, which does very |
|
|
191 | strict argument checking, at the expense of being somewhat slower. During |
|
|
192 | development, however, this mode is very useful. |
|
|
193 | |
|
|
194 | You can enable this strict mode either by having an environment variable |
|
|
195 | C<PERL_ANYEVENT_STRICT> with a true value in your environment: |
|
|
196 | |
|
|
197 | PERL_ANYEVENT_STRICT=1 perl test.pl |
|
|
198 | |
|
|
199 | Or you can write C<use AnyEvent::Strict> in your program, which has the |
|
|
200 | same effect (do not do this in production, however). |
|
|
201 | |
125 | |
202 | |
126 | =head2 Condition Variables |
203 | =head2 Condition Variables |
127 | |
204 | |
128 | However, the above is not a fully working program, and will not work |
205 | Back to the I/O watcher example: The code is not yet a fully working |
129 | as-is. The reason is that your callback will not be invoked out of the |
206 | program, and will not work as-is. The reason is that your callback will |
130 | blue, you have to run the event loop. Also, event-based programs sometimes |
207 | not be invoked out of the blue, you have to run the event loop. Also, |
131 | have to block, too, as when there simply is nothing else to do and |
208 | event-based programs sometimes have to block, too, as when there simply is |
132 | everything waits for some events, it needs to block the process as well. |
209 | nothing else to do and everything waits for some events, it needs to block |
|
|
210 | the process as well until new events arrive. |
133 | |
211 | |
134 | In AnyEvent, this is done using condition variables. Condition variables |
212 | In AnyEvent, this is done using condition variables. Condition variables |
135 | are named "condition variables" because they represent a condition that is |
213 | are named "condition variables" because they represent a condition that is |
136 | initially false and needs to be fulfilled. |
214 | initially false and needs to be fulfilled. |
137 | |
215 | |
… | |
… | |
139 | or even callbacks and many other things (and they are often called like |
217 | or even callbacks and many other things (and they are often called like |
140 | this in other frameworks). The important point is that you can create them |
218 | this in other frameworks). The important point is that you can create them |
141 | freely and later wait for them to become true. |
219 | freely and later wait for them to become true. |
142 | |
220 | |
143 | Condition variables have two sides - one side is the "producer" of the |
221 | Condition variables have two sides - one side is the "producer" of the |
144 | condition (whatever code detects the condition), the other side is the |
222 | condition (whatever code detects and flags the condition), the other side |
145 | "consumer" (the code that waits for that condition). |
223 | is the "consumer" (the code that waits for that condition). |
146 | |
224 | |
147 | In our example in the previous section, the producer is the event callback |
225 | In our example in the previous section, the producer is the event callback |
148 | and there is no consumer yet - let's change that now: |
226 | and there is no consumer yet - let's change that right now: |
149 | |
227 | |
150 | use AnyEvent; |
228 | use AnyEvent; |
151 | |
229 | |
152 | $| = 1; print "enter your name> "; |
230 | $| = 1; print "enter your name> "; |
153 | |
231 | |
… | |
… | |
174 | print "your name is $name\n"; |
252 | print "your name is $name\n"; |
175 | |
253 | |
176 | This program creates an AnyEvent condvar by calling the C<< |
254 | This program creates an AnyEvent condvar by calling the C<< |
177 | AnyEvent->condvar >> method. It then creates a watcher as usual, but |
255 | AnyEvent->condvar >> method. It then creates a watcher as usual, but |
178 | inside the callback it C<send>'s the C<$name_ready> condition variable, |
256 | inside the callback it C<send>'s the C<$name_ready> condition variable, |
179 | which causes anybody waiting on it to continue. |
257 | which causes whoever is waiting on it to continue. |
180 | |
258 | |
181 | The "anybody" in this case is the code that follows, which calls C<< |
259 | The "whoever" in this case is the code that follows, which calls C<< |
182 | $name_ready->recv >>: The producer calls C<send>, the consumer calls |
260 | $name_ready->recv >>: The producer calls C<send>, the consumer calls |
183 | C<recv>. |
261 | C<recv>. |
184 | |
262 | |
185 | If there is no C<$name> available yet, then the call to C<< |
263 | If there is no C<$name> available yet, then the call to C<< |
186 | $name_ready->recv >> will halt your program until the condition becomes |
264 | $name_ready->recv >> will halt your program until the condition becomes |
… | |
… | |
196 | |
274 | |
197 | my $name_ready = AnyEvent->condvar; |
275 | my $name_ready = AnyEvent->condvar; |
198 | |
276 | |
199 | my $wait_for_input = AnyEvent->io ( |
277 | my $wait_for_input = AnyEvent->io ( |
200 | fh => \*STDIN, poll => "r", |
278 | fh => \*STDIN, poll => "r", |
201 | cb => sub { $name_ready->send (scalar = <STDIN>) } |
279 | cb => sub { $name_ready->send (scalar <STDIN>) } |
202 | ); |
280 | ); |
203 | |
281 | |
204 | # do something else here |
282 | # do something else here |
205 | |
283 | |
206 | # now wait and fetch the name |
284 | # now wait and fetch the name |
… | |
… | |
268 | This also shows that AnyEvent is quite flexible - you didn't have anything |
346 | This also shows that AnyEvent is quite flexible - you didn't have anything |
269 | to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just |
347 | to do to make the AnyEvent watcher use Gtk2 (actually Glib) - it just |
270 | worked. |
348 | worked. |
271 | |
349 | |
272 | Admittedly, the example is a bit silly - who would want to read names |
350 | Admittedly, the example is a bit silly - who would want to read names |
273 | form standard input in a Gtk+ application. But imagine that instead of |
351 | from standard input in a Gtk+ application. But imagine that instead of |
274 | doing that, you would make an HTTP request in the background and display |
352 | doing that, you would make an HTTP request in the background and display |
275 | its results. In fact, with event-based programming you can make many |
353 | its results. In fact, with event-based programming you can make many |
276 | HTTP requests in parallel in your program and still provide feedback to |
354 | HTTP requests in parallel in your program and still provide feedback to |
277 | the user and stay interactive. |
355 | the user and stay interactive. |
278 | |
356 | |
279 | In the next part you will see how to do just that - by implementing an |
357 | And in the next part you will see how to do just that - by implementing an |
280 | HTTP request, on our own, with the utility modules AnyEvent comes with. |
358 | HTTP request, on our own, with the utility modules AnyEvent comes with. |
281 | |
359 | |
282 | Before that, however, let's briefly look at how you would write your |
360 | Before that, however, let's briefly look at how you would write your |
283 | program using only AnyEvent, without ever calling some other event |
361 | program using only AnyEvent, without ever calling some other event |
284 | loop's run function. |
362 | loop's run function. |
285 | |
363 | |
286 | In the example using condition variables, we used that, and in fact, this |
364 | In the example using condition variables, we used those to start waiting |
287 | is the solution: |
365 | for events, and in fact, condition variables are the solution: |
288 | |
366 | |
289 | my $quit_program = AnyEvent->condvar; |
367 | my $quit_program = AnyEvent->condvar; |
290 | |
368 | |
291 | # create AnyEvent watchers (or not) here |
369 | # create AnyEvent watchers (or not) here |
292 | |
370 | |
293 | $quit_program->recv; |
371 | $quit_program->recv; |
294 | |
372 | |
295 | If any of your watcher callbacks decide to quit, they can simply call |
373 | If any of your watcher callbacks decide to quit (this is often |
|
|
374 | called an "unloop" in other frameworks), they can simply call C<< |
296 | C<< $quit_program->send >>. Of course, they could also decide not to and |
375 | $quit_program->send >>. Of course, they could also decide not to and |
297 | simply call C<exit> instead, or they could decide not to quit, ever (e.g. |
376 | simply call C<exit> instead, or they could decide not to quit, ever (e.g. |
298 | in a long-running daemon program). |
377 | in a long-running daemon program). |
299 | |
378 | |
300 | In that case, you can simply use: |
379 | If you don't need some clean quit functionality and just want to run the |
|
|
380 | event loop, you can simply do this: |
301 | |
381 | |
302 | AnyEvent->condvar->recv; |
382 | AnyEvent->condvar->recv; |
303 | |
383 | |
304 | And this is, in fact, closest to the idea of a main loop run function that |
384 | And this is, in fact, closest to the idea of a main loop run function that |
305 | AnyEvent offers. |
385 | AnyEvent offers. |
… | |
… | |
337 | |
417 | |
338 | # now wait till our time has come |
418 | # now wait till our time has come |
339 | $cv->recv; |
419 | $cv->recv; |
340 | |
420 | |
341 | Unlike I/O watchers, timers are only interested in the number of seconds |
421 | Unlike I/O watchers, timers are only interested in the number of seconds |
342 | they have to wait. When that amount of time has passed, AnyEvent will |
422 | they have to wait. When (at least) that amount of time has passed, |
343 | invoke your callback. |
423 | AnyEvent will invoke your callback. |
344 | |
424 | |
345 | Unlike I/O watchers, which will call your callback as many times as there |
425 | Unlike I/O watchers, which will call your callback as many times as there |
346 | is data available, timers are one-shot: after they have "fired" once and |
426 | is data available, timers are normally one-shot: after they have "fired" |
347 | invoked your callback, they are dead and no longer do anything. |
427 | once and invoked your callback, they are dead and no longer do anything. |
348 | |
428 | |
349 | To get a repeating timer, such as a timer firing roughly once per second, |
429 | To get a repeating timer, such as a timer firing roughly once per second, |
350 | you have to recreate it: |
430 | you can specify an C<interval> parameter: |
351 | |
431 | |
352 | use AnyEvent; |
432 | my $once_per_second = AnyEvent->timer ( |
353 | |
433 | after => 0, # first invoke ASAP |
354 | my $time_watcher; |
434 | interval => 1, # then invoke every second |
355 | |
435 | cb => sub { # the callback to invoke |
356 | sub once_per_second { |
436 | $cv->send; |
357 | print "tick\n"; |
|
|
358 | |
437 | }, |
359 | # (re-)create the watcher |
|
|
360 | $time_watcher = AnyEvent->timer ( |
|
|
361 | after => 1, |
|
|
362 | cb => \&once_per_second, |
|
|
363 | ); |
438 | ); |
364 | } |
|
|
365 | |
|
|
366 | # now start the timer |
|
|
367 | once_per_second; |
|
|
368 | |
|
|
369 | Having to recreate your timer is a restriction put on AnyEvent that is |
|
|
370 | present in most event libraries it uses. It is so annoying that some |
|
|
371 | future version might work around this limitation, but right now, it's the |
|
|
372 | only way to do repeating timers. |
|
|
373 | |
|
|
374 | Fortunately most timers aren't really repeating but specify timeouts of |
|
|
375 | some sort. |
|
|
376 | |
439 | |
377 | =head3 More esoteric sources |
440 | =head3 More esoteric sources |
378 | |
441 | |
379 | AnyEvent also has some other, more esoteric event sources you can tap |
442 | AnyEvent also has some other, more esoteric event sources you can tap |
380 | into: signal and child watchers. |
443 | into: signal, child and idle watchers. |
381 | |
444 | |
382 | Signal watchers can be used to wait for "signal events", which simply |
445 | Signal watchers can be used to wait for "signal events", which simply |
383 | means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>). |
446 | means your process got sent a signal (such as C<SIGTERM> or C<SIGUSR1>). |
384 | |
447 | |
385 | Process watchers wait for a child process to exit. They are useful when |
448 | Child-process watchers wait for a child process to exit. They are useful |
386 | you fork a separate process and need to know when it exits, but you do not |
449 | when you fork a separate process and need to know when it exits, but you |
387 | want to wait for that by blocking. |
450 | do not want to wait for that by blocking. |
388 | |
451 | |
|
|
452 | Idle watchers invoke their callback when the event loop has handled all |
|
|
453 | outstanding events, polled for new events and didn't find any, i.e., when |
|
|
454 | your process is otherwise idle. They are useful if you want to do some |
|
|
455 | non-trivial data processing that can be done when your program doesn't |
|
|
456 | have anything better to do. |
|
|
457 | |
389 | Both watcher types are described in detail in the main L<AnyEvent> manual |
458 | All these watcher types are described in detail in the main L<AnyEvent> |
390 | page. |
459 | manual page. |
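As a minimal sketch of an idle watcher, assuming the AnyEvent module is installed, the example below does a few small "slices" of pretend background work and then stops. The names C<$done>, C<$ticks> and C<$idle_watcher> are illustrative only:

```perl
use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent->condvar;
my $ticks = 0;

# The idle callback is invoked when the event loop has nothing else to do.
my $idle_watcher; $idle_watcher = AnyEvent->idle (
   cb => sub {
      # pretend to do one small slice of background work per invocation
      if (++$ticks >= 3) {
         undef $idle_watcher;   # forget the watcher so it stops firing
         $done->send ($ticks);  # hand the result to the consumer
      }
   },
);

# block until the idle callback has signalled completion
my $slices = $done->recv;
print "processed $slices slices of work while idle\n";
```

Splitting the work into small slices matters here: a long-running idle callback would keep the event loop from serving other events.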
391 | |
460 | |
|
|
461 | Sometimes you also need to know what the current time is: C<< |
|
|
462 | AnyEvent->now >> returns the time the event toolkit uses to schedule |
|
|
463 | relative timers, and is usually what you want. It is often cached (which |
|
|
464 | means it can be a bit outdated). In that case, you can use the more costly |
|
|
465 | C<< AnyEvent->time >> method which will ask your operating system for the |
|
|
466 | current time, which is slower, but also more up to date. |
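The difference between the two time sources can be observed directly. This is only a sketch, assuming AnyEvent is installed. The variable names are illustrative, and the actual values (and how far apart they are) depend on the event loop in use:

```perl
use strict;
use warnings;
use AnyEvent;

# AnyEvent->now: the (possibly cached) time the event loop uses for timers.
my $loop_time = AnyEvent->now;

# AnyEvent->time: asks the operating system, so it is fresher but slower.
my $wall_time = AnyEvent->time;

printf "event loop time: %.3f\n", $loop_time;
printf "system time:     %.3f\n", $wall_time;
printf "difference:      %.3f seconds\n", $wall_time - $loop_time;
```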
392 | |
467 | |
393 | =head1 Network programming and AnyEvent |
468 | =head1 Network programming and AnyEvent |
394 | |
469 | |
395 | So far you have seen how to register event watchers and handle events. |
470 | So far you have seen how to register event watchers and handle events. |
396 | |
471 | |
397 | This is a great foundation to write network clients and servers, and might be |
472 | This is a great foundation to write network clients and servers, and might |
398 | all that your module (or program) ever requires, but writing your own I/O |
473 | be all that your module (or program) ever requires, but writing your own |
399 | buffering again and again becomes tedious, not to mention that it attracts |
474 | I/O buffering again and again becomes tedious, not to mention that it |
400 | errors. |
475 | attracts errors. |
401 | |
476 | |
402 | While the core L<AnyEvent> module is still small and self-contained, |
477 | While the core L<AnyEvent> module is still small and self-contained, |
403 | the distribution comes with some very useful utility modules such as |
478 | the distribution comes with some very useful utility modules such as |
404 | L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can |
479 | L<AnyEvent::Handle>, L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can |
405 | make your life as a non-blocking network programmer a lot easier. |
480 | make your life as a non-blocking network programmer a lot easier. |
… | |
… | |
413 | a great way to do other DNS resolution tasks, such as reverse lookups of |
488 | a great way to do other DNS resolution tasks, such as reverse lookups of |
414 | IP addresses for log files. |
489 | IP addresses for log files. |
415 | |
490 | |
416 | =head2 L<AnyEvent::Handle> |
491 | =head2 L<AnyEvent::Handle> |
417 | |
492 | |
418 | This module handles non-blocking IO on file handles in an event based |
493 | This module handles non-blocking IO on (socket-, pipe- etc.) file handles |
419 | manner. It provides a wrapper object around your file handle that provides |
494 | in an event based manner. It provides a wrapper object around your file |
420 | queueing and buffering of incoming and outgoing data for you. |
495 | handle that provides queueing and buffering of incoming and outgoing data |
|
|
496 | for you. |
421 | |
497 | |
422 | It also implements the most common data formats, such as text lines, or |
498 | It also implements the most common data formats, such as text lines, or |
423 | fixed and variable-width data blocks. |
499 | fixed and variable-width data blocks. |
424 | |
500 | |
425 | =head2 L<AnyEvent::Socket> |
501 | =head2 L<AnyEvent::Socket> |
… | |
… | |
443 | to your program? That C<WSAEINPROGRESS> means your C<connect> call was |
519 | to your program? That C<WSAEINPROGRESS> means your C<connect> call was |
444 | ignored instead of being in progress? AnyEvent::Socket works around all of |
520 | ignored instead of being in progress? AnyEvent::Socket works around all of |
445 | these Windows/Perl bugs for you). |
521 | these Windows/Perl bugs for you). |
446 | |
522 | |
447 | =head2 Implementing a parallel finger client with non-blocking connects |
523 | =head2 Implementing a parallel finger client with non-blocking connects |
|
|
524 | and AnyEvent::Socket |
448 | |
525 | |
449 | The finger protocol is one of the simplest protocols in use on the |
526 | The finger protocol is one of the simplest protocols in use on the |
450 | internet. Or in use in the past, as almost nobody uses it anymore. |
527 | internet. Or in use in the past, as almost nobody uses it anymore. |
451 | |
528 | |
452 | It works by connecting to the finger port on another host, writing a |
529 | It works by connecting to the finger port on another host, writing a |
453 | single line with a user name and then reading the finger response, as |
530 | single line with a user name and then reading the finger response, as |
454 | specified by that user. OK, RFC 1288 specifies a vastly more complex |
531 | specified by that user. OK, RFC 1288 specifies a vastly more complex |
455 | protocol, but it basically boils down to this: |
532 | protocol, but it basically boils down to this: |
456 | |
533 | |
457 | # telnet idsoftware.com finger |
534 | # telnet kernel.org finger |
458 | Trying 192.246.40.37... |
535 | Trying 204.152.191.37... |
459 | Connected to idsoftware.com (192.246.40.37). |
536 | Connected to kernel.org (204.152.191.37). |
460 | Escape character is '^]'. |
537 | Escape character is '^]'. |
461 | johnc |
538 | |
462 | Welcome to id Software's Finger Service V1.5! |
539 | The latest stable version of the Linux kernel is: [...] |
463 | |
|
|
464 | [...] |
|
|
465 | Now on the web: |
|
|
466 | [...] |
|
|
467 | |
|
|
468 | Connection closed by foreign host. |
540 | Connection closed by foreign host. |
469 | |
541 | |
470 | "Now on the web..." yeah, I<was> used indeed, but at least the finger |
542 | So let's write a little AnyEvent function that makes a finger request: |
471 | daemon still works, so let's write a little AnyEvent function that makes a |
|
|
472 | finger request: |
|
|
473 | |
543 | |
474 | use AnyEvent; |
544 | use AnyEvent; |
475 | use AnyEvent::Socket; |
545 | use AnyEvent::Socket; |
476 | |
546 | |
477 | sub finger($$) { |
547 | sub finger($$) { |
… | |
… | |
542 | socket handle as first argument, otherwise, nothing will be passed to our |
612 | socket handle as first argument, otherwise, nothing will be passed to our |
543 | callback. The important point is that it will always be called as soon as |
613 | callback. The important point is that it will always be called as soon as |
544 | the outcome of the TCP connect is known. |
614 | the outcome of the TCP connect is known. |
545 | |
615 | |
546 | This style of programming is also called "continuation style": the |
616 | This style of programming is also called "continuation style": the |
547 | "continuation" is simply the way the program continues - normally, a |
617 | "continuation" is simply the way the program continues - normally at the |
548 | program continues at the next line after some statement (the exception |
618 | next line after some statement (the exception is loops or things like |
549 | is loops or things like C<return>). When we are interested in events, |
619 | C<return>). When we are interested in events, however, we instead specify |
550 | however, we instead specify the "continuation" of our program by passing a |
620 | the "continuation" of our program by passing a closure, which makes that |
551 | closure, which makes that closure the "continuation" of the program. The |
621 | closure the "continuation" of the program. |
|
|
622 | |
552 | C<tcp_connect> call is like saying "return now, and when the connection is |
623 | The C<tcp_connect> call is like saying "return now, and when the |
553 | established or it failed, continue there". |
624 | connection is established or it failed, continue there". |
554 | |
625 | |
555 | Now let's look at the callback/closure in more detail: |
626 | Now let's look at the callback/closure in more detail: |
556 | |
627 | |
557 | # the callback receives the socket handle - or nothing |
628 | # the callback receives the socket handle - or nothing |
558 | my ($fh) = @_ |
629 | my ($fh) = @_ |
… | |
… | |
570 | report the results to anybody, certainly not the caller of our C<finger> |
641 | report the results to anybody, certainly not the caller of our C<finger> |
571 | function, and most event loops continue even after a C<die>! |
642 | function, and most event loops continue even after a C<die>! |
572 | |
643 | |
573 | This is why we instead C<return>, but also call C<< $cv->send >> without |
644 | This is why we instead C<return>, but also call C<< $cv->send >> without |
574 | any arguments to signal to the condvar consumer that something bad has |
645 | any arguments to signal to the condvar consumer that something bad has |
575 | happened. The return value of C<< $cv->send >> is irrelevant, as is the |
646 | happened. The return value of C<< $cv->send >> is irrelevant, as is |
576 | return value of our callback. The return statement is simply used for the |
647 | the return value of our callback. The C<return> statement is simply |
577 | side effect of, well, returning immediately from the callback. Checking |
648 | used for the side effect of, well, returning immediately from the |
578 | for errors and handling them this way is very common, which is why this |
649 | callback. Checking for errors and handling them this way is very common, |
579 | compact idiom is so handy. |
650 | which is why this compact idiom is so handy. |
580 | |
651 | |
581 | As the next step in the finger protocol, we send the username to the |
652 | As the next step in the finger protocol, we send the username to the |
582 | finger daemon on the other side of our connection: |
653 | finger daemon on the other side of our connection (the kernel.org finger |
|
|
654 | service doesn't actually wait for a username, but the net is running out |
|
|
655 | of finger servers fast): |
583 | |
656 | |
584 | syswrite $fh, "$user\015\012"; |
657 | syswrite $fh, "$user\015\012"; |
585 | |
658 | |
586 | Note that this isn't 100% clean socket programming - the socket could, |
659 | Note that this isn't 100% clean socket programming - the socket could, |
587 | for whatever reasons, not accept our data. When writing a small amount |
660 | for whatever reasons, not accept our data. When writing a small amount |
… | |
… | |
605 | variable, but in a local one - if the callback returns, it would normally |
678 | variable, but in a local one - if the callback returns, it would normally |
606 | destroy the variable and its contents, which would in turn unregister our |
679 | destroy the variable and its contents, which would in turn unregister our |
607 | watcher. |
680 | watcher. |
608 | |
681 | |
609 | To avoid that, we C<undef>ine the variable in the watcher callback. This |
682 | To avoid that, we C<undef>ine the variable in the watcher callback. This |
610 | means that, when the C<tcp_connect> callback returns, that perl thinks |
683 | means that, when the C<tcp_connect> callback returns, perl thinks (quite |
611 | (quite correctly) that the read watcher is still in use - namely in the |
684 | correctly) that the read watcher is still in use - namely in the callback, |
612 | callback. |
685 | and thus keeps it alive even if nothing else in the program refers to it |
|
|
686 | anymore (it is much like Baron Münchhausen keeping himself from dying by |
|
|
687 | pulling himself out of a swamp). |
613 | |
688 | |
614 | The trick, however, is that instead of: |
689 | The trick, however, is that instead of: |
615 | |
690 | |
616 | my $read_watcher = AnyEvent->io (... |
691 | my $read_watcher = AnyEvent->io (... |
617 | |
692 | |
… | |
… | |
636 | my $len = sysread $fh, $response, 1024, length $response; |
711 | my $len = sysread $fh, $response, 1024, length $response; |
637 | |
712 | |
638 | if ($len <= 0) { |
713 | if ($len <= 0) { |
639 | |
714 | |
640 | Note that C<sysread> has the ability to append data it reads to a scalar, |
715 | Note that C<sysread> has the ability to append data it reads to a scalar, |
641 | by specifying an offset, which is what we make good use of in this |
716 | by specifying an offset, a feature we make good use of in this |
642 | example. |
717 | example. |
643 | |
718 | |
644 | When C<sysread> indicates we are done, the callback C<undef>ines |
719 | When C<sysread> indicates we are done, the callback C<undef>ines |
645 | the watcher and then C<send>'s the response data to the condition |
720 | the watcher and then C<send>'s the response data to the condition |
646 | variable. All this has the following effects: |
721 | variable. All this has the following effects: |
… | |
… | |
660 | But the main advantage is that we can not only run this finger function in |
735 | But the main advantage is that we can not only run this finger function in |
661 | the background, we even can run multiple sessions in parallel, like this: |
736 | the background, we even can run multiple sessions in parallel, like this: |
662 | |
737 | |
663 | my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets |
738 | my $f1 = finger "trouble", "noc.dfn.de"; # check for trouble tickets |
664 | my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736 |
739 | my $f2 = finger "1736" , "noc.dfn.de"; # fetch ticket 1736 |
665 | my $f3 = finger "johnc", "idsoftware.com"; # finger john |
740 | my $f3 = finger "hpa" , "kernel.org"; # finger hpa |
666 | |
741 | |
667 | print "trouble tickets:\n", $f1->recv, "\n"; |
742 | print "trouble tickets:\n" , $f1->recv, "\n"; |
668 | print "trouble ticket #1736:\n", $f2->recv, "\n"; |
743 | print "trouble ticket #1736:\n", $f2->recv, "\n"; |
669 | print "john carmacks finger file: ", $f3->recv, "\n"; |
744 | print "kernel release info: " , $f3->recv, "\n"; |
670 | |
745 | |
671 | It doesn't look like it, but in fact all three requests run in |
746 | It doesn't look like it, but in fact all three requests run in |
672 | parallel. The code waits for the first finger request to finish first, but |
747 | parallel. The code waits for the first finger request to finish first, but |
673 | that doesn't keep it from executing them in parallel: when the first C<recv> |
748 | that doesn't keep it from executing them in parallel: when the first C<recv> |
674 | call sees that the data isn't ready yet, it serves events for all three |
749 | call sees that the data isn't ready yet, it serves events for all three |
… | |
… | |
702 | How you implement it is a matter of taste - if you expect your function to |
777 | How you implement it is a matter of taste - if you expect your function to |
703 | be used mainly in an event-based program you would normally prefer to pass |
778 | be used mainly in an event-based program you would normally prefer to pass |
704 | a callback directly. If you write a module and expect your users to use |
779 | a callback directly. If you write a module and expect your users to use |
705 | it "synchronously" often (for example, a simple http-get script would not |
780 | it "synchronously" often (for example, a simple http-get script would not |
706 | really care much for events), then you would use a condition variable and |
781 | really care much for events), then you would use a condition variable and |
707 | tell them "simply ->recv the data". |
782 | tell them "simply C<< ->recv >> the data". |
708 | |
783 | |
709 | =head3 Problems with the implementation and how to fix them |
784 | =head3 Problems with the implementation and how to fix them |
710 | |
785 | |
711 | To make this example more real-world-ready, we would not only implement |
786 | To make this example more real-world-ready, we would not only implement |
712 | some write buffering (for the paranoid), but we would also have to handle |
787 | some write buffering (for the paranoid, or maybe denial-of-service aware |
713 | timeouts and maybe protocol errors. |
788 | security expert), but we would also have to handle timeouts and maybe |
|
|
789 | protocol errors. |
714 | |
790 | |
715 | Doing this quickly gets unwieldy, which is why we introduce |
791 | Doing this quickly gets unwieldy, which is why we introduce |
716 | L<AnyEvent::Handle> in the next section, which takes care of all these |
792 | L<AnyEvent::Handle> in the next section, which takes care of all these |
717 | details for you and lets you concentrate on the actual protocol. |
793 | details for you and lets you concentrate on the actual protocol. |
718 | |
794 | |
719 | |
795 | |
720 | =head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle |
796 | =head2 Implementing simple HTTP and HTTPS GET requests with AnyEvent::Handle |
721 | |
797 | |
722 | The L<AnyEvent::Handle> module has been hyped quite a bit so far, so let's |
798 | The L<AnyEvent::Handle> module has been hyped quite a bit in this document |
723 | see what it really offers. |
799 | so far, so let's see what it really offers. |
724 | |
800 | |
725 | As finger is such a simple protocol, let's try something slightly more |
801 | As finger is such a simple protocol, let's try something slightly more |
726 | complicated: HTTP/1.0. |
802 | complicated: HTTP/1.0. |
727 | |
803 | |
728 | An HTTP GET request works by sending a single request line that indicates |
804 | An HTTP GET request works by sending a single request line that indicates |
… | |
… | |
754 | The C<GET ...> and the empty line were entered manually, the rest of the |
830 | The C<GET ...> and the empty line were entered manually, the rest of the |
755 | telnet output is Google's response, in this case a C<404 not found> one. |
831 | telnet output is Google's response, in this case a C<404 not found> one. |
756 | |
832 | |
757 | So, here is how you would do it with C<AnyEvent::Handle>: |
833 | So, here is how you would do it with C<AnyEvent::Handle>: |
758 | |
834 | |
759 | ###TODO |
835 | sub http_get { |
|
|
836 | my ($host, $uri, $cb) = @_; |
|
|
837 | |
|
|
838 | tcp_connect $host, "http", sub { |
|
|
839 | my ($fh) = @_ |
|
|
840 | or return $cb->("HTTP/1.0 500 $!"); |
|
|
841 | |
|
|
842 | # store results here |
|
|
843 | my ($response, $header, $body); |
|
|
844 | |
|
|
845 | my $handle; $handle = new AnyEvent::Handle |
|
|
846 | fh => $fh, |
|
|
847 | on_error => sub { |
|
|
848 | undef $handle; |
|
|
849 | $cb->("HTTP/1.0 500 $!"); |
|
|
850 | }, |
|
|
851 | on_eof => sub { |
|
|
852 | undef $handle; # keep it alive till eof |
|
|
853 | $cb->($response, $header, $body); |
|
|
854 | }; |
|
|
855 | |
|
|
856 | $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012"); |
|
|
857 | |
|
|
858 | # now fetch response status line |
|
|
859 | $handle->push_read (line => sub { |
|
|
860 | my ($handle, $line) = @_; |
|
|
861 | $response = $line; |
|
|
862 | }); |
|
|
863 | |
|
|
864 | # then the headers |
|
|
865 | $handle->push_read (line => "\015\012\015\012", sub { |
|
|
866 | my ($handle, $line) = @_; |
|
|
867 | $header = $line; |
|
|
868 | }); |
|
|
869 | |
|
|
870 | # and finally handle any remaining data as body |
|
|
871 | $handle->on_read (sub { |
|
|
872 | $body .= $_[0]->rbuf; |
|
|
873 | $_[0]->rbuf = ""; |
|
|
874 | }); |
|
|
875 | }; |
|
|
876 | } |
760 | |
877 | |
761 | And now let's go through it step by step. First, as usual, the overall |
878 | And now let's go through it step by step. First, as usual, the overall |
762 | C<http_get> function structure: |
879 | C<http_get> function structure: |
763 | |
880 | |
764 | sub http_get { |
881 | sub http_get { |
… | |
… | |
838 | of the headers to the server. |
955 | of the headers to the server. |
839 | |
956 | |
840 | The more interesting question is why the method is called C<push_write> |
957 | The more interesting question is why the method is called C<push_write> |
841 | and not just write. The reason is that you can I<always> add some write |
958 | and not just write. The reason is that you can I<always> add some write |
842 | data without blocking, and to do this, AnyEvent::Handle needs some write |
959 | data without blocking, and to do this, AnyEvent::Handle needs some write |
843 | queue internally - and C<push_write> simply pushes some data at the end of |
960 | queue internally - and C<push_write> simply pushes some data onto the end |
844 | that queue, just like Perl's C<push> pushes data at the end of an array. |
961 | of that queue, just like Perl's C<push> pushes data onto the end of an |
|
|
962 | array. |
845 | |
963 | |
846 | The deeper reason is that at some point in the future, there might |
964 | The deeper reason is that at some point in the future, there might |
847 | be C<unshift_write> as well, and in any case, we will shortly meet |
965 | be C<unshift_write> as well, and in any case, we will shortly meet |
848 | C<push_read> and C<unshift_read>, and it's usually easiest if all those |
966 | C<push_read> and C<unshift_read>, and it's usually easiest to remember if |
849 | functions have some symmetry in their name. |
967 | all those functions have some symmetry in their name. |
850 | |
968 | |
851 | If C<push_write> is called with more than one argument, then you can even |
969 | If C<push_write> is called with more than one argument, then you can even |
852 | do I<formatted> I/O, which simply means your data will be transformed in |
970 | do I<formatted> I/O, which simply means your data will be transformed in |
853 | some ways. For example, this would JSON-encode your data before pushing it |
971 | some ways. For example, this would JSON-encode your data before pushing it |
854 | to the write queue: |
972 | to the write queue: |
… | |
… | |
856 | $handle->push_write (json => [1, 2, 3]); |
974 | $handle->push_write (json => [1, 2, 3]); |
857 | |
975 | |
858 | Apart from that, this pretty much summarises the write queue, there is |
976 | Apart from that, this pretty much summarises the write queue, there is |
859 | little else to it. |
977 | little else to it. |
860 | |
978 | |
861 | Reading the response if far more interesting: |
979 | Reading the response is far more interesting, because it involves the more |
|
|
980 | powerful and complex I<read queue>: |
862 | |
981 | |
863 | =head3 The read queue |
982 | =head3 The read queue |
864 | |
983 | |
865 | the response consists of three parts: a single line of response status, a |
984 | The response consists of three parts: a single line with the response |
866 | single paragraph of headers ended by an empty line, and the request body, |
985 | status, a single paragraph of headers ended by an empty line, and the |
867 | which is simply the remaining data on that connection. |
986 | response body, which is simply the remaining data on that connection. |
868 | |
987 | |
869 | For the first two, we push two read requests onto the read queue: |
988 | For the first two, we push two read requests onto the read queue: |
870 | |
989 | |
871 | # now fetch response status line |
990 | # now fetch response status line |
872 | $handle->push_read (line => sub { |
991 | $handle->push_read (line => sub { |
… | |
… | |
878 | $handle->push_read (line => "\015\012\015\012", sub { |
997 | $handle->push_read (line => "\015\012\015\012", sub { |
879 | my ($handle, $line) = @_; |
998 | my ($handle, $line) = @_; |
880 | $header = $line; |
999 | $header = $line; |
881 | }); |
1000 | }); |
882 | |
1001 | |
883 | While one can simply push a single callback to the queue, I<formatted> I/O |
1002 | While one can simply push a single callback to parse the data onto the |
884 | really comes to out advantage here, as there is a ready-made "read line" |
1003 | queue, I<formatted> I/O really comes to our advantage here, as there |
885 | read type. The first read expects a single line, ended by C<\015\012> (the |
1004 | is a ready-made "read line" read type. The first read expects a single |
886 | standard end-of-line marker in internet protocols). |
1005 | line, ended by C<\015\012> (the standard end-of-line marker in internet |
|
|
1006 | protocols). |
887 | |
1007 | |
888 | The second "line" is actually a single paragraph - instead of reading it |
1008 | The second "line" is actually a single paragraph - instead of reading it |
889 | line by line we tell C<push_read> that the end-of-line marker is really |
1009 | line by line we tell C<push_read> that the end-of-line marker is really |
890 | C<\015\012\015\012>, which is an empty line. The result is that the whole |
1010 | C<\015\012\015\012>, which is an empty line. The result is that the whole |
891 | header paragraph will be treated as a single line and read. The word |
1011 | header paragraph will be treated as a single line and read. The word |
… | |
… | |
904 | $_[0]->rbuf = ""; |
1024 | $_[0]->rbuf = ""; |
905 | }); |
1025 | }); |
906 | |
1026 | |
907 | This callback is invoked every time data arrives and the read queue is |
1027 | This callback is invoked every time data arrives and the read queue is |
908 | empty - which in this example will only be the case when both response and |
1028 | empty - which in this example will only be the case when both response and |
909 | header have been read. |
1029 | header have been read. The C<on_read> callback could actually have been |
|
|
1030 | specified when constructing the object, but doing it this way preserves |
|
|
1031 | logical ordering. |
910 | |
1032 | |
|
|
1033 | The read callback simply adds the current read buffer to its C<$body> |
|
|
1034 | variable and, most importantly, I<empties> the buffer by assigning the |
|
|
1035 | empty string to it. |
911 | |
1036 | |
912 | ############################################################################# |
1037 | After AnyEvent::Handle has been so instructed, it will handle incoming |
|
|
1038 | data according to these instructions - if all goes well, the callback will |
|
|
1039 | be invoked with the response data, if not, it will get an error. |
913 | |
1040 | |
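To make the division of labour between the two queued "line" reads and the C<on_read> callback more tangible, here is a small sketch in plain Perl that carves a complete response buffer into the same three parts. It does not use AnyEvent::Handle at all; C<split_http_response> is a hypothetical helper name, and the whole response is assumed to be in memory already, which a real event-based program cannot rely on.

```perl
use strict;
use warnings;

# Carve a complete HTTP/1.0 response into the same three parts that the
# two "line" reads and the on_read callback would produce.
sub split_http_response {
   my ($rbuf) = @_;

   # first queue entry: a single line, ended by \015\012
   $rbuf =~ s/^(.*?)\015\012//s
      or return;
   my $response = $1;

   # second queue entry: a "line" that ends at the first empty line
   $rbuf =~ s/^(.*?)\015\012\015\012//s
      or return;
   my $header = $1;

   # the read queue is now empty, so everything left is the body
   ($response, $header, $rbuf)
}

my ($response, $header, $body) = split_http_response (
   "HTTP/1.0 404 Not Found\015\012"
 . "Content-Type: text/html\015\012"
 . "Content-Length: 5\015\012"
 . "\015\012"
 . "oops\n"
);

print "status: $response\n";
print "header: $_\n" for split /\015\012/, $header;
print "body:   $body";
```

The order of the two substitutions mirrors the order of the queued read requests: each one removes exactly the data it is responsible for from the front of the buffer and leaves the rest for whatever comes next.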
914 | Now let's start with something simple: a program that reads from standard |
1041 | In general, you can implement pipelining (a semi-advanced feature of many |
915 | input in a non-blocking way, that is, in a way that lets your program do |
1042 | protocols) very easily with AnyEvent::Handle: If you have a protocol with a |
916 | other things while it is waiting for input. |
1043 | request/response structure, your request methods/functions will all look |
|
|
1044 | like this (simplified): |
917 | |
1045 | |
918 | First, the full program listing: |
1046 | sub request { |
919 | |
1047 | |
920 | #!/usr/bin/perl |
1048 | # send the request to the server |
|
|
1049 | $handle->push_write (...); |
921 | |
1050 | |
922 | use AnyEvent; |
1051 | # push some response handlers |
923 | use AnyEvent::Handle; |
1052 | $handle->push_read (...); |
|
|
1053 | } |
924 | |
1054 | |
925 | my $end_prog = AnyEvent->condvar; |
1055 | This means you can queue as many requests as you want, and while |
|
|
1056 | AnyEvent::Handle goes through its read queue to handle the response data, |
|
|
1057 | the other side can work on the next request - queueing the request just |
|
|
1058 | appends some data to the write queue and installs a handler to be called |
|
|
1059 | later. |
926 | |
1060 | |
927 | my $handle = |
1061 | You might ask yourself how to handle decisions you can only make I<after> |
928 | AnyEvent::Handle->new ( |
1062 | you have received some data (such as handling a short error response or a |
929 | fh => \*STDIN, |
1063 | long and differently-formatted response). The answer to this problem is |
930 | on_eof => sub { |
1064 | C<unshift_read>, which we will introduce together with an example in the |
931 | print "received EOF, exiting...\n"; |
1065 | coming sections. |
932 | $end_prog->broadcast; |
1066 | |
933 | }, |
1067 | =head3 Using C<http_get> |
934 | on_error => sub { |
1068 | |
935 | print "error while reading from STDIN: $!\n"; |
1069 | Finally, here is how you would use C<http_get>: |
936 | $end_prog->broadcast; |
1070 | |
|
|
1071 | http_get "www.google.com", "/", sub { |
|
|
1072 | my ($response, $header, $body) = @_; |
|
|
1073 | |
|
|
1074 | print |
|
|
1075 | $response, "\n", |
|
|
1076 | $body; |
|
|
1077 | }; |
|
|
1078 | |
|
|
1079 | And of course, you can run as many of these requests in parallel as you |
|
|
1080 | want (and your memory supports). |
|
|
1081 | |
|
|
1082 | =head3 HTTPS |
|
|
1083 | |
|
|
1084 | Now, as promised, let's implement the same thing for HTTPS, or more |
|
|
1085 | correctly, let's change our C<http_get> function into a function that |
|
|
1086 | speaks HTTPS instead. |
|
|
1087 | |
|
|
1088 | HTTPS is, quite simply, a standard TLS connection (B<T>ransport B<L>ayer |
|
|
1089 | B<S>ecurity is the official name for what most people refer to as C<SSL>) |
|
|
1090 | that contains standard HTTP protocol exchanges. The only other difference |
|
|
1091 | to HTTP is that by default it uses port C<443> instead of port C<80>. |
|
|
1092 | |
|
|
1093 | To implement these two differences we need two tiny changes. First, in the |
|
|
1094 | C<tcp_connect> call we replace C<http> by C<https>: |
|
|
1095 | |
|
|
1096 | tcp_connect $host, "https", sub { ... |
|
|
1097 | |
|
|
1098 | The other change deals with TLS, which is something L<AnyEvent::Handle> |
|
|
1099 | does for us, as long as I<you> made sure that the L<Net::SSLeay> module |
|
|
1100 | is around. To enable TLS with L<AnyEvent::Handle>, we simply pass an |
|
|
1101 | additional C<tls> parameter to the call to C<AnyEvent::Handle::new>: |
|
|
1102 | |
|
|
1103 | tls => "connect", |
|
|
1104 | |
|
|
1105 | Specifying C<tls> enables TLS, and the argument specifies whether |
|
|
1106 | AnyEvent::Handle is the server side ("accept") or the client side |
|
|
1107 | ("connect") for the TLS connection, as unlike TCP, there is a clear |
|
|
1108 | server/client relationship in TLS. |
|
|
1109 | |
|
|
1110 | That's all. |
|
|
1111 | |
|
|
1112 | Of course, all this should be handled transparently by C<http_get> |
|
|
1113 | after parsing the URL. If you need this, see the part about exercising |
|
|
1114 | your inspiration earlier in this document. You could also use the |
|
|
1115 | L<AnyEvent::HTTP> module from CPAN, which implements all this and works |
|
|
1116 | around a lot of quirks for you, too. |
|
|
1117 | |
|
|
1118 | =head3 The read queue - revisited |
|
|
1119 | |
|
|
1120 | HTTP always uses the same structure in its responses, but many protocols |
|
|
1121 | require parsing responses differently depending on the response itself. |
|
|
1122 | |
|
|
1123 | For example, in SMTP, you normally get a single response line: |
|
|
1124 | |
|
|
1125 | 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net> |
|
|
1126 | |
|
|
1127 | But SMTP also supports multi-line responses: |
|
|
1128 | |
|
|
1129 | 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net> |
|
|
1130 | 220-hey guys |
|
|
1131 | 220 my response is longer than yours |
|
|
1132 | |
|
|
1133 | To handle this, we need C<unshift_read>. As the name (hopefully) implies, |
|
|
1134 | C<unshift_read> will not append your read request to the end of the read |
|
|
1135 | queue, but instead it will prepend it to the queue. |
|
|
1136 | |
|
|
1137 | This is useful in the situation above: Just push your response-line read |
|
|
1138 | request when sending the SMTP command, and when handling it, you look at |
|
|
1139 | the line to see if more is to come, and C<unshift_read> another reader |
|
|
1140 | callback if required, like this: |
|
|
1141 | |
|
|
1142 | my $response; # response lines end up in here |
|
|
1143 | |
|
|
1144 | my $read_response; $read_response = sub { |
|
|
1145 | my ($handle, $line) = @_; |
|
|
1146 | |
|
|
1147 | $response .= "$line\n"; |
|
|
1148 | |
|
|
1149 | # check for continuation lines ("-" as 4th character) |
|
|
1150 | if ($line =~ /^...-/) { |
|
|
1151 | # if yes, then unshift another line read |
|
|
1152 | $handle->unshift_read (line => $read_response); |
|
|
1153 | |
|
|
1154 | } else { |
|
|
1155 | # otherwise we are done |
|
|
1156 | |
|
|
1157 | # free callback |
|
|
1158 | undef $read_response; |
937 | } |
1159 | |
|
|
1160 | print "we are done reading: $response\n"; |
938 | ); |
1161 | } |
|
|
1162 | }; |
|
|
1163 | |
|
|
1164 | $handle->push_read (line => $read_response); |
|
|
1165 | |
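The continuation-line test itself can be tried without a server. The sketch below applies the same C</^...-/> check to a list of lines that have already been split; C<collect_smtp_reply> is an illustrative name only, and the loop stands in for the repeated C<unshift_read> calls of the real, event-based version.

```perl
use strict;
use warnings;

# Collect one complete SMTP reply from a list of already-split lines.
# Returns the reply text and the number of lines consumed.
sub collect_smtp_reply {
   my (@lines) = @_;

   my $response = "";
   my $used     = 0;

   for my $line (@lines) {
      $response .= "$line\n";
      $used++;

      # "-" as 4th character means "more lines follow"
      last unless $line =~ /^...-/;
   }

   ($response, $used)
}

my ($reply, $count) = collect_smtp_reply (
   "220-mail.example.net ready",
   "220-hey guys",
   "220 my response is longer than yours",
   "250 this line belongs to the next reply",
);

print "consumed $count lines:\n$reply";
```

The last line of the list is deliberately left untouched: it belongs to the next reply, just as data behind the final response line stays in the read buffer for the next queued read request.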
|
|
1166 | This recipe can be used for all similar parsing problems, for example in |
|
|
1167 | NNTP, the response code to some commands indicates that more data will be |
|
|
1168 | sent: |
|
|
1169 | |
|
|
1170 | $handle->push_write ("article 42\015\012"); |
|
|
1171 | |
|
|
1172 | # read response line |
|
|
1173 | $handle->push_read (line => sub { |
|
|
1174 | my ($handle, $status) = @_; |
|
|
1175 | |
|
|
1176 | # article data following? |
|
|
1177 | if ($status =~ /^2/) { |
|
|
1178 | # yes, read article body |
|
|
1179 | |
|
|
1180 | $handle->unshift_read (line => "\012.\015\012", sub { |
|
|
1181 | my ($handle, $body) = @_; |
|
|
1182 | |
|
|
1183 | $finish->($status, $body); |
|
|
1184 | }); |
|
|
1185 | |
|
|
1186 | } else { |
|
|
1187 | # some error occurred, no article data |
|
|
1188 | |
|
|
1189 | $finish->($status); |
|
|
1190 | } |
|
|
1191 | }); |
|
|
1192 | |
|
|
1193 | =head3 Your own read queue handler |
|
|
1194 | |
|
|
1195 | Sometimes, your protocol doesn't play nice and uses lines or chunks of |
|
|
1196 | data not formatted in a way handled by AnyEvent::Handle out of the box. In |
|
|
1197 | this case you have to implement your own read parser. |
|
|
1198 | |
|
|
1199 | To make up a contorted example, imagine you are looking for an even |
|
|
1200 | number of characters followed by a colon (":"). Also imagine that |
|
|
1201 | AnyEvent::Handle had no C<regex> read type which could be used, so you'd |
|
|
1202 | have to do it manually. |
|
|
1203 | |
|
|
1204 | To implement a read handler for this, you would C<push_read> (or |
|
|
1205 | C<unshift_read>) just a single code reference. |
|
|
1206 | |
|
|
1207 | This code reference will then be called each time there is (new) data |
|
|
1208 | available in the read buffer, and is expected to either successfully |
|
|
1209 | eat/consume some of that data (and return true) or to return false to |
|
|
1210 | indicate that it wants to be called again. |
|
|
1211 | |
|
|
1212 | If the code reference returns true, then it will be removed from the |
|
|
1213 | read queue (because it has parsed/consumed whatever it was supposed to |
|
|
1214 | consume), otherwise it stays in the front of it. |
|
|
1215 | |
|
|
1216 | The example above could be coded like this: |
939 | |
1217 | |
940 | $handle->push_read (sub { |
1218 | $handle->push_read (sub { |
941 | my ($handle) = @_; |
1219 | my ($handle) = @_; |
942 | |
1220 | |
943 | if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) { |
1221 | # check for even number of characters + ":" |
944 | print "got 'end', existing...\n"; |
1222 | # and remove the data if a match is found. |
945 | $end_prog->broadcast; |
1223 | # if not, return false (actually nothing) |
|
|
1224 | |
|
|
1225 | $handle->{rbuf} =~ s/^( (?:..)* ) ://x |
946 | return 1 |
1226 | or return; |
|
|
1227 | |
|
|
1228 | # we got some data in $1, pass it to whoever wants it |
|
|
1229 | $finish->($1); |
|
|
1230 | |
|
|
1231 | # and return true to indicate we are done |
947 | } |
1232 | 1 |
948 | |
|
|
949 | 0 |
|
|
950 | }); |
1233 | }); |
951 | |
1234 | |
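The "return false until enough data has arrived" contract can be exercised without any event loop. In this sketch the read buffer is an ordinary scalar that is filled piece by piece, and C<take_even_chunk> (a name invented for this example) plays the role of the read handler, using the same regex as above.

```perl
use strict;
use warnings;

# Try to consume an even number of characters followed by ":" from the
# front of the buffer (passed by reference). Returns the consumed data
# on success, or nothing if the buffer does not (yet) contain a match.
sub take_even_chunk {
   my ($rbuf) = @_;

   $$rbuf =~ s/^( (?:..)* ) ://x
      or return;

   $1
}

my $rbuf = "";

for my $chunk ("ab", "c", "d:re", "st") { # data arriving in pieces
   $rbuf .= $chunk;

   if (defined (my $data = take_even_chunk (\$rbuf))) {
      print "got '$data', left in buffer: '$rbuf'\n";
   }
}
```

Note that the handler only removes what it has matched: anything after the colon stays in the buffer for the next entry in the read queue, and a buffer that never matches is left completely untouched.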
952 | $end_prog->recv; |
1235 | This concludes our little tutorial. |
953 | |
1236 | |
954 | That's a mouthful, so let's go through it step by step: |
1237 | =head1 Where to go from here? |
955 | |
1238 | |
956 | #!/usr/bin/perl |
1239 | This introduction should have explained the key concepts of L<AnyEvent> |
|
|
1240 | - event watchers and condition variables, L<AnyEvent::Socket> - basic |
|
|
1241 | networking utilities, and L<AnyEvent::Handle> - a nice wrapper around |
|
|
1242 | handles. |
957 | |
1243 | |
958 | use AnyEvent; |
1244 | You could either start coding stuff right away, look at those manual |
959 | use AnyEvent::Handle; |
1245 | pages for the gory details, or roam CPAN for other AnyEvent modules (such |
|
|
1246 | as L<AnyEvent::IRC> or L<AnyEvent::HTTP>) to see more code examples (or |
|
|
1247 | simply to use them). |
960 | |
1248 | |
961 | Nothing unexpected here, just load AnyEvent for the event functionality |
1249 | If you need a protocol that doesn't have an implementation using AnyEvent, |
962 | and AnyEvent::Handle for your file handling needs. |
1250 | remember that you can mix AnyEvent with one other event framework, such as |
|
|
1251 | L<POE>, so you can always use AnyEvent for your own tasks plus modules of |
|
|
1252 | one other event framework to fill any gaps. |
963 | |
1253 | |
964 | my $end_prog = AnyEvent->condvar; |
1254 | And last but not least, you could also look at L<Coro>, especially |
|
|
1255 | L<Coro::AnyEvent>, to see how you can turn event-based programming from |
|
|
1256 | callback style back to the usual imperative style (also called "inversion |
|
|
1257 | of control" - AnyEvent calls I<you>, but Coro lets I<you> call AnyEvent). |
965 | |
1258 | |
966 | Here the program creates a so-called 'condition variable': Condition |
1259 | =head1 Authors |
967 | variables are a great way to signal the completion of some event, or to |
|
|
968 | state that some condition became true (thus the name). |
|
|
969 | |
|
|
970 | This condition variable represents the condition that the program wants to |
|
|
971 | terminate. Later in the program, we will 'recv' that condition (call the |
|
|
972 | C<recv> method on it), which will wait until the condition gets signalled |
|
|
973 | (which is done by calling the C<send> method on it). |
|
|
974 | |
|
|
975 | The next step is to create the handle object: |
|
|
976 | |
|
|
977 | my $handle = |
|
|
978 | AnyEvent::Handle->new ( |
|
|
979 | fh => \*STDIN, |
|
|
980 | on_eof => sub { |
|
|
981 | print "received EOF, exiting...\n"; |
|
|
982 | $end_prog->broadcast; |
|
|
983 | }, |
|
|
984 | |
|
|
985 | This handle object will read from standard input. Setting the C<on_eof> |
|
|
986 | callback should be done for every file handle, as that is a condition that |
|
|
987 | we always need to check for when working with file handles, to prevent |
|
|
988 | reading or writing to a closed file handle, or getting stuck indefinitely |
|
|
989 | in case of an error. |
|
|
990 | |
|
|
991 | Speaking of errors: |
|
|
992 | |
|
|
993 | on_error => sub { |
|
|
994 | print "error while reading from STDIN: $!\n"; |
|
|
995 | $end_prog->broadcast; |
|
|
996 | } |
|
|
997 | ); |
|
|
998 | |
|
|
999 | The C<on_error> callback is also not required, but we set it here in case |
|
|
1000 | any error happens when we read from the file handle. It is usually a good |
|
|
1001 | idea to set this callback and at least print some diagnostic message: Even |
|
|
1002 | in our small example an error can happen. More on this later... |
|
|
1003 | |
|
|
1004 | $handle->push_read (sub { |
|
|
1005 | |
|
|
1006 | Next we push a general read callback on the read queue, which |
|
|
1007 | will wait until we have received all the data we wanted to |
|
|
1008 | receive. L<AnyEvent::Handle> has two queues per file handle, a read and a |
|
|
1009 | write queue. The write queue queues pending data that waits to be written |
|
|
1010 | to the file handle. And the read queue queues reading callbacks. For more |
|
|
1011 | details see the documentation L<AnyEvent::Handle> about the READ QUEUE and |
|
|
1012 | WRITE QUEUE. |
|
|
1013 | |
|
|
1014 | my ($handle) = @_; |
|
|
1015 | |
|
|
1016 | if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) { |
|
|
1017 | print "got 'end', existing...\n"; |
|
|
1018 | $end_prog->broadcast; |
|
|
1019 | return 1 |
|
|
1020 | } |
|
|
1021 | |
|
|
1022 | 0 |
|
|
1023 | }); |
|
|
1024 | |
|
|
1025 | The actual callback waits until the word 'end' has been seen in the data |
|
|
1026 | received on standard input. Once we encounter the stop word 'end' we |
|
|
1027 | remove everything from the read buffer and call the condition variable |
|
|
1028 | we setup earlier, that signals our 'end of program' condition. And the |
|
|
1029 | callback returns with a true value, that signals we are done with reading |
|
|
1030 | all the data we were interested in (all data until the word 'end' has been |
|
|
1031 | seen). |
|
|
1032 | |
|
|
1033 | In all other cases, when the stop word has not been seen yet, we just |
|
|
1034 | return a false value, to indicate that we are not finished yet. |
|
|
1035 | |
|
|
1036 | The C<rbuf> method returns our read buffer, that we can directly modify as |
|
|
1037 | lvalue. Alternatively we also could have written: |
|
|
1038 | |
|
|
1039 | if ($handle->{rbuf} =~ s/^.*?\bend\b.*$//s) { |
|
|
1040 | |
|
|
1041 | The last line will wait for the condition that our program wants to exit: |
|
|
1042 | |
|
|
1043 | $end_prog->recv; |
|
|
1044 | |
|
|
1045 | The call to C<recv> will setup an event loop for us and wait for IO, timer |
|
|
1046 | or signal events and will handle them until the condition gets sent (by |
|
|
1047 | calling its C<send> method). |
|
|
1048 | |
|
|
1049 | The key points to learn from this example are: |
|
|
1050 | |
|
|
1051 | =over 4 |
|
|
1052 | |
|
|
1053 | =item * Condition variables are used to start an event loop. |
|
|
1054 | |
|
|
1055 | =item * How to registering some basic callbacks on AnyEvent::Handle's. |
|
|
1056 | |
|
|
1057 | =item * How to process data in the read buffer. |
|
|
1058 | |
|
|
1059 | =back |
|
|
1060 | |
|
|
1061 | =head1 AUTHORS |
|
|
1062 | |
1260 | |
1063 | Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>. |
1261 | Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>. |
1064 | |
1262 | |