=head1 Introduction to AnyEvent

This is a tutorial that will introduce you to the features of AnyEvent.

The first part introduces the core AnyEvent module (after swamping you a
bit in evangelism), which might already provide all you ever need.

The second part focuses on network programming using sockets, for which
AnyEvent offers a lot of support you can use.

=head1 What is AnyEvent?

If you don't care for the whys and want to see code, skip this section!

AnyEvent is first of all just a framework to do event-based
programming. Typically such frameworks are an all-or-nothing thing: If you
use one such framework, you can't (easily, or even at all) use another in
the same program.

AnyEvent is different - it is a thin abstraction layer above all kinds
of event loops. Its main purpose is to move the choice of the underlying
framework (the event loop) from the module author to the program author
using the module.

That means you can write code that uses events to control what it
does, without forcing other code in the same program to use the same
underlying framework as you do - i.e. you can create a Perl module
that is event-based using AnyEvent, and users of that module can still
choose between using L, L, L or no event loop at all: AnyEvent
comes with its own event loop implementation, so your code works
regardless of other modules that might or might not be installed. The
latter is important, as AnyEvent does not have any dependencies on other
modules, which makes it easy to install, for example, when you lack a C
compiler.

A typical problem with Perl modules such as L<Net::IRC> is that they come
with their own event loop: In L<Net::IRC>, the program that uses it needs
to start the event loop of L<Net::IRC>. That means that one cannot
integrate this module into a L<Gtk2> GUI for instance, as that module,
too, enforces the use of its own event loop (namely L<Glib>).

Another example is L<LWP>: it provides no event interface at all. It's a
pure blocking HTTP (and FTP etc.)
client library, which usually means that you either have to start a
thread or have to fork for an HTTP request, or use L, if you want to do
something else while waiting for the request to finish.

The motivation behind these designs is often that a module doesn't want to
depend on some complicated XS-module (Net::IRC), or that it doesn't want
to force the user to use some specific event loop at all (LWP).

L<AnyEvent> solves this dilemma by B<not> forcing module authors to either

=over 4

=item write their own event loop (because AnyEvent guarantees to offer one
everywhere - even on Windows).

=item choose one fixed event loop (because AnyEvent works with all
important event loops available for Perl, and adding others is trivial).

=back

If the module author uses L<AnyEvent> for all his event needs (IO events,
timers, signals, ...) then all other modules can just use his module and
don't have to choose an event loop or adapt to his event loop. The choice
of the event loop is ultimately made by the program author who uses all
the modules and writes the main program. And even there he doesn't have to
choose, he can just let L<AnyEvent> choose the best available event loop
for him.

Read more about this in the main documentation of the L<AnyEvent> module.

=head1 Introduction to Event-Based Programming

So what exactly is programming using events?
It quite simply means that instead of your code actively waiting for
something, such as the user entering something on STDIN:

   $| = 1; print "enter your name> ";

   my $name = <STDIN>;

You instead tell your event framework to notify you in the event of some
data being available on STDIN, by using a callback mechanism:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN, # which file handle to check
      poll => "r",     # which event to wait for ("r"ead data)
      cb   => sub {    # what callback to execute
         $name = <STDIN>; # read it
      }
   );

   # do something else here

This looks more complicated, and surely is, but the advantage of using
events is that your program can do something else instead of waiting for
input.

Waiting as in the first example is also called "blocking" because you
"block" your process from executing anything else while you do so.

The second example avoids blocking by only registering interest in a read
event, which is fast and doesn't block your process. Only when read data
is available will the callback be called, which can then proceed to read
the data.

The "interest" is represented by an object returned by C<< AnyEvent->io
>> called a "watcher" object - called like that because it "watches" your
file handle (or other event sources) for the event you are interested in.

In the example above, we create an I/O watcher by calling the C<<
AnyEvent->io >> method. Disinterest in some event is simply expressed by
forgetting about the watcher, for example, by C<undef>'ing the variable it
is stored in. AnyEvent will automatically clean up the watcher if it is no
longer used, much like Perl closes your file handles if you no longer use
them anywhere.

=head2 Condition Variables

However, the above is not a fully working program, and will not work
as-is. The reason is that your callback will not be invoked out of the
blue, you have to run the event loop.
Also, event-based programs sometimes have to block, too: when there is
simply nothing else to do and everything is waiting for events, the
process has to block as well.

In AnyEvent, this is done using condition variables. Condition variables
are named "condition variables" because they represent a condition that is
initially false and needs to be fulfilled.

You can also call them "merge points", "sync points", "rendezvous ports"
or even callbacks and many other things (and they are often called that
in other frameworks). The important point is that you can create them
freely and later wait for them to become true.

Condition variables have two sides - one side is the "producer" of the
condition (whatever code detects the condition), the other side is the
"consumer" (the code that waits for that condition).

In our example in the previous section, the producer is the event callback
and there is no consumer yet - let's change that now:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN,
      poll => "r",
      cb   => sub {
         $name = <STDIN>;
         $name_ready->send;
      }
   );

   # do something else here

   # now wait until the name is available:
   $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";

This program creates an AnyEvent condvar by calling the C<<
AnyEvent->condvar >> method. It then creates a watcher as usual, but
inside the callback it C<send>'s the C<$name_ready> condition variable,
which causes anybody waiting on it to continue.

The "anybody" in this case is the code that follows, which calls C<<
$name_ready->recv >>: The producer calls C<send>, the consumer calls
C<recv>.

If there is no C<$name> available yet, then the call to C<<
$name_ready->recv >> will halt your program until the condition becomes
true.

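
The two sides of a condition variable can also be tried out without any
watcher at all. The following is only a minimal sketch, not part of the
examples above, and it assumes an AnyEvent version whose condition
variables provide the C<ready> method. It shows that the condition starts
out false, becomes true once the producer calls C<send>, and that C<recv>
returns at once when the condition is already fulfilled:

```perl
use strict;
use warnings;
use AnyEvent;

my $flag = AnyEvent->condvar;

# initially the condition is false
my $before = $flag->ready ? 1 : 0;

# producer side: fulfil the condition
$flag->send;

# now the condition is true
my $after = $flag->ready ? 1 : 0;

# consumer side: does not block, because send was already called
$flag->recv;

print "ready before send: $before, after send: $after\n";
```

Because the condition is already true when C<recv> is called, no event
loop iteration is needed here; in the tutorial's example the same call
blocks until the I/O callback has run.
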
As the names C<send> and C<recv> imply, you can actually send and receive
data using this. For example, the above code could also be written like
this, without an extra variable to store the name in:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub { $name_ready->send (scalar <STDIN>) }
   );

   # do something else here

   # now wait and fetch the name
   my $name = $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";

You can pass any number of arguments to C<send>, and every subsequent
call to C<recv> will return them.

=head2 The "main loop"

Most event-based frameworks have something called a "main loop" or "event
loop run function" or something similar.

Just like C<recv> in AnyEvent, these functions need to be called
eventually so that your event loop has a chance of actually looking for
those events you are interested in.

For example, in a L<Gtk2> program, the above example could also be written
like this:

   use Gtk2 -init;
   use AnyEvent;

   ############################################
   # create a window and some label

   my $window = new Gtk2::Window "toplevel";
   $window->add (my $label = new Gtk2::Label "soon replaced by name");

   $window->show_all;

   ############################################
   # do our AnyEvent stuff

   $| = 1; print "enter your name> ";

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub {
         # set the label
         $label->set_text (scalar <STDIN>);
         print "enter another name> ";
      }
   );

   ############################################
   # Now enter Gtk2's event loop

   main Gtk2;

No condition variable anywhere in sight - instead, we just read a line
from STDIN and replace the text in the label. In fact, since nobody
C<undef>'s C<$wait_for_input> you can enter multiple lines.
Instead of waiting for a condition variable, the program enters the Gtk2
main loop by calling C<< Gtk2->main >>, which will block the program and
wait for events to arrive.

This also shows that AnyEvent is quite flexible - you didn't have to do
anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it just
worked.

Admittedly, the example is a bit silly - who would want to read names
from standard input in a Gtk+ application? But imagine that instead of
doing that, you would make an HTTP request in the background and display
its results. In fact, with event-based programming you can make many
HTTP requests in parallel in your program and still provide feedback to
the user and stay interactive.

In the next part you will see how to do just that - by implementing an
HTTP request, on our own, with the utility modules AnyEvent comes with.

Before that, however, let's briefly look at how you would write your
program using only AnyEvent, without ever calling some other event
loop's run function.

In the examples using condition variables, we did exactly that, and in
fact, this is the solution:

   my $quit_program = AnyEvent->condvar;

   # create AnyEvent watchers (or not) here

   $quit_program->recv;

If any of your watcher callbacks decide to quit, they can simply call
C<< $quit_program->send >>. Of course, they could also decide not to and
simply call C<exit> instead, or they could decide not to quit, ever (e.g.
in a long-running daemon program).

In that case, you can simply use:

   AnyEvent->condvar->recv;

And this is, in fact, the closest thing to a main loop run function that
AnyEvent offers.

=head2 Timers and other event sources

So far, we have only used I/O watchers. These are useful mainly to find
out whether a socket has data to read, or space to write more data.
On sane operating systems this also works for console windows/terminals
(typically on standard input), serial lines, all sorts of other devices,
basically almost everything that has a file descriptor but isn't a file
itself. (As usual, "sane" excludes Windows - on that platform you would
need different functions for all of these, complicating code immensely -
think "socket only" on Windows).

However, I/O is not everything - the second most important event source
is the clock. For example, when doing an HTTP request you might want to
time out when the server doesn't answer within some predefined amount of
time.

In AnyEvent, timer event watchers are created by calling the C<<
AnyEvent->timer >> method:

   use AnyEvent;

   my $cv = AnyEvent->condvar;

   my $wait_one_and_a_half_seconds = AnyEvent->timer (
      after => 1.5,  # after how many seconds to invoke the cb?
      cb    => sub { # the callback to invoke
         $cv->send;
      },
   );

   # can do something else here

   # now wait till our time has come
   $cv->recv;

Unlike I/O watchers, timers are only interested in the number of seconds
they have to wait. When that amount of time has passed, AnyEvent will
invoke your callback.

Unlike I/O watchers, which will call your callback as many times as there
is data available, timers are one-shot: after they have "fired" once and
invoked your callback, they are dead and no longer do anything.

To get a repeating timer, such as a timer firing roughly once per second,
you have to recreate it:

   use AnyEvent;

   my $time_watcher;

   sub once_per_second {
      print "tick\n";

      # (re-)create the watcher
      $time_watcher = AnyEvent->timer (
         after => 1,
         cb    => \&once_per_second,
      );
   }

   # now start the timer
   once_per_second;

Having to recreate your timer is a restriction put on AnyEvent that is
present in most event libraries it uses. It is so annoying that some
future version might work around this limitation, but right now, it's the
only way to do repeating timers.
Fortunately most timers aren't really repeating but specify timeouts of
some sort.

=head3 More esoteric sources

AnyEvent also has some other, more esoteric event sources you can tap
into: signal and child watchers.

Signal watchers can be used to wait for "signal events", which simply
means your process was sent a signal (such as C or C).

Child watchers wait for a child process to exit. They are useful when you
fork a separate process and need to know when it exits, but you do not
want to wait for that by blocking.

Both watcher types are described in detail in the main L<AnyEvent> manual
page.

=head1 Network programming and AnyEvent

So far you have seen how to register event watchers and handle events.

This is a great foundation to write network clients and servers, and
might be all that your module (or program) ever requires, but writing
your own I/O buffering again and again becomes tedious, not to mention
that it attracts errors.

While the core L<AnyEvent> module is still small and self-contained, the
distribution comes with some very useful utility modules such as
L<AnyEvent::DNS>, L<AnyEvent::Handle> and L<AnyEvent::Socket>. These can
make your life as a non-blocking network programmer a lot easier.

Here is a quick overview of these three modules:

=head2 L<AnyEvent::DNS>

This module allows fully asynchronous DNS resolution. It is used mainly
by L<AnyEvent::Socket> to resolve hostnames and service ports for you,
but is a great way to do other DNS resolution tasks, such as reverse
lookups of IP addresses for log files.

=head2 L<AnyEvent::Handle>

This module handles non-blocking IO on file handles in an event-based
manner. It provides a wrapper object around your file handle that
provides queueing and buffering of incoming and outgoing data for you.

It also implements the most common data formats, such as text lines, or
fixed and variable-width data blocks.

=head2 L<AnyEvent::Socket>

This module provides you with functions that handle socket creation and
IP address magic. The two main functions are C<tcp_connect> and
C<tcp_server>.
The former will connect a (streaming) socket to an internet host for you
and the latter will make a server socket for you, to accept connections.

This module also comes with transparent IPv6 support. This means: if you
write your programs with this module, you will be IPv6 ready without
doing anything special.

It also works around a lot of portability quirks (especially on the
Windows platform), which makes it even easier to write your programs in a
portable way (did you know that Windows uses different error codes for
all socket functions and that Perl does not know about these? That
"Unknown error 10022" (which is C<WSAEINVAL>) can mean that our
C<connect> call was successful? That unsuccessful TCP connects might
never be reported back to your program? That C<WSAEINPROGRESS> means your
C<connect> call was ignored instead of being in progress? AnyEvent::Socket
works around all of these Windows/Perl bugs for you).

=head2 First experiments with non-blocking connects: a parallel finger client.

The finger protocol is one of the simplest protocols in use on the
internet. Or in use in the past, as almost nobody uses it anymore.

It works by connecting to the finger port on another host, writing a
single line with a user name and then reading the finger response, as
specified by that user. OK, RFC 1288 specifies a vastly more complex
protocol, but it basically boils down to this:

   # telnet idsoftware.com finger
   Trying 192.246.40.37...
   Connected to idsoftware.com (192.246.40.37).
   Escape character is '^]'.
   johnc
   Welcome to id Software's Finger Service V1.5!

   [...]
   Now on the web:
   [...]

   Connection closed by foreign host.

Yeah, I<was> used, indeed, but at least the finger daemon still works, so
let's write a little AnyEvent function that makes a finger request:

   use AnyEvent;
   use AnyEvent::Socket;

   sub finger($$) {
      my ($user, $host) = @_;

      # use a condvar to return results
      my $cv = AnyEvent->condvar;

      # first, connect to the host
      tcp_connect $host, "finger", sub {
         # the callback receives the socket handle - or nothing
         my ($fh) = @_
            or return $cv->send;

         # now write the username
         syswrite $fh, "$user\015\012";

         my $response;

         # register a read watcher
         my $read_watcher; $read_watcher = AnyEvent->io (
            fh   => $fh,
            poll => "r",
            cb   => sub {
               my $len = sysread $fh, $response, 1024, length $response;

               if ($len <= 0) {
                  # we are done, or an error occurred, let's ignore the latter
                  undef $read_watcher; # no longer interested
                  $cv->send ($response); # send results
               }
            },
         );
      };

      # pass $cv to the caller
      $cv
   }

That's a mouthful! Let's dissect this function a bit, first the overall
function:

   sub finger($$) {
      my ($user, $host) = @_;

      # use a condvar to return results
      my $cv = AnyEvent->condvar;

      # first, connect to the host
      tcp_connect $host, "finger", sub {
         ...
      };

      $cv
   }

This isn't too complicated, just a function with two parameters, which
creates a condition variable, returns it, and while it does that,
initiates a TCP connect to C<$host>. The condition variable will be used
by the caller to receive the finger response.

Since we are event-based programmers, we do not wait for the connect to
finish - it could block your program for a minute or longer! Instead, we
pass the callback it should invoke when the connect is done to
C<tcp_connect>. If it is successful, our callback gets called with the
socket handle as first argument, otherwise, nothing will be passed to our
callback.

Let's look at our callback in more detail:

   # the callback gets the socket handle - or nothing
   my ($fh) = @_
      or return $cv->send;

The first thing the callback does is indeed save the socket handle in
C<$fh>.
When there was an error (no arguments), then our instinct as expert Perl
programmers would tell us to die:

   my ($fh) = @_
      or die "$host: $!";

While this would give good feedback to the user, our program would
probably freeze here, as we never report the results to anybody,
certainly not the caller of our C<finger> function!

This is why we instead return, but also call C<< $cv->send >> without any
arguments to signal to our consumer that something bad has happened. The
return value of C<< $cv->send >> is irrelevant, as is the return value of
our callback. The return statement is simply used for the side effect of,
well, returning immediately from the callback.

As the next step in the finger protocol, we send the username to the
finger daemon on the other side of our connection:

   syswrite $fh, "$user\015\012";

Note that this isn't 100% clean - the socket could, for whatever reason,
not accept our data. When writing a small amount of data like in this
example it doesn't matter, but for real-world cases you might need to
implement some kind of write buffering - or use L<AnyEvent::Handle>, which
handles these matters for you.

What we do have to do is to implement our own read buffer - the response
data could arrive late or in multiple chunks, and we cannot just wait for
it (event-based programming, you know?).

To do that, we register a read watcher on the socket which waits for
data:

   my $read_watcher; $read_watcher = AnyEvent->io (
      fh   => $fh,
      poll => "r",

There is a trick here, however: the read watcher isn't stored in a global
variable, but in a local one - if the callback returns, it would normally
destroy the variable and its contents, which would in turn unregister our
watcher.

To avoid that, the watcher callback itself refers to the variable (it
C<undef>s it when done). This means that, when the C<tcp_connect>
callback returns, perl thinks (quite correctly) that the read watcher is
still in use - namely in the callback.

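
The lifetime trick can be observed in isolation with timers instead of
sockets. The following is a small sketch under that assumption; the
helpers C<fire_lost> and C<fire_kept> are hypothetical names made up for
this illustration. One watcher is stored in a lexical that nothing else
refers to, the other in a lexical that its own callback mentions:

```perl
use strict;
use warnings;
use AnyEvent;

my $done       = AnyEvent->condvar;
my $lost_fired = 0;

# hypothetical helper: nothing but $w refers to the watcher, so it is
# destroyed (and the timer cancelled) as soon as the sub returns
sub fire_lost {
   my $w = AnyEvent->timer (after => 0.05, cb => sub { $lost_fired = 1 });
   return;
}

# hypothetical helper: the callback mentions $w, so the closure keeps
# the watcher alive until the callback undefines it
sub fire_kept {
   my $w; $w = AnyEvent->timer (after => 0.2, cb => sub {
      undef $w;             # no longer interested, free the watcher
      $done->send ("kept"); # report to the consumer
   });
   return;
}

fire_lost ();
fire_kept ();

my $result = $done->recv;

print "result: $result, lost timer fired: $lost_fired\n";
```

Only the second timer ever fires. The price of the trick is a reference
cycle between watcher and callback, which is why the callback has to
C<undef> the variable once it is done.
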
The callback itself calls C<sysread> as many times as necessary, until
C<sysread> returns an error or end-of-file:

   cb => sub {
      my $len = sysread $fh, $response, 1024, length $response;

      if ($len <= 0) {

Note that C<sysread> has the ability to append data it reads to a scalar,
which is what we make good use of in this example.

When C<sysread> indicates we are done, the callback C<undef>ines the
watcher and then C<send>'s the response data to the condition variable.
All this has the following effects:

Undefining the watcher destroys it, as our callback was the only one
still having a reference to it. When the watcher gets destroyed, it
destroys the callback, which in turn means the C<$fh> handle is no longer
used, so that gets destroyed as well. The result is that all resources
will be nicely cleaned up by perl for us.

=head3 Using the finger client

Now, we could probably write the same finger client in a simpler way if
we used C, ignored the problem of multiple hosts and ignored IPv6 and a
few other things that C<tcp_connect> handles for us.

But the main advantage is that we can not only run this finger function
in the background, we can even run multiple sessions in parallel, like
this:

   my $f1 = finger "trouble", "noc.dfn.de";     # check for trouble tickets
   my $f2 = finger "1736"   , "noc.dfn.de";     # fetch ticket 1736
   my $f3 = finger "johnc"  , "idsoftware.com"; # finger john

   print "trouble tickets:\n", $f1->recv, "\n";
   print "trouble ticket #1736:\n", $f2->recv, "\n";
   print "john carmacks finger file: ", $f3->recv, "\n";

It doesn't look like it, but in fact all three requests run in parallel.
The code waits for the first finger request to finish first, but that
doesn't keep the others from executing in parallel, because when the
first C<recv> call sees that the data isn't ready yet, it serves events
for all three requests automatically.

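
That the requests really overlap can be checked without any network
access. The sketch below is an illustration only: C<fake_request> is a
hypothetical stand-in for C<finger> that "answers" after a fixed delay by
means of a timer, and the delays are arbitrary values chosen for the
demonstration:

```perl
use strict;
use warnings;
use AnyEvent;
use Time::HiRes qw(time);

# hypothetical stand-in for finger(): "answers" after $delay seconds
sub fake_request {
   my ($name, $delay) = @_;

   my $cv = AnyEvent->condvar;

   my $w; $w = AnyEvent->timer (after => $delay, cb => sub {
      undef $w;                 # free the watcher
      $cv->send ("$name done"); # deliver the "response"
   });

   $cv
}

my $start = time;

# start all three requests before waiting for any of them
my @pending = (
   fake_request ("a", 0.4),
   fake_request ("b", 0.2),
   fake_request ("c", 0.3),
);

# waiting for the first one also serves the events of the others
my @results = map { $_->recv } @pending;
my $elapsed = time - $start;

printf "%s in %.1f seconds\n", (join ", ", @results), $elapsed;
```

The delays add up to 0.9 seconds, but the whole run should take roughly
as long as the slowest request alone, because the second and third
condition variables are usually already fulfilled by the time their
C<recv> is called.
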
By taking advantage of network latencies, which allows us to serve other
requests and events while we wait for an event on one socket, the overall
time to do these three requests will be greatly reduced: typically all
three are done in the same time as the slowest of them would use.

By the way, you do not actually have to wait in the C<recv> method on an
AnyEvent condition variable, you can also register a callback:

   $cv->cb (sub {
      my $response = shift->recv;
      # ...
   });

The callback will only be invoked when C<send> was called. In fact,
instead of returning a condition variable you could also pass a third
parameter to your finger function, the callback to invoke with the
response:

   sub finger($$$) {
      my ($user, $host, $cb) = @_;

What you use is a matter of taste - if you expect your function to be
used mainly in an event-based program you would normally prefer to pass a
callback directly.

=head3 Criticism and fix

To make this example more real-world-ready, we would not only implement
some write buffering (for the paranoid), but we would also have to handle
timeouts and maybe protocol errors.

This quickly gets unwieldy, which is why we introduce L<AnyEvent::Handle>
in the next section, which takes care of all these details for us.

=head2 First experiments with AnyEvent::Handle

Now let's start with something simple: a program that reads from standard
input in a non-blocking way, that is, in a way that lets your program do
other things while it is waiting for input.
First, the full program listing:

   #!/usr/bin/perl

   use AnyEvent;
   use AnyEvent::Handle;

   my $end_prog = AnyEvent->condvar;

   my $handle =
      AnyEvent::Handle->new (
         fh => \*STDIN,
         on_eof => sub {
            print "received EOF, exiting...\n";
            $end_prog->send;
         },
         on_error => sub {
            print "error while reading from STDIN: $!\n";
            $end_prog->send;
         }
      );

   $handle->push_read (sub {
      my ($handle) = @_;

      if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) {
         print "got 'end', exiting...\n";
         $end_prog->send;
         return 1
      }

      0
   });

   $end_prog->recv;

That's a mouthful, so let's go through it step by step:

   #!/usr/bin/perl

   use AnyEvent;
   use AnyEvent::Handle;

Nothing unexpected here, just load AnyEvent for the event functionality
and AnyEvent::Handle for your file handling needs.

   my $end_prog = AnyEvent->condvar;

Here the program creates a so-called 'condition variable': Condition
variables are a great way to signal the completion of some event, or to
state that some condition became true (thus the name).

This condition variable represents the condition that the program wants
to terminate. Later in the program, we will 'recv' that condition (call
the C<recv> method on it), which will wait until the condition gets
signalled (which is done by calling the C<send> method on it).

The next step is to create the handle object:

   my $handle =
      AnyEvent::Handle->new (
         fh => \*STDIN,
         on_eof => sub {
            print "received EOF, exiting...\n";
            $end_prog->send;
         },

This handle object will read from standard input. Setting the C<on_eof>
callback should be done for every file handle, as that is a condition
that we always need to check for when working with file handles, to
prevent reading or writing to a closed file handle, or getting stuck
indefinitely in case of an error.

Speaking of errors:

         on_error => sub {
            print "error while reading from STDIN: $!\n";
            $end_prog->send;
         }
      );

The C<on_error> callback is also not required, but we set it here in case
any error happens when we read from the file handle.
It is usually a good idea to set this callback and at least print some
diagnostic message: Even in our small example an error can happen. More
on this later...

   $handle->push_read (sub {

Next we push a general read callback on the read queue, which will wait
until we have received all the data we wanted to receive.
L<AnyEvent::Handle> has two queues per file handle, a read and a write
queue. The write queue queues pending data that waits to be written to
the file handle. And the read queue queues reading callbacks. For more
details see the documentation of L<AnyEvent::Handle> about the READ QUEUE
and WRITE QUEUE.

      my ($handle) = @_;

      if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) {
         print "got 'end', exiting...\n";
         $end_prog->send;
         return 1
      }

      0
   });

The actual callback waits until the word 'end' has been seen in the data
received on standard input. Once we encounter the stop word 'end' we
remove everything from the read buffer and signal the condition variable
we set up earlier, which represents our 'end of program' condition. The
callback then returns a true value, which signals that we are done
reading all the data we were interested in (all data until the word 'end'
has been seen).

In all other cases, when the stop word has not been seen yet, we just
return a false value, to indicate that we are not finished yet.

The C<rbuf> method returns our read buffer, which we can directly modify
as an lvalue. Alternatively we could also have written:

   if ($handle->{rbuf} =~ s/^.*?\bend\b.*$//s) {

The last line will wait for the condition that our program wants to exit:

   $end_prog->recv;

The call to C<recv> will set up an event loop for us and wait for IO,
timer or signal events and will handle them until the condition gets sent
(by calling its C<send> method).

The key points to learn from this example are:

=over 4

=item * Condition variables are used to start an event loop.

=item * How to register some basic callbacks on AnyEvent::Handle objects.

=item * How to process data in the read buffer.

=back

=head1 AUTHORS

Robin Redeker, Marc Lehmann.