lib/AnyEvent/Intro.pod

=head1 Introduction to AnyEvent

This is a tutorial that will introduce you to the features of AnyEvent.

The first part introduces the core AnyEvent module (after swamping you a
bit in evangelism), which might already provide all you ever need.

The second part focuses on network programming using sockets, for which
AnyEvent offers a lot of support you can use.


=head1 What is AnyEvent?

Skip this section if you want to see code, now!

AnyEvent is first of all just a framework to do event-based
programming. Typically such frameworks are an all-or-nothing thing: If you
use one such framework, you can't (easily, or even at all) use another in
the same program.

AnyEvent is different - it is a thin abstraction layer above all kinds
of event loops. Its main purpose is to move the choice of the underlying
framework (the event loop) from the module author to the program author
using the module.

That means you can write code that uses events to control what it
does, without forcing other code in the same program to use the same
underlying framework as you do - i.e. you can create a Perl module
that is event-based using AnyEvent, and users of that module can still
choose between using L<Gtk2>, L<Tk>, L<Event> or no event loop at
all: AnyEvent comes with its own event loop implementation, so your
code works regardless of other modules that might or might not be
installed. The latter is important, as AnyEvent does not have any
dependencies on other modules, which makes it easy to install, for
example, when you lack a C compiler.

A typical problem with Perl modules such as L<Net::IRC> is that they
come with their own event loop: In L<Net::IRC>, the program that uses it
needs to start the event loop of L<Net::IRC>. That means that one cannot
integrate this module into a L<Gtk2> GUI for instance, as that module,
too, enforces the use of its own event loop (namely L<Glib>).

Another example is L<LWP>: it provides no event interface at all. It's a
pure blocking HTTP (and FTP etc.) client library, which usually means that
you either have to start a thread or have to fork for an HTTP request, or
use L<Coro::LWP>, if you want to do something else while waiting for the
request to finish.

The motivation behind these designs is often that a module doesn't want to
depend on some complicated XS-module (Net::IRC), or that it doesn't want
to force the user to use some specific event loop at all (LWP).

L<AnyEvent> solves this dilemma, by B<not> forcing module authors to either

=over 4

=item write their own event loop (because AnyEvent guarantees to offer one
everywhere - even on Windows).

=item choose one fixed event loop (because AnyEvent works with all
important event loops available for Perl, and adding others is trivial).

=back

If the module author uses L<AnyEvent> for all his event needs (IO events,
timers, signals, ...) then all other modules can just use his module and
don't have to choose an event loop or adapt to his event loop. The choice
of the event loop is ultimately made by the program author who uses all
the modules and writes the main program. And even there he doesn't have to
choose, he can just let L<AnyEvent> choose the best available event loop
for him.

Read more about this in the main documentation of the L<AnyEvent> module.


=head1 Introduction to Event-Based Programming

So what exactly is programming using events? It quite simply means that
instead of your code actively waiting for something, such as the user
entering something on STDIN:

   $| = 1; print "enter your name> ";

   my $name = <STDIN>;

You instead tell your event framework to notify you in the event of some
data being available on STDIN, by using a callback mechanism:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN, # which file handle to check
      poll => "r",     # which event to wait for ("r"ead data)
      cb   => sub {    # what callback to execute
         $name = <STDIN>; # read it
      }
   );

   # do something else here

Looks more complicated, and surely is, but the advantage of using events
is that your program can do something else instead of waiting for
input. Waiting as in the first example is also called "blocking" because
you "block" your process from executing anything else while you do so.

The second example avoids blocking, by only registering interest in a read
event, which is fast and doesn't block your process. Only when read data
is available will the callback be called, which can then proceed to read
the data.

The "interest" is represented by an object returned by C<< AnyEvent->io
>> called a "watcher" object - so called because it "watches" your
file handle (or other event sources) for the event you are interested in.

In the example above, we create an I/O watcher by calling the C<<
AnyEvent->io >> method. Disinterest in some event is simply expressed by
forgetting about the watcher, for example, by C<undef>'ing the variable it
is stored in. AnyEvent will automatically clean up the watcher if it is no
longer used, much like Perl closes your file handles if you no longer use
them anywhere.

=head2 Condition Variables

However, the above is not a fully working program, and will not work
as-is. The reason is that your callback will not be invoked out of the
blue: you have to run the event loop. Also, event-based programs sometimes
have to block, too: when there is simply nothing else to do and everything
is waiting for some event, the process has to block until one arrives.

In AnyEvent, this is done using condition variables. Condition variables
are named "condition variables" because they represent a condition that is
initially false and needs to be fulfilled.

You can also call them mergepoints, syncpoints, rendezvous ports or even
callbacks and many other things (and they often go by such names in
other frameworks). The important point is that you can create them freely
and later wait for them to become true.

Condition variables have two sides - one side is the "producer" of the
condition (whatever code detects the condition), the other side is the
"consumer" (the code that waits for that condition).

In our example in the previous section, the producer is the event callback
and there is no consumer yet - let's change that now:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name;

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh   => \*STDIN,
      poll => "r",
      cb   => sub {
         $name = <STDIN>;
         $name_ready->send;
      }
   );

   # do something else here

   # now wait until the name is available:
   $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";

This program creates an AnyEvent condvar by calling the C<<
AnyEvent->condvar >> method. It then creates a watcher as usual, but
inside the callback it C<send>'s the C<$name_ready> condition variable,
which causes anybody waiting on it to continue.

The "anybody" in this case is the code that follows, which calls C<<
$name_ready->recv >>: The producer calls C<send>, the consumer calls
C<recv>.

If there is no C<$name> available yet, then the call to C<<
$name_ready->recv >> will halt your program until the condition becomes
true.

As the names C<send> and C<recv> imply, you can actually send and receive
data using this, for example, the above code could also be written like
this, without an extra variable to store the name in:

   use AnyEvent;

   $| = 1; print "enter your name> ";

   my $name_ready = AnyEvent->condvar;

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub { $name_ready->send (scalar <STDIN>) }
   );

   # do something else here

   # now wait and fetch the name
   my $name = $name_ready->recv;

   undef $wait_for_input; # watcher no longer needed

   print "your name is $name\n";

You can pass any number of arguments to C<send>, and every call to
C<recv> will return them.
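
As a minimal sketch of passing several values through a condition variable,
the following calls C<send> directly instead of from a watcher callback. It
relies on C<recv> returning immediately when C<send> has already been called,
and on C<recv> returning all values in list context but only the first one in
scalar context. The variable names and values are made up for illustration.

```perl
   use strict;
   use warnings;
   use AnyEvent;

   my $user_ready = AnyEvent->condvar;

   # producer side: send two values at once (normally done in a callback)
   $user_ready->send ("alice", 42);

   # consumer side: list context returns everything that was sent
   my ($name, $age) = $user_ready->recv;

   # a further call returns immediately; scalar context yields the first value
   my $first = $user_ready->recv;

   print "name=$name age=$age first=$first\n";
```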

=head2 The "main loop"

Most event-based frameworks have something called a "main loop" or "event
loop run function" or something similar.

Just like C<recv> in AnyEvent, these functions need to be called
eventually so that your event loop has a chance of actually looking for
those events you are interested in.

For example, in a L<Gtk2> program, the above example could also be written
like this:

   use Gtk2 -init;
   use AnyEvent;

   ############################################
   # create a window and some label

   my $window = new Gtk2::Window "toplevel";
   $window->add (my $label = new Gtk2::Label "soon replaced by name");

   $window->show_all;

   ############################################
   # do our AnyEvent stuff

   $| = 1; print "enter your name> ";

   my $wait_for_input = AnyEvent->io (
      fh => \*STDIN, poll => "r",
      cb => sub {
         # set the label
         $label->set_text (scalar <STDIN>);
         print "enter another name> ";
      }
   );

   ############################################
   # Now enter Gtk2's event loop

   main Gtk2;

No condition variable anywhere in sight - instead, we just read a line
from STDIN and replace the text in the label. In fact, since nobody
C<undef>'s C<$wait_for_input> you can enter multiple lines.

Instead of waiting for a condition variable, the program enters the Gtk2
main loop by calling C<< Gtk2->main >>, which will block the program and
wait for events to arrive.

This also shows that AnyEvent is quite flexible - you didn't have to do
anything to make the AnyEvent watcher use Gtk2 (actually Glib) - it just
worked.

Admittedly, the example is a bit silly - who would want to read names
from standard input in a Gtk+ application? But imagine that instead of
doing that, you would make an HTTP request in the background and display
its results. In fact, with event-based programming you can make many
HTTP requests in parallel in your program and still provide feedback to
the user and stay interactive.

In the next part you will see how to do just that - by implementing an
HTTP request, on our own, with the utility modules AnyEvent comes with.

Before that, however, let's briefly look at how you would write your
program using only AnyEvent, without ever calling some other event
loop's run function.

The example using condition variables already did this by calling C<recv>
on a condition variable, and in fact, this is the solution:

   my $quit_program = AnyEvent->condvar;

   # create AnyEvent watchers (or not) here

   $quit_program->recv;

If any of your watcher callbacks decide to quit, they can simply call
C<< $quit_program->send >>. Of course, they could also decide not to and
simply call C<exit> instead, or they could decide not to quit, ever (e.g.
in a long-running daemon program).

In that case, you can simply use:

   AnyEvent->condvar->recv;

And this is, in fact, closest to the idea of a main loop run function that
AnyEvent offers.

=head2 Timers and other event sources

So far, we have only used I/O watchers. These are useful mainly to find
out whether a socket has data to read, or space to write more data. On sane
operating systems this also works for console windows/terminals (typically
on standard input), serial lines, all sorts of other devices, basically
almost everything that has a file descriptor but isn't a file itself. (As
usual, "sane" excludes Windows - on that platform you would need different
functions for all of these, complicating code immensely - think "socket
only" on Windows).

However, I/O is not everything - the second most important event source is
the clock. For example, when doing an HTTP request you might want to time
out when the server doesn't answer within some predefined amount of time.

In AnyEvent, timer event watchers are created by calling the C<<
AnyEvent->timer >> method:

   use AnyEvent;

   my $cv = AnyEvent->condvar;

   my $wait_one_and_a_half_seconds = AnyEvent->timer (
      after => 1.5,  # after how many seconds to invoke the cb?
      cb    => sub { # the callback to invoke
         $cv->send;
      },
   );

   # can do something else here

   # now wait till our time has come
   $cv->recv;

Unlike I/O watchers, timers are only interested in the number of seconds
they have to wait. When that amount of time has passed, AnyEvent will
invoke your callback.

Unlike I/O watchers, which will call your callback as many times as there
is data available, timers are one-shot: after they have "fired" once and
invoked your callback, they are dead and no longer do anything.

To get a repeating timer, such as a timer firing roughly once per second,
you have to recreate it:

   use AnyEvent;

   my $time_watcher;

   sub once_per_second {
      print "tick\n";
      
      # (re-)create the watcher
      $time_watcher = AnyEvent->timer (
         after => 1,
         cb    => \&once_per_second,
      );
   }

   # now start the timer
   once_per_second;

Having to recreate your timer is a restriction put on AnyEvent that is
present in most event libraries it uses. It is so annoying that some
future version might work around this limitation, but right now, it's the
only way to do repeating timers.
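
The following is only a sketch of how such a re-created timer could be run
to completion: it adds a condition variable so the event loop actually runs,
and stops after three ticks by forgetting the watcher. The names C<$ticks>
and C<$finished>, the limit of three and the 0.1 second delay are
assumptions made for the example.

```perl
   use strict;
   use warnings;
   use AnyEvent;

   my $finished = AnyEvent->condvar;
   my $ticks    = 0;
   my $time_watcher;

   sub tick {
      $ticks++;
      print "tick $ticks\n";

      if ($ticks >= 3) {
         undef $time_watcher; # forget the watcher: no more ticks
         $finished->send;
         return;
      }

      # (re-)create the one-shot timer for the next tick
      $time_watcher = AnyEvent->timer (
         after => 0.1,
         cb    => \&tick,
      );
   }

   tick;            # the first tick happens immediately
   $finished->recv; # run the event loop until the third tick
```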

Fortunately most timers aren't really repeating but specify timeouts of
some sort.
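
A timeout of this kind could look roughly like the sketch below. A second
timer stands in for the slow operation (in a real program this would be an
I/O watcher), and whichever callback runs first decides the outcome. The
delays and the strings sent through the condition variable are invented for
the example.

```perl
   use strict;
   use warnings;
   use AnyEvent;

   my $outcome = AnyEvent->condvar;

   # stand-in for a slow operation: it would finish after 0.5 seconds
   my $slow_operation = AnyEvent->timer (
      after => 0.5,
      cb    => sub { $outcome->send ("finished") },
   );

   # the timeout: give up after 0.1 seconds
   my $timeout = AnyEvent->timer (
      after => 0.1,
      cb    => sub { $outcome->send ("timed out") },
   );

   my $result = $outcome->recv; # whichever timer fires first wins

   # forget both watchers so the loser can never fire
   undef $slow_operation;
   undef $timeout;

   print "result: $result\n";
```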

=head3 More esoteric sources

AnyEvent also has some other, more esoteric event sources you can tap
into: signal and child watchers.

Signal watchers can be used to wait for "signal events", which simply
means your process was sent a signal (such as C<SIGTERM> or C<SIGUSR1>).

Child watchers wait for a child process to exit. They are useful when
you fork a separate process and need to know when it exits, but do not
want to wait for that by blocking.

Both watcher types are described in detail in the main L<AnyEvent> manual
page.
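
As a rough illustration, a signal watcher might be used like this on a
POSIX system. The process sends C<SIGUSR1> to itself, so the choice of
signal, the extra safety timer and the strings passed to C<send> are
assumptions of this sketch. Child watchers are created in a similar way
with C<< AnyEvent->child >>.

```perl
   use strict;
   use warnings;
   use AnyEvent;

   my $got_signal = AnyEvent->condvar;

   # signal names are given without the "SIG" prefix
   my $usr1_watcher = AnyEvent->signal (
      signal => "USR1",
      cb     => sub { $got_signal->send ("USR1") },
   );

   # safety net so the sketch cannot hang if the signal gets lost
   my $give_up = AnyEvent->timer (
      after => 5,
      cb    => sub { $got_signal->send ("timeout") },
   );

   kill USR1 => $$; # send the signal to ourselves

   my $seen = $got_signal->recv;
   print "received: $seen\n";
```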


=head1 Network programming and AnyEvent

AnyEvent is not just a simple abstraction anymore. While the core
L<AnyEvent> module is still small and self-contained, the distribution
comes with some very useful utility modules such as L<AnyEvent::Handle>,
L<AnyEvent::DNS> and L<AnyEvent::Socket>. These can make your life as
a non-blocking network programmer a lot easier.

Here is an introduction to these three submodules:

=head2 L<AnyEvent::Handle>

This module handles non-blocking IO on file handles in an event based
manner. It provides a wrapper object around your file handle that provides
queueing and buffering of incoming and outgoing data for you.

More about this later.

=head2 L<AnyEvent::Socket>

This module provides you with functions that handle socket creation
and IP address magic. The two main functions are C<tcp_connect> and
C<tcp_server>. The former will connect a (streaming) socket to an internet
host for you and the latter will make a server socket for you, to accept
connections.

This module also comes with transparent IPv6 support. This means that if
you write your programs with this module, you will be IPv6 ready without
doing anything further.

It also works around a lot of portability quirks (especially on the
Windows platform), which makes it even easier to write your programs in a
portable way.
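
To give a first impression of the two main functions, here is a small
sketch in which one process acts as both server and client on the loopback
interface. The address, the use of port 0 (let the operating system pick a
free port), the greeting text and all variable names are choices made for
this example, and it assumes that loopback connections are permitted.

```perl
   use strict;
   use warnings;
   use AnyEvent;
   use AnyEvent::Socket;

   my $done = AnyEvent->condvar;
   my $port;

   # server: listen on the loopback address, port 0 = any free port
   my $server = tcp_server "127.0.0.1", 0, sub {
      my ($fh, $peer_host, $peer_port) = @_;
      syswrite $fh, "hello\n"; # greet the client, $fh is closed afterwards
   }, sub {
      my ($fh, $this_host, $this_port) = @_;
      $port = $this_port;      # remember which port was chosen
      0                        # 0 = default listen queue length
   };

   my $read_watcher;

   # client: connect, then wait until the greeting can be read
   my $client = tcp_connect "127.0.0.1", $port, sub {
      my ($fh) = @_
         or return $done->send ("connect failed: $!");

      $read_watcher = AnyEvent->io (fh => $fh, poll => "r", cb => sub {
         sysread $fh, my $buf, 1024;
         undef $read_watcher;
         $done->send ($buf);
      });
   };

   my $greeting = $done->recv;
   print "server said: $greeting";
```

Both functions return guard objects, which the sketch keeps in C<$server>
and C<$client> until the program ends.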

=head2 L<AnyEvent::DNS>

This module allows fully asynchronous DNS resolution. It is used mainly
by L<AnyEvent::Socket> to resolve hostnames and service ports, but is a
great way to do other DNS resolution tasks, such as reverse lookups of IP
addresses for log files.

=head2 First experiments with AnyEvent::Handle

Now let's start with something simple: a program that reads from standard
input in a non-blocking way, that is, in a way that lets your program do
other things while it is waiting for input.

First, the full program listing:

   #!/usr/bin/perl

   use AnyEvent;
   use AnyEvent::Handle;

   my $end_prog = AnyEvent->condvar;

   my $handle =
      AnyEvent::Handle->new (
         fh => \*STDIN,
         on_eof => sub {
            print "received EOF, exiting...\n";
            $end_prog->send;
         },
         on_error => sub {
            print "error while reading from STDIN: $!\n";
            $end_prog->send;
         }
      );

   $handle->push_read (sub {
      my ($handle) = @_;

      if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) {
         print "got 'end', exiting...\n";
         $end_prog->send;
         return 1
      }

      0
   });

   $end_prog->recv;

That's a mouthful, so let's go through it step by step:

   #!/usr/bin/perl

   use AnyEvent;
   use AnyEvent::Handle;

Nothing unexpected here, just load AnyEvent for the event functionality
and AnyEvent::Handle for your file handling needs.

   my $end_prog = AnyEvent->condvar;

Here the program creates a so-called 'condition variable': Condition
variables are a great way to signal the completion of some event, or to
state that some condition became true (thus the name).

This condition variable represents the condition that the program wants to
terminate. Later in the program, we will 'recv' that condition (call the
C<recv> method on it), which will wait until the condition gets signalled
(which is done by calling the C<send> method on it).

The next step is to create the handle object:

   my $handle =
      AnyEvent::Handle->new (
         fh     => \*STDIN,
         on_eof => sub {
            print "received EOF, exiting...\n";
            $end_prog->send;
         },

This handle object will read from standard input. Setting the C<on_eof>
callback should be done for every file handle, as that is a condition that
we always need to check for when working with file handles, to prevent
reading or writing to a closed file handle, or getting stuck indefinitely
in case of an error.

Speaking of errors:

         on_error => sub {
            print "error while reading from STDIN: $!\n";
            $end_prog->send;
         }
      );

The C<on_error> callback is also not required, but we set it here in case
any error happens when we read from the file handle. It is usually a good
idea to set this callback and at least print some diagnostic message: Even
in our small example an error can happen. More on this later...

   $handle->push_read (sub {

Next we push a general read callback on the read queue, which
will wait until we have received all the data we wanted to
receive. L<AnyEvent::Handle> has two queues per file handle, a read and a
write queue. The write queue queues pending data that waits to be written
to the file handle. And the read queue queues reading callbacks. For more
details see the L<AnyEvent::Handle> documentation about the READ QUEUE and
WRITE QUEUE.

      my ($handle) = @_;

      if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) {
         print "got 'end', exiting...\n";
         $end_prog->send;
         return 1
      }

      0
   });

The actual callback waits until the word 'end' has been seen in the data
received on standard input. Once we encounter the stop word 'end' we
remove everything from the read buffer and signal the condition variable
we set up earlier, which represents our 'end of program' condition. The
callback then returns a true value, which signals that we are done reading
all the data we were interested in (all data until the word 'end' has been
seen).

In all other cases, when the stop word has not been seen yet, we just
return a false value, to indicate that we are not finished yet.

The C<rbuf> method returns our read buffer, which we can directly modify as
an lvalue. Alternatively, we could also have written:

      if ($handle->{rbuf} =~ s/^.*?\bend\b.*$//s) {

The last line will wait for the condition that our program wants to exit:

   $end_prog->recv;

The call to C<recv> will set up an event loop for us and wait for IO, timer
or signal events and will handle them until the condition gets sent (by
calling its C<send> method).

The key points to learn from this example are:

=over 4

=item * Condition variables are used to start an event loop.

=item * How to register some basic callbacks on AnyEvent::Handle objects.

=item * How to process data in the read buffer.

=back
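
These points can also be tried without typing anything. The following
sketch feeds the handle from a pipe instead of standard input and sends the
text seen before the stop word through the condition variable. The pipe,
its contents and the capture group in the regex are assumptions of this
sketch, not part of the program above.

```perl
   use strict;
   use warnings;
   use AnyEvent;
   use AnyEvent::Handle;

   # a pipe stands in for standard input so that no typing is needed
   pipe my $reader, my $writer or die "pipe failed: $!";
   syswrite $writer, "first line\nsecond line\nend\n";
   close $writer;

   my $end_prog = AnyEvent->condvar;

   my $handle = AnyEvent::Handle->new (
      fh       => $reader,
      on_eof   => sub { $end_prog->send ("eof") },
      on_error => sub { $end_prog->send ("error: $!") },
   );

   $handle->push_read (sub {
      my ($handle) = @_;

      # empty the buffer once the stop word shows up, keeping what came before
      if ($handle->{rbuf} =~ s/^(.*?)\bend\b.*$//s) {
         $end_prog->send ($1);
         return 1; # done, remove this callback from the read queue
      }

      0 # not done yet, call again when more data arrives
   });

   my $before_end = $end_prog->recv;
   print "text before 'end':\n$before_end";
```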

Revision:	1.2
Committed:	Sat May 31 00:35:10 2008 UTC (16 years ago) by root
Branch:	MAIN
Changes since 1.1:	+374 -35 lines