[ViewVC] Diff of: cvs/AnyEvent/lib/AnyEvent/Handle.pm

Comparing AnyEvent/lib/AnyEvent/Handle.pm (file contents):
Revision 1.205 by root, Mon Nov 15 17:11:00 2010 UTC vs.
Revision 1.220 by root, Sun Jul 24 13:10:43 2011 UTC

 =over 4
 =item on_prepare => $cb->($handle)
 This (rarely used) callback is called before a new connection is
-attempted, but after the file handle has been created. It could be used to
+attempted, but after the file handle has been created (you can access that
+file handle via C<< $handle->{fh} >>). It could be used to prepare the
-prepare the file handle with parameters required for the actual connect
+file handle with parameters required for the actual connect (as opposed to
-(as opposed to settings that can be changed when the connection is already
+settings that can be changed when the connection is already established).
-established).
 The return value of this callback should be the connect timeout value in
 seconds (or C<0>, or C<undef>, or the empty list, to indicate that the
 default timeout is to be used).
 many seconds pass without a successful read or write on the underlying
 file handle (or a call to C<timeout_reset>), the C<on_timeout> callback
 will be invoked (and if that one is missing, a non-fatal C<ETIMEDOUT>
 error will be raised).
-There are three variants of the timeouts that work independently
+There are three variants of the timeouts that work independently of each
-of each other, for both read and write, just read, and just write:
+other, for both read and write (triggered when nothing was read I<OR>
+written), just read (triggered when nothing was read), and just write:
 C<timeout>, C<rtimeout> and C<wtimeout>, with corresponding callbacks
 C<on_timeout>, C<on_rtimeout> and C<on_wtimeout>, and reset functions
 C<timeout_reset>, C<rtimeout_reset>, and C<wtimeout_reset>.
-Note that timeout processing is active even when you do not have
+Note that timeout processing is active even when you do not have any
-any outstanding read or write requests: If you plan to keep the connection
+outstanding read or write requests: If you plan to keep the connection
-idle then you should disable the timeout temporarily or ignore the timeout
+idle then you should disable the timeout temporarily or ignore the
-in the C<on_timeout> callback, in which case AnyEvent::Handle will simply
+timeout in the corresponding C<on_timeout> callback, in which case
-restart the timeout.
+AnyEvent::Handle will simply restart the timeout.
-Zero (the default) disables this timeout.
+Zero (the default) disables the corresponding timeout.
 =item on_timeout => $cb->($handle)
+=item on_rtimeout => $cb->($handle)
+=item on_wtimeout => $cb->($handle)
 Called whenever the inactivity timeout passes. If you return from this
 callback, then the timeout will be reset as if some activity had happened,
 so this condition is not fatal in any way.
 For example, a server accepting connections from untrusted sources should
 be configured to accept only so-and-so much data that it cannot act on
 (for example, when expecting a line, an attacker could send an unlimited
 amount of data without a callback ever being called as long as the line
 isn't finished).
+=item wbuf_max => <bytes>
+If defined, then a fatal error will be raised (with C<$!> set to C<ENOSPC>)
+when the write buffer ever (strictly) exceeds this size. This is useful to
+avoid some forms of denial-of-service attacks.
+Although the unit of this parameter is bytes, this is the I<raw> number
+of bytes not yet accepted by the kernel. This can make a difference when
+you e.g. use TLS, as TLS typically makes your write data larger (but it
+can also make it smaller due to compression).
+As an example of when this limit is useful, take a chat server that sends
+chat messages to a client. If the client does not read those in a timely
+manner then the send buffer in the server would grow unbounded.
 =item autocork => <boolean>
 When disabled (the default), C<push_write> will try to immediately
 write the data to the handle if possible. This avoids having to register
 Use the C<< ->starttls >> method if you need to start TLS negotiation later.
 =item tls_ctx => $anyevent_tls
 Use the given C<AnyEvent::TLS> object to create the new TLS connection
-(unless a connection object was specified directly). If this parameter is
+(unless a connection object was specified directly). If this
-missing, then AnyEvent::Handle will use C<AnyEvent::Handle::TLS_CTX>.
+parameter is missing (or C<undef>), then AnyEvent::Handle will use
+C<AnyEvent::Handle::TLS_CTX>.
 Instead of an object, you can also specify a hash reference with C<< key
 => value >> pairs. Those will be passed to L<AnyEvent::TLS> to create a
 new TLS context object.
                $self->{connect}[0],
                $self->{connect}[1],
                sub {
                   my ($fh, $host, $port, $retry) = @_;
-                  delete $self->{_connect};
+                  delete $self->{_connect}; # no longer needed
                   if ($fh) {
                      $self->{fh} = $fh;
                      delete $self->{_skip_drain_rbuf};
                             });
                   } else {
                      if ($self->{on_connect_error}) {
                         $self->{on_connect_error}($self, "$!");
-                        $self->destroy;
+                        $self->destroy if $self;
                      } else {
                         $self->_error ($!, 1);
                      }
                   }
                },
                sub {
                   local $self->{fh} = $_[0];
                   $self->{on_prepare}
-                     ?  $self->{on_prepare}->($self)
+                     ? $self->{on_prepare}->($self)
                      : ()
                }
             );
       }
 =item $handle->rbuf_max ($max_octets)
 Configures the C<rbuf_max> setting (C<undef> disables it).
+=item $handle->wbuf_max ($max_octets)
+Configures the C<wbuf_max> setting (C<undef> disables it).
 =cut
 sub rbuf_max {
    $_[0]{rbuf_max} = $_[1];
 }
+sub wbuf_max {
+   $_[0]{wbuf_max} = $_[1];
+}
 #############################################################################
 =item $handle->timeout ($seconds)
 =item $handle->rtimeout ($seconds)
 =item $handle->wtimeout ($seconds)
 Configures (or disables) the inactivity timeout.
+The timeout will be checked instantly, so this method might destroy the
+handle before it returns.
 =item $handle->timeout_reset
 =item $handle->rtimeout_reset
       if $cb && $self->{low_water_mark} >= (length $self->{wbuf}) + (length $self->{_tls_wbuf});
 }
 =item $handle->push_write ($data)
-Queues the given scalar to be written. You can push as much data as you
+Queues the given scalar to be written. You can push as much data as
-want (only limited by the available memory), as C<AnyEvent::Handle>
+you want (only limited by the available memory and C<wbuf_max>), as
-buffers it independently of the kernel.
+C<AnyEvent::Handle> buffers it independently of the kernel.
 This method may invoke callbacks (and therefore the handle might be
 destroyed after it returns).
 =cut
       $cb->() unless $self->{autocork};
       # if still data left in wbuf, we need to poll
       $self->{_ww} = AE::io $self->{fh}, 1, $cb
          if length $self->{wbuf};
+      if (
+         defined $self->{wbuf_max}
+         && $self->{wbuf_max} < length $self->{wbuf}
+      ) {
+         $self->_error (Errno::ENOSPC, 1), return;
+      }
    };
 }
 our %WH;
 before it was actually written. One way to do that is to replace your
 C<on_drain> handler by a callback that shuts down the socket (and set
 C<low_water_mark> to C<0>). This method is a shorthand for just that, and
 replaces the C<on_drain> callback with:
-   sub { shutdown $_[0]{fh}, 1 }    # for push_shutdown
+   sub { shutdown $_[0]{fh}, 1 }
 This simply shuts down the write side and signals an EOF condition to the
 peer.
 You can rely on the normal read queue and C<on_eof> handling
    sub {
       # accept
       if ($$rbuf =~ $accept) {
          $data .= substr $$rbuf, 0, $+[0], "";
-         $cb->($self, $data);
+         $cb->($_[0], $data);
          return 1;
       }
       # reject
       if ($reject && $$rbuf =~ $reject) {
-         $self->_error (Errno::EBADMSG);
+         $_[0]->_error (Errno::EBADMSG);
       }
       # skip
       if ($skip && $$rbuf =~ $skip) {
          $data .= substr $$rbuf, 0, $+[0], "";
    my ($self, $cb) = @_;
    sub {
       unless ($_[0]{rbuf} =~ s/^(0|[1-9][0-9]*)://) {
          if ($_[0]{rbuf} =~ /[^0-9]/) {
-            $self->_error (Errno::EBADMSG);
+            $_[0]->_error (Errno::EBADMSG);
          }
          return;
       }
       my $len = $1;
-      $self->unshift_read (chunk => $len, sub {
+      $_[0]->unshift_read (chunk => $len, sub {
          my $string = $_[1];
          $_[0]->unshift_read (chunk => 1, sub {
             if ($_[1] eq ",") {
                $cb->($_[0], $string);
             } else {
-               $self->_error (Errno::EBADMSG);
+               $_[0]->_error (Errno::EBADMSG);
             }
          });
       });
       1
    my $data;
    my $rbuf = \$self->{rbuf};
    sub {
-      my $ref = eval { $json->incr_parse ($self->{rbuf}) };
+      my $ref = eval { $json->incr_parse ($_[0]{rbuf}) };
       if ($ref) {
-         $self->{rbuf} = $json->incr_text;
+         $_[0]{rbuf} = $json->incr_text;
          $json->incr_text = "";
-         $cb->($self, $ref);
+         $cb->($_[0], $ref);
          1
       } elsif ($@) {
          # error case
          $json->incr_skip;
-         $self->{rbuf} = $json->incr_text;
+         $_[0]{rbuf} = $json->incr_text;
          $json->incr_text = "";
-         $self->_error (Errno::EBADMSG);
+         $_[0]->_error (Errno::EBADMSG);
          ()
       } else {
-         $self->{rbuf} = "";
+         $_[0]{rbuf} = "";
          ()
       }
    }
 };
          # read remaining chunk
          $_[0]->unshift_read (chunk => $len, sub {
             if (my $ref = eval { Storable::thaw ($_[1]) }) {
                $cb->($_[0], $ref);
             } else {
-               $self->_error (Errno::EBADMSG);
+               $_[0]->_error (Errno::EBADMSG);
             }
          });
       }
       1
 Note that AnyEvent::Handle will automatically C<start_read> for you when
 you change the C<on_read> callback or push/unshift a read callback, and it
 will automatically C<stop_read> for you when neither C<on_read> is set nor
 there are any read requests in the queue.
-These methods will have no effect when in TLS mode (as TLS doesn't support
+In older versions of this module (<= 5.3), these methods had no effect,
-half-duplex connections).
+as TLS does not support half-duplex connections. In current versions they
+work as expected, as this behaviour is required to avoid certain resource
+attacks, where the program would be forced to read (and buffer) arbitrary
+amounts of data before being able to send some data. The drawback is that
+some readings of the SSL/TLS specifications basically require this
+attack to be working, as SSL/TLS implementations might stall sending data
+during a rehandshake.
+As a guideline, during the initial handshake, you should not stop reading,
+and as a client, it might cause problems, depending on your application.
 =cut
 sub stop_read {
    my ($self) = @_;
-   delete $self->{_rw} unless $self->{tls};
+   delete $self->{_rw};
 }
 sub start_read {
    my ($self) = @_;
    Net::SSLeay::CTX_set_mode ($tls, 1|2);
    $self->{_rbio} = Net::SSLeay::BIO_new (Net::SSLeay::BIO_s_mem ());
    $self->{_wbio} = Net::SSLeay::BIO_new (Net::SSLeay::BIO_s_mem ());
-   Net::SSLeay::BIO_write ($self->{_rbio}, delete $self->{rbuf});
+   Net::SSLeay::BIO_write ($self->{_rbio}, $self->{rbuf});
+   $self->{rbuf} = "";
    Net::SSLeay::set_bio ($tls, $self->{_rbio}, $self->{_wbio});
    $self->{_on_starttls} = sub { $_[0]{on_starttls}(@_) }
       if $self->{on_starttls};
    $self->{tls_ctx}->_put_session (delete $self->{tls})
       if $self->{tls} > 0;
    delete @$self{qw(_rbio _wbio _tls_wbuf _on_starttls)};
 }
+=item $handle->resettls
+This rarely-used method simply resets any TLS state on the handle, usually
+causing data loss.
+One case where it may be useful is when you want to skip over the data in
+the stream but you are not interested in interpreting it, so data loss is
+no concern.
+=cut
+*resettls = \&_freetls;
 sub DESTROY {
    my ($self) = @_;
    &_freetls;
 It is only safe to "forget" the reference inside EOF or error callbacks;
 from within all other callbacks, you need to explicitly call the C<<
 ->destroy >> method.
+=item Why is my C<on_eof> callback never called?
+Probably because your C<on_error> callback is being called instead: When
+you have outstanding requests in your read queue, then an EOF is
+considered an error as you clearly expected some data.
+To avoid this, make sure you have an empty read queue whenever your handle
+is supposed to be "idle" (i.e. connection closes are O.K.). You can set
+an C<on_read> handler that simply pushes the first read request onto the
+queue.
+See also the next question, which explains this in a bit more detail.
+=item How can I serve requests in a loop?
+Most protocols consist of some setup phase (authentication for example)
+followed by a request handling phase, where the server waits for requests
+and handles them, in a loop.
+There are two important variants: The first (traditional, better) variant
+handles requests until the server gets some QUIT command, causing it to
+close the connection first (highly desirable for a busy TCP server). A
+client dropping the connection is an error, which means this variant can
+detect an unexpected connection close.
+To handle this case, always make sure you have a non-empty read queue, by
+pushing the "read request start" handler on it:
+   # we assume a request starts with a single line
+   my @start_request; @start_request = (line => sub {
+      my ($hdl, $line) = @_;
+      # ... handle request
+      # push next request read, possibly from a nested callback
+      $hdl->push_read (@start_request);
+   });
+   # auth done, now go into request handling loop
+   # now push the first @start_request
+   $hdl->push_read (@start_request);
+By always having an outstanding C<push_read>, the handle always expects
+some data and raises the C<EPIPE> error when the connection is dropped
+unexpectedly.
+The second variant is a protocol where the client can drop the connection
+at any time. For TCP, this means that the server machine may run out of
+sockets more easily, and in general, it means you cannot distinguish a protocol
+failure/client crash from a normal connection close. Nevertheless, these
+kinds of protocols are common (and sometimes even the best solution to the
+problem).
+Having an outstanding read request at all times is possible if you ignore
+C<EPIPE> errors, but this doesn't help when the client drops the
+connection during a request, which would still be an error.
+A better solution is to push the initial request read in an C<on_read>
+callback. This avoids an error, as when the server doesn't expect data
+(i.e. is idly waiting for the next request), an EOF will not raise an
+error, but simply result in an C<on_eof> callback. It is also a bit slower
+and simpler:
+   # auth done, now go into request handling loop
+   $hdl->on_read (sub {
+      my ($hdl) = @_;
+      # called each time we receive data but the read queue is empty
+      # simply start reading the request
+      $hdl->push_read (line => sub {
+         my ($hdl, $line) = @_;
+         # ... handle request
+         # do nothing special when the request has been handled, just
+         # let the request queue go empty.
+      });
+   });
 =item I get different callback invocations in TLS mode/Why can't I pause
 reading?
 Unlike, say, TCP, TLS connections do not consist of two independent
 communication channels, one for each direction. Or put differently, the
    $handle->on_read (sub { });
    $handle->on_eof (undef);
    $handle->on_error (sub {
       my $data = delete $_[0]{rbuf};
    });
+Note that this example removes the C<rbuf> member from the handle object,
+which is not normally allowed by the API. It is expressly permitted in
+this case only, as the handle object needs to be destroyed afterwards.
 The reason to use C<on_error> is that TCP connections, due to latencies
 and packet loss, might get closed quite violently with an error, when in
 fact all data has been received.
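
As an illustration of the wbuf_max check that this diff adds to push_write: the comparison is strict, so a write buffer of exactly wbuf_max bytes is still accepted, and an undefined limit disables the check. The following is a minimal stand-alone sketch of that condition only. The name wbuf_exceeded is a hypothetical helper, not part of AnyEvent::Handle, and the real module raises ENOSPC through its internal _error method rather than returning a flag.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of AnyEvent::Handle). It mirrors the
# condition the diff adds to push_write:
#    defined wbuf_max && wbuf_max < length wbuf
sub wbuf_exceeded {
   my ($wbuf_max, $wbuf) = @_;

   # an undefined limit disables the check entirely
   return 0 unless defined $wbuf_max;

   # strict comparison: exactly $wbuf_max bytes is still acceptable
   return $wbuf_max < length ($wbuf) ? 1 : 0;
}

# try a buffer at the limit, one byte over it, and with no limit set
for my $case ([4, "abcd"], [4, "abcde"], [undef, "x" x 100]) {
   my ($max, $buf) = @$case;
   my $verdict = wbuf_exceeded ($max, $buf) ? "ENOSPC" : "ok";
   printf "limit=%s len=%d: %s\n",
      (defined $max ? $max : "undef"), length ($buf), $verdict;
}
```

In the module itself the check runs after the write attempt, so only data that the kernel has not yet accepted counts against the limit.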

Diff Legend

– Removed lines
+ Added lines
< Changed lines
> Changed lines
