[ViewVC] Diff of: cvs/AnyEvent/lib/AnyEvent/Handle.pm

Comparing AnyEvent/lib/AnyEvent/Handle.pm (file contents):
Revision 1.201 by root, Wed Oct 13 01:15:57 2010 UTC vs.
Revision 1.212 by root, Fri Dec 31 04:50:44 2010 UTC

    }
    \&$func
 }
+sub MAX_READ_SIZE() { 131072 }
 =head1 METHODS
 =over 4
 =item $handle = B<new> AnyEvent::Handle fh => $filehandle, key => value...
 =over 4
 =item on_prepare => $cb->($handle)
 This (rarely used) callback is called before a new connection is
-attempted, but after the file handle has been created. It could be used to
+attempted, but after the file handle has been created (you can access that
+file handle via C<< $handle->{fh} >>). It could be used to prepare the
-prepare the file handle with parameters required for the actual connect
+file handle with parameters required for the actual connect (as opposed to
-(as opposed to settings that can be changed when the connection is already
+settings that can be changed when the connection is already established).
-established).
 The return value of this callback should be the connect timeout value in
 seconds (or C<0>, or C<undef>, or the empty list, to indicate that the
 default timeout is to be used).
 Some errors are fatal (which is indicated by C<$fatal> being true). On
 fatal errors the handle object will be destroyed (by a call to C<< ->
 destroy >>) after invoking the error callback (which means you are free to
 examine the handle object). Examples of fatal errors are an EOF condition
-with active (but unsatisifable) read watchers (C<EPIPE>) or I/O errors. In
+with active (but unsatisfiable) read watchers (C<EPIPE>) or I/O errors. In
 cases where the other side can close the connection at will, it is
 often easiest to not report C<EPIPE> errors in this callback.
 AnyEvent::Handle tries to find an appropriate error code for you to check
 against, but in some cases (TLS errors), this does not work well. It is
 For example, a server accepting connections from untrusted sources should
 be configured to accept only so-and-so much data that it cannot act on
 (for example, when expecting a line, an attacker could send an unlimited
 amount of data without a callback ever being called as long as the line
 isn't finished).
+=item wbuf_max => <bytes>
+If defined, then a fatal error will be raised (with C<$!> set to C<ENOSPC>)
+when the write buffer ever (strictly) exceeds this size. This is useful to
+avoid some forms of denial-of-service attacks.
+Although the unit of this parameter is bytes, this is the I<raw> number
+of bytes not yet accepted by the kernel. This can make a difference when
+you e.g. use TLS, as TLS typically makes your write data larger (but it
+can also make it smaller due to compression).
+As an example of when this limit is useful, take a chat server that sends
+chat messages to a client. If the client does not read those in a timely
+manner then the send buffer in the server would grow unbounded.
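The strict "exceeds" rule for C<wbuf_max> can be illustrated with a small stand-alone sketch. The helper name C<wbuf_overflow> is hypothetical and not part of AnyEvent::Handle; it only mirrors the comparison shown in the diff (C<< wbuf_max < length wbuf >>) and ignores TLS buffering.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::Handle): returns 1 when a
# write buffer strictly exceeds the configured limit, 0 otherwise.
# An undefined limit means "no limit", as in the documentation above.
sub wbuf_overflow {
   my ($wbuf_max, $wbuf) = @_;

   return (defined ($wbuf_max) && length ($wbuf) > $wbuf_max) ? 1 : 0;
}

print wbuf_overflow (undef, "x" x 100), "\n"; # no limit configured: 0
print wbuf_overflow (10,    "x" x 10 ), "\n"; # exactly at the limit: 0
print wbuf_overflow (10,    "x" x 11 ), "\n"; # one byte over: 1
```

A buffer that is exactly C<wbuf_max> bytes long is still acceptable in this sketch; only growing past the limit counts as an overflow.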
 =item autocork => <boolean>
 When disabled (the default), C<push_write> will try to immediately
 write the data to the handle if possible. This avoids having to register
 already have occurred on BSD systems), but at least it will protect you
 from most attacks.
 =item read_size => <bytes>
-The default read block size (the number of bytes this module will
+The initial read block size, the number of bytes this module will try to
-try to read during each loop iteration, which affects memory
+read during each loop iteration. Each handle object will consume at least
-requirements). Default: C<8192>.
+this amount of memory for the read buffer as well, so when handling many
+connections, watch out for memory requirements. See also C<max_read_size>. Default: C<2048>.
+=item max_read_size => <bytes>
+The maximum read buffer size used by the dynamic adjustment
+algorithm: Each time AnyEvent::Handle can read C<read_size> bytes in
+one go it will double C<read_size> up to the maximum given by this
+option. Default: C<131072> or C<read_size>, whichever is higher.
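The dynamic adjustment described for C<read_size> and C<max_read_size> can be sketched in isolation. The function C<next_read_size> below is a hypothetical helper written for illustration; it follows the doubling-and-clamping logic visible in the diff but is not the module's own code.

```perl
use strict;
use warnings;

use constant MAX_READ_SIZE => 131072;

# Hypothetical helper (not part of AnyEvent::Handle): given the current
# read_size and the number of bytes the last read returned, compute the
# read_size to use for the next read.
sub next_read_size {
   my ($read_size, $len, $max_read_size) = @_;

   my $max = $max_read_size || MAX_READ_SIZE;

   # a short read means the current size was large enough: keep it
   return $read_size unless $len == $read_size;

   # a full read doubles the size, clamped to the maximum
   my $next = $read_size * 2;
   return $next > $max ? $max : $next;
}

# pretend every read fills the buffer completely
my $size = 2048;
my @sizes;
for (1 .. 8) {
   push @sizes, $size;
   $size = next_read_size ($size, $size);
}

print "@sizes\n"; # 2048 4096 8192 16384 32768 65536 131072 131072
```

Under these assumptions, a busy connection reaches the 128 KiB ceiling after six full reads, while an idle connection keeps using the small initial buffer.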
 =item low_water_mark => <bytes>
 Sets the number of bytes (default: C<0>) that make up an "empty" write
 buffer: If the buffer reaches this size or gets even smaller it is
 Use the C<< ->starttls >> method if you need to start TLS negotiation later.
 =item tls_ctx => $anyevent_tls
 Use the given C<AnyEvent::TLS> object to create the new TLS connection
-(unless a connection object was specified directly). If this parameter is
+(unless a connection object was specified directly). If this
-missing, then AnyEvent::Handle will use C<AnyEvent::Handle::TLS_CTX>.
+parameter is missing (or C<undef>), then AnyEvent::Handle will use
+C<AnyEvent::Handle::TLS_CTX>.
 Instead of an object, you can also specify a hash reference with C<< key
 => value >> pairs. Those will be passed to L<AnyEvent::TLS> to create a
 new TLS context object.
             AnyEvent::Socket::tcp_connect (
                $self->{connect}[0],
                $self->{connect}[1],
                sub {
                   my ($fh, $host, $port, $retry) = @_;
+                  delete $self->{_connect}; # no longer needed
                   if ($fh) {
                      $self->{fh} = $fh;
                      delete $self->{_skip_drain_rbuf};
                },
                sub {
                   local $self->{fh} = $_[0];
                   $self->{on_prepare}
-                     ?  $self->{on_prepare}->($self)
+                     ? $self->{on_prepare}->($self)
                      : ()
                }
             );
       }
    AnyEvent::Util::fh_nonblocking $self->{fh}, 1;
    $self->{_activity}  =
    $self->{_ractivity} =
    $self->{_wactivity} = AE::now;
+   $self->{read_size} ||= 2048;
+   $self->{max_read_size} = $self->{read_size}
+      if $self->{read_size} > ($self->{max_read_size} || MAX_READ_SIZE);
    $self->timeout   (delete $self->{timeout}  ) if $self->{timeout};
    $self->rtimeout  (delete $self->{rtimeout} ) if $self->{rtimeout};
    $self->wtimeout  (delete $self->{wtimeout} ) if $self->{wtimeout};
 =item $handle->rbuf_max ($max_octets)
 Configures the C<rbuf_max> setting (C<undef> disables it).
+=item $handle->wbuf_max ($max_octets)
+Configures the C<wbuf_max> setting (C<undef> disables it).
 =cut
 sub rbuf_max {
    $_[0]{rbuf_max} = $_[1];
+}
+sub wbuf_max {
+   $_[0]{wbuf_max} = $_[1];
 }
 #############################################################################
 =item $handle->timeout ($seconds)
       if $cb && $self->{low_water_mark} >= (length $self->{wbuf}) + (length $self->{_tls_wbuf});
 }
 =item $handle->push_write ($data)
-Queues the given scalar to be written. You can push as much data as you
+Queues the given scalar to be written. You can push as much data as
-want (only limited by the available memory), as C<AnyEvent::Handle>
+you want (only limited by the available memory and C<wbuf_max>), as
-buffers it independently of the kernel.
+C<AnyEvent::Handle> buffers it independently of the kernel.
 This method may invoke callbacks (and therefore the handle might be
 destroyed after it returns).
 =cut
       $cb->() unless $self->{autocork};
       # if still data left in wbuf, we need to poll
       $self->{_ww} = AE::io $self->{fh}, 1, $cb
          if length $self->{wbuf};
+      if (
+         defined $self->{wbuf_max}
+         && $self->{wbuf_max} < length $self->{wbuf}
+      ) {
+         $self->_error (Errno::ENOSPC, 1), return;
+      }
    };
 }
 our %WH;
    unless ($self->{_rw} || $self->{_eof} || !$self->{fh}) {
       Scalar::Util::weaken $self;
       $self->{_rw} = AE::io $self->{fh}, 0, sub {
          my $rbuf = \($self->{tls} ? my $buf : $self->{rbuf});
-         my $len = sysread $self->{fh}, $$rbuf, $self->{read_size} || 8192, length $$rbuf;
+         my $len = sysread $self->{fh}, $$rbuf, $self->{read_size}, length $$rbuf;
          if ($len > 0) {
             $self->{_activity} = $self->{_ractivity} = AE::now;
             if ($self->{tls}) {
                Net::SSLeay::BIO_write ($self->{_rbio}, $$rbuf);
                &_dotls ($self);
             } else {
                $self->_drain_rbuf;
+            }
+            if ($len == $self->{read_size}) {
+               $self->{read_size} *= 2;
+               $self->{read_size} = $self->{max_read_size} || MAX_READ_SIZE
+                  if $self->{read_size} > ($self->{max_read_size} || MAX_READ_SIZE);
             }
          } elsif (defined $len) {
             delete $self->{_rw};
             $self->{_eof} = 1;
       push @linger, AE::io $fh, 1, sub {
          my $len = syswrite $fh, $wbuf, length $wbuf;
          if ($len > 0) {
             substr $wbuf, 0, $len, "";
-         } else {
+         } elsif (defined $len || ($! != EAGAIN && $! != EINTR && $! != WSAEWOULDBLOCK)) {
             @linger = (); # end
          }
       };
       push @linger, AE::timer $linger, 0, sub {
          @linger = ();
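The changed C<elsif> condition in the linger writer distinguishes transient write failures from final ones. A minimal stand-alone sketch of that decision follows; the helper name C<linger_should_stop> is hypothetical, and the Windows-specific C<WSAEWOULDBLOCK> check from the diff is left out so that the sketch needs only the core C<Errno> module.

```perl
use strict;
use warnings;

use Errno qw(EAGAIN EINTR);

# Hypothetical helper (not part of AnyEvent::Handle): decide whether the
# linger writer should give up after a syswrite call.
#   $len   - return value of syswrite (undef on error)
#   $errno - numeric value of $! after the call
sub linger_should_stop {
   my ($len, $errno) = @_;

   return 0 if defined $len && $len > 0; # progress was made, keep going
   return 1 if defined $len;             # zero bytes written: stop

   # undefined length means an error: retry only on transient ones
   return ($errno == EAGAIN || $errno == EINTR) ? 0 : 1;
}

print linger_should_stop (10,    0     ), "\n"; # 0
print linger_should_stop (undef, EAGAIN), "\n"; # 0
print linger_should_stop (undef, EINTR ), "\n"; # 0
print linger_should_stop (0,     0     ), "\n"; # 1
```

Before this change, any non-positive result ended lingering, so a spurious C<EAGAIN> or C<EINTR> could discard data that was still waiting to be written.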
 It is only safe to "forget" the reference inside EOF or error callbacks,
 from within all other callbacks, you need to explicitly call the C<<
 ->destroy >> method.
+=item Why is my C<on_eof> callback never called?
+Probably because your C<on_error> callback is being called instead: When
+you have outstanding requests in your read queue, then an EOF is
+considered an error as you clearly expected some data.
+To avoid this, make sure you have an empty read queue whenever your handle
+is supposed to be "idle" (i.e. connection closes are O.K.). You can set
+an C<on_read> handler that simply pushes the first read request in the
+queue.
+See also the next question, which explains this in a bit more detail.
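The rule from this answer can be written down as a very small model. This is a simplification for illustration only: the function C<eof_outcome> is hypothetical, and the real module considers more state than the length of the read queue.

```perl
use strict;
use warnings;

# Simplified model (not AnyEvent::Handle's code): which callback does an
# EOF trigger, given the number of outstanding read requests?
sub eof_outcome {
   my ($queued_reads) = @_;

   # outstanding requests mean data was still expected, so EOF is an error
   return $queued_reads > 0 ? "on_error (EPIPE)" : "on_eof";
}

print eof_outcome (0), "\n"; # on_eof
print eof_outcome (2), "\n"; # on_error (EPIPE)
```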
+=item How can I serve requests in a loop?
+Most protocols consist of some setup phase (authentication for example)
+followed by a request handling phase, where the server waits for requests
+and handles them, in a loop.
+There are two important variants: The first (traditional, better) variant
+handles requests until the server gets some QUIT command, causing it to
+close the connection first (highly desirable for a busy TCP server). A
+client dropping the connection is an error, which means this variant can
+detect an unexpected connection close.
+To handle this case, always make sure you have a non-empty read queue, by
+pushing the "read request start" handler on it:
+   # we assume a request starts with a single line
+   my @start_request; @start_request = (line => sub {
+      my ($hdl, $line) = @_;
+      ... handle request
+      # push next request read, possibly from a nested callback
+      $hdl->push_read (@start_request);
+   });
+   # auth done, now go into request handling loop
+   # now push the first @start_request
+   $hdl->push_read (@start_request);
+By always having an outstanding C<push_read>, the handle always expects
+some data and raises the C<EPIPE> error when the connection is dropped
+unexpectedly.
+The second variant is a protocol where the client can drop the connection
+at any time. For TCP, this means that the server machine may run out of
+sockets more easily, and in general, it means you cannot distinguish a protocol
+failure/client crash from a normal connection close. Nevertheless, these
+kinds of protocols are common (and sometimes even the best solution to the
+problem).
+Having an outstanding read request at all times is possible if you ignore
+C<EPIPE> errors, but this doesn't help when the client drops the
+connection during a request, which would still be an error.
+A better solution is to push the initial request read in an C<on_read>
+callback. This avoids an error, as when the server doesn't expect data
+(i.e. is idly waiting for the next request), an EOF will not raise an
+error, but simply result in an C<on_eof> callback. It is also a bit slower
+and simpler:
+   # auth done, now go into request handling loop
+   $hdl->on_read (sub {
+      my ($hdl) = @_;
+      # called each time we receive data but the read queue is empty
+      # simply start reading the request
+      $hdl->push_read (line => sub {
+         my ($hdl, $line) = @_;
+         ... handle request
+         # do nothing special when the request has been handled, just
+         # let the request queue go empty.
+      });
+   });
 =item I get different callback invocations in TLS mode/Why can't I pause
 reading?
 Unlike, say, TCP, TLS connections do not consist of two independent
 communication channels, one for each direction. Or put differently, the
