/cvs/AnyEvent/lib/AnyEvent/Handle.pm
Revision: 1.43
Committed: Wed May 28 23:57:38 2008 UTC (15 years, 11 months ago) by root
Branch: MAIN
Changes since 1.42: +105 -5 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 elmex 1.1 package AnyEvent::Handle;
2    
3 elmex 1.6 no warnings;
4 elmex 1.1 use strict;
5    
6 root 1.8 use AnyEvent ();
7 root 1.42 use AnyEvent::Util qw(WSAEWOULDBLOCK);
8 root 1.8 use Scalar::Util ();
9     use Carp ();
10     use Fcntl ();
11 root 1.43 use Errno qw(EAGAIN EINTR);
12     use Time::HiRes qw(time);
13 elmex 1.1
14     =head1 NAME
15    
16 root 1.22 AnyEvent::Handle - non-blocking I/O on file handles via AnyEvent
17 elmex 1.1
18     =cut
19    
20 root 1.15 our $VERSION = '0.04';
21 elmex 1.1
22     =head1 SYNOPSIS
23    
24     use AnyEvent;
25     use AnyEvent::Handle;
26    
27     my $cv = AnyEvent->condvar;
28    
29 root 1.31 my $handle =
30 elmex 1.2 AnyEvent::Handle->new (
31     fh => \*STDIN,
32     on_eof => sub {
33     $cv->send;
34     },
35     );
36    
37 root 1.31 # send some request line
38     $handle->push_write ("getinfo\015\012");
39    
40     # read the response line
41     $handle->push_read (line => sub {
42     my ($handle, $line) = @_;
43     warn "read line <$line>\n";
44     $cv->send;
45     });
46    
47     $cv->recv;
48 elmex 1.1
49     =head1 DESCRIPTION
50    
51 root 1.8 This module is a helper module to make it easier to do event-based I/O on
52 elmex 1.13 filehandles. For utility functions for doing non-blocking connects and accepts
53     on sockets see L<AnyEvent::Util>.
54 root 1.8
55     In the following, when the documentation refers to "bytes", this
56     means characters. As sysread and syswrite are used for all I/O, their
57     treatment of characters applies to this module as well.
58 elmex 1.1
59 root 1.8 All callbacks will be invoked with the handle object as their first
60     argument.
61 elmex 1.1
62     =head1 METHODS
63    
64     =over 4
65    
66     =item B<new (%args)>
67    
68 root 1.8 The constructor supports these arguments (all as key => value pairs).
69 elmex 1.1
70     =over 4
71    
72 root 1.8 =item fh => $filehandle [MANDATORY]
73 elmex 1.1
74     The filehandle this L<AnyEvent::Handle> object will operate on.
75    
76 root 1.8 NOTE: The filehandle will be set to non-blocking (using
77     AnyEvent::Util::fh_nonblocking).
78    
79 root 1.40 =item on_eof => $cb->($handle)
80 root 1.10
81     Set the callback to be called on EOF.
82 root 1.8
83 root 1.16 While not mandatory, it is highly recommended to set an eof callback,
84     otherwise you might end up with a closed socket while you are still
85     waiting for data.
86    
87 root 1.40 =item on_error => $cb->($handle)
88 root 1.10
89     This is the fatal error callback, that is called when, well, a fatal error
90 elmex 1.20 occurs, such as not being able to resolve the hostname, failure to connect
91 root 1.10 or a read error.
92 root 1.8
93     The object will not be in a usable state when this callback has been
94     called.
95    
96 root 1.10 On callback entrance, the value of C<$!> contains the operating system
97 root 1.43 error (or C<ENOSPC>, C<EPIPE>, C<ETIMEDOUT> or C<EBADMSG>).
98 root 1.8
99 root 1.38 The callback should throw an exception. If it returns, then
100 root 1.37 AnyEvent::Handle will C<croak> for you.
101    
102 root 1.10 While not mandatory, it is I<highly> recommended to set this callback, as
103     you will not be notified of errors otherwise. The default simply calls
104     die.
105 root 1.8
106 root 1.40 =item on_read => $cb->($handle)
107 root 1.8
108     This sets the default read callback, which is called when data arrives
109 root 1.10 and no read request is in the queue.
110 root 1.8
111     To access (and remove data from) the read buffer, use the C<< ->rbuf >>
112 root 1.40 method or access the C<$handle->{rbuf}> member directly.
113 root 1.8
114     When an EOF condition is detected then AnyEvent::Handle will first try to
115     feed all the remaining data to the queued callbacks and C<on_read> before
116     calling the C<on_eof> callback. If no progress can be made, then a fatal
117     error will be raised (with C<$!> set to C<EPIPE>).
118 elmex 1.1
119 root 1.40 =item on_drain => $cb->($handle)
120 elmex 1.1
121 root 1.8 This sets the callback that is called when the write buffer becomes empty
122     (or when the callback is set and the buffer is empty already).
123 elmex 1.1
124 root 1.8 To append to the write buffer, use the C<< ->push_write >> method.
125 elmex 1.2
126 root 1.43 =item timeout => $fractional_seconds
127    
128     If non-zero, then this enables an "inactivity" timeout: whenever this many
129     seconds pass without a successful read or write on the underlying file
130     handle, the C<on_timeout> callback will be invoked (and if that one is
131     missing, an C<ETIMEDOUT> error will be raised).
132    
133     Note that timeout processing is also active when you currently do not have
134     any outstanding read or write requests: If you plan to keep the connection
135     idle then you should disable the timeout temporarily or ignore the timeout
136     in the C<on_timeout> callback.
137    
138     Zero (the default) disables this timeout.
139    
140     =item on_timeout => $cb->($handle)
141    
142     Called whenever the inactivity timeout passes. If you return from this
143     callback, then the timeout will be reset as if some activity had happened,
144     so this condition is not fatal in any way.
145    
146 root 1.8 =item rbuf_max => <bytes>
147 elmex 1.2
148 root 1.8 If defined, then a fatal error will be raised (with C<$!> set to C<ENOSPC>)
149     when the read buffer ever (strictly) exceeds this size. This is useful to
150     avoid denial-of-service attacks.
151 elmex 1.2
152 root 1.8 For example, a server accepting connections from untrusted sources should
153     be configured to accept only so-and-so much data that it cannot act on
154     (for example, when expecting a line, an attacker could send an unlimited
155     amount of data without a callback ever being called as long as the line
156     isn't finished).
157 elmex 1.2
158 root 1.8 =item read_size => <bytes>
159 elmex 1.2
160 root 1.8 The default read block size (the number of bytes this module will try to read
161     on each loop iteration). Default: C<4096>.
162    
163     =item low_water_mark => <bytes>
164    
165     Sets the amount of bytes (default: C<0>) that make up an "empty" write
166     buffer: If the write buffer reaches this size or gets even smaller it is
167     considered empty.
168 elmex 1.2
169 root 1.19 =item tls => "accept" | "connect" | Net::SSLeay::SSL object
170    
171     When this parameter is given, it enables TLS (SSL) mode, that means it
172     will start the TLS handshake and will transparently encrypt/decrypt
173     data.
174    
175 root 1.26 TLS mode requires Net::SSLeay to be installed (it will be loaded
176     automatically when you try to create a TLS handle).
177    
178 root 1.19 For the TLS server side, use C<accept>, and for the TLS client side of a
179     connection, use C<connect> mode.
180    
181     You can also provide your own TLS connection object, but you have
182     to make sure that you call either C<Net::SSLeay::set_connect_state>
183     or C<Net::SSLeay::set_accept_state> on it before you pass it to
184     AnyEvent::Handle.
185    
186 root 1.26 See the C<starttls> method if you need to start TLS negotiation later.
187    
188 root 1.19 =item tls_ctx => $ssl_ctx
189    
190     Use the given Net::SSLeay::CTX object to create the new TLS connection
191     (unless a connection object was specified directly). If this parameter is
192     missing, then AnyEvent::Handle will use C<AnyEvent::Handle::TLS_CTX>.
193    
194 root 1.40 =item json => JSON or JSON::XS object
195    
196     This is the json coder object used by the C<json> read and write types.
197    
198 root 1.41 If you don't supply it, then AnyEvent::Handle will create and use a
199     suitable one, which will write and expect UTF-8 encoded JSON texts.
200 root 1.40
201     Note that you are responsible for depending on the JSON module if you want to
202     use this functionality, as AnyEvent does not have a dependency itself.
203    
204 root 1.38 =item filter_r => $cb
205    
206     =item filter_w => $cb
207    
208     These exist, but are undocumented at this time.
209    
210 elmex 1.1 =back
211    
212     =cut
213    
214     sub new {
215 root 1.8 my $class = shift;
216    
217     my $self = bless { @_ }, $class;
218    
219     $self->{fh} or Carp::croak "mandatory argument fh is missing";
220    
221     AnyEvent::Util::fh_nonblocking $self->{fh}, 1;
222 elmex 1.1
223 root 1.19 if ($self->{tls}) {
224     require Net::SSLeay;
225     $self->starttls (delete $self->{tls}, delete $self->{tls_ctx});
226     }
227    
228 root 1.43 # $self->on_eof (delete $self->{on_eof} ) if $self->{on_eof}; # nop
229     # $self->on_error (delete $self->{on_error}) if $self->{on_error}; # nop
230     # $self->on_read (delete $self->{on_read} ) if $self->{on_read}; # nop
231 root 1.8 $self->on_drain (delete $self->{on_drain}) if $self->{on_drain};
232 root 1.43
233     $self->{_activity} = time;
234     $self->_timeout;
235 elmex 1.1
236 root 1.10 $self->start_read;
237    
238 root 1.8 $self
239     }
240 elmex 1.2
241 root 1.8 sub _shutdown {
242     my ($self) = @_;
243 elmex 1.2
244 root 1.38 delete $self->{_rw};
245     delete $self->{_ww};
246 root 1.8 delete $self->{fh};
247     }
248    
249     sub error {
250     my ($self) = @_;
251    
252     {
253     local $!;
254     $self->_shutdown;
255 elmex 1.1 }
256    
257 root 1.37 $self->{on_error}($self)
258     if $self->{on_error};
259    
260     Carp::croak "AnyEvent::Handle uncaught fatal error: $!";
261 elmex 1.1 }
262    
263 root 1.8 =item $fh = $handle->fh
264 elmex 1.1
265 root 1.22 This method returns the file handle of the L<AnyEvent::Handle> object.
266 elmex 1.1
267     =cut
268    
269 root 1.38 sub fh { $_[0]{fh} }
270 elmex 1.1
271 root 1.8 =item $handle->on_error ($cb)
272 elmex 1.1
273 root 1.8 Replace the current C<on_error> callback (see the C<on_error> constructor argument).
274 elmex 1.1
275 root 1.8 =cut
276    
277     sub on_error {
278     $_[0]{on_error} = $_[1];
279     }
280    
281     =item $handle->on_eof ($cb)
282    
283     Replace the current C<on_eof> callback (see the C<on_eof> constructor argument).
284 elmex 1.1
285     =cut
286    
287 root 1.8 sub on_eof {
288     $_[0]{on_eof} = $_[1];
289     }
290    
291 root 1.43 =item $handle->on_timeout ($cb)
292    
293     Replaces the current C<on_timeout> callback, or disables the callback
294     (but not the timeout) if C<$cb> = C<undef>. See C<timeout> constructor
295     argument.
296    
297     =cut
298    
299     sub on_timeout {
300     $_[0]{on_timeout} = $_[1];
301     }
302    
303     #############################################################################
304    
305     =item $handle->timeout ($seconds)
306    
307     Configures (or disables) the inactivity timeout.
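
The arithmetic behind the inactivity timeout can be sketched on its own,
without an event loop. The helper name C<seconds_until_timeout> is
hypothetical and not part of this module; it only mirrors the "last activity
plus timeout minus now" calculation, and treats a zero timeout as disabled.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::Handle): given the time of the
# last activity, the configured timeout and the current time, return how many
# seconds remain, 0 if the timeout has already expired, or undef if disabled.
sub seconds_until_timeout {
    my ($last_activity, $timeout, $now) = @_;

    return undef unless $timeout;    # zero (or undef) disables the timeout

    my $after = $last_activity + $timeout - $now;
    return $after > 0 ? $after : 0;
}

print seconds_until_timeout (100, 30, 110), "\n";   # 20 seconds left
print seconds_until_timeout (100, 30, 140), "\n";   # 0: already expired
```

A result of 0 corresponds to the case where C<on_timeout> would be invoked
(or an C<ETIMEDOUT> error raised) right away.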
308    
309     =cut
310    
311     sub timeout {
312     my ($self, $timeout) = @_;
313    
314     $self->{timeout} = $timeout;
315     $self->_timeout;
316     }
317    
318     # reset the timeout watcher, as necessary
319     # also check for time-outs
320     sub _timeout {
321     my ($self) = @_;
322    
323     if ($self->{timeout}) {
324     my $NOW = time;
325    
326     # when would the timeout trigger?
327     my $after = $self->{_activity} + $self->{timeout} - $NOW;
328    
329     warn "next to in $after\n";#d#
330    
331     # now or in the past already?
332     if ($after <= 0) {
333     $self->{_activity} = $NOW;
334    
335     if ($self->{on_timeout}) {
336     $self->{on_timeout}->($self);
337     } else {
338     $! = &Errno::ETIMEDOUT;
339     $self->error;
340     }
341    
342     # callback could have changed timeout value, optimise
343     return unless $self->{timeout};
344    
345     # calculate new after
346     $after = $self->{timeout};
347     }
348    
349     Scalar::Util::weaken $self;
350    
351     warn "after $after\n";#d#
352     $self->{_tw} ||= AnyEvent->timer (after => $after, cb => sub {
353     delete $self->{_tw};
354     $self->_timeout;
355     });
356     } else {
357     delete $self->{_tw};
358     }
359     }
360    
361 root 1.9 #############################################################################
362    
363     =back
364    
365     =head2 WRITE QUEUE
366    
367     AnyEvent::Handle manages two queues per handle, one for writing and one
368     for reading.
369    
370     The write queue is very simple: you can add data to its end, and
371     AnyEvent::Handle will automatically try to get rid of it for you.
372    
373 elmex 1.20 When data could be written and the write buffer is shorter than the low
374 root 1.9 water mark, the C<on_drain> callback will be invoked.
375    
376     =over 4
377    
378 root 1.8 =item $handle->on_drain ($cb)
379    
380     Sets the C<on_drain> callback or clears it (see the description of
381     C<on_drain> in the constructor).
382    
383     =cut
384    
385     sub on_drain {
386 elmex 1.1 my ($self, $cb) = @_;
387    
388 root 1.8 $self->{on_drain} = $cb;
389    
390     $cb->($self)
391     if $cb && $self->{low_water_mark} >= length $self->{wbuf};
392     }
393    
394     =item $handle->push_write ($data)
395    
396     Queues the given scalar to be written. You can push as much data as you
397     want (only limited by the available memory), as C<AnyEvent::Handle>
398     buffers it independently of the kernel.
399    
400     =cut
401    
402 root 1.17 sub _drain_wbuf {
403     my ($self) = @_;
404 root 1.8
405 root 1.38 if (!$self->{_ww} && length $self->{wbuf}) {
406 root 1.35
407 root 1.8 Scalar::Util::weaken $self;
408 root 1.35
409 root 1.8 my $cb = sub {
410     my $len = syswrite $self->{fh}, $self->{wbuf};
411    
412 root 1.29 if (defined $len) { # syswrite returns undef on error
413 root 1.8 substr $self->{wbuf}, 0, $len, "";
414    
415 root 1.43 $self->{_activity} = time;
416    
417 root 1.8 $self->{on_drain}($self)
418     if $self->{low_water_mark} >= length $self->{wbuf}
419     && $self->{on_drain};
420    
421 root 1.38 delete $self->{_ww} unless length $self->{wbuf};
422 root 1.42 } elsif ($! != EAGAIN && $! != EINTR && $! != WSAEWOULDBLOCK) {
423 root 1.8 $self->error;
424 elmex 1.1 }
425 root 1.8 };
426    
427 root 1.35 # try to write data immediately
428     $cb->();
429 root 1.8
430 root 1.35 # if still data left in wbuf, we need to poll
431 root 1.38 $self->{_ww} = AnyEvent->io (fh => $self->{fh}, poll => "w", cb => $cb)
432 root 1.35 if length $self->{wbuf};
433 root 1.8 };
434     }
435    
436 root 1.30 our %WH;
437    
438     sub register_write_type($$) {
439     $WH{$_[0]} = $_[1];
440     }
441    
442 root 1.17 sub push_write {
443     my $self = shift;
444    
445 root 1.29 if (@_ > 1) {
446     my $type = shift;
447    
448     @_ = ($WH{$type} or Carp::croak "unsupported type passed to AnyEvent::Handle::push_write")
449     ->($self, @_);
450     }
451    
452 root 1.17 if ($self->{filter_w}) {
453 root 1.18 $self->{filter_w}->($self, \$_[0]);
454 root 1.17 } else {
455     $self->{wbuf} .= $_[0];
456     $self->_drain_wbuf;
457     }
458     }
459    
460 root 1.29 =item $handle->push_write (type => @args)
461    
462     =item $handle->unshift_write (type => @args)
463    
464     Instead of formatting your data yourself, you can also let this module do
465     the job by specifying a type and type-specific arguments.
466    
467 root 1.30 Predefined types are (if you have ideas for additional types, feel free to
468     drop by and tell us):
469 root 1.29
470     =over 4
471    
472     =item netstring => $string
473    
474     Formats the given value as a netstring
475     (http://cr.yp.to/proto/netstrings.txt, this is not a recommendation to use them).
476    
477 root 1.30 =back
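
As a rough illustration of the wire format, a netstring is the payload length
in decimal, a colon, the payload and a trailing comma. The function name
C<encode_netstring> below is hypothetical and only used for this sketch; it
assumes the payload is already an octet string.

```perl
use strict;
use warnings;

# Hypothetical stand-alone encoder mirroring the "netstring" write type:
# "<length>:<payload>,"
sub encode_netstring {
    my ($string) = @_;
    return sprintf "%d:%s,", length ($string), $string;
}

print encode_netstring ("hello world!"), "\n";   # 12:hello world!,
print encode_netstring (""), "\n";               # 0:,
```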
478    
479 root 1.29 =cut
480    
481     register_write_type netstring => sub {
482     my ($self, $string) = @_;
483    
484     sprintf "%d:%s,", (length $string), $string
485     };
486    
487 root 1.39 =item json => $array_or_hashref
488    
489 root 1.40 Encodes the given hash or array reference into a JSON object. Unless you
490     provide your own JSON object, this means it will be encoded to JSON text
491     in UTF-8.
492    
493     JSON objects (and arrays) are self-delimiting, so you can write JSON at
494     one end of a handle and read it at the other end without using any
495     additional framing.
496    
497 root 1.41 The generated JSON text is guaranteed not to contain any newlines: While
498     this module doesn't need delimiters after or between JSON texts to be
499     able to read them, many other languages depend on that.
500    
501     A simple RPC protocol that interoperates easily with others is to send
502     JSON arrays (or objects, although arrays are usually the better choice as
503     they mimic how function argument passing works) and a newline after each
504     JSON text:
505    
506     $handle->push_write (json => ["method", "arg1", "arg2"]); # whatever
507     $handle->push_write ("\012");
508    
509     An AnyEvent::Handle receiver would simply use the C<json> read type and
510     rely on the fact that the newline will be skipped as leading whitespace:
511    
512     $handle->push_read (json => sub { my $array = $_[1]; ... });
513    
514     Other languages could read single lines terminated by a newline and pass
515     this line into their JSON decoder of choice.
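
The newline-delimited framing described above can be tried without a handle.
This is only a sketch: it uses the core C<JSON::PP> module as a stand-in for
whichever JSON coder you actually use, and the variable names are made up for
the example.

```perl
use strict;
use warnings;
use JSON::PP ();

my $coder = JSON::PP->new->utf8;

# sender side: one JSON array per line (the encoder emits no newlines itself)
my $wire = $coder->encode (["method", "arg1", "arg2"]) . "\012";

# receiver side, as another language might do it: split into lines and
# hand each line to a JSON decoder
for my $line (split /\012/, $wire) {
    my $request = $coder->decode ($line);
    my ($method, @args) = @$request;
    print "$method(@args)\n";    # method(arg1 arg2)
}
```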
516    
517 root 1.40 =cut
518    
519     register_write_type json => sub {
520     my ($self, $ref) = @_;
521    
522     require JSON;
523    
524     $self->{json} ? $self->{json}->encode ($ref)
525     : JSON::encode_json ($ref)
526     };
527    
528     =item AnyEvent::Handle::register_write_type type => $coderef->($handle, @args)
529 root 1.30
530     This function (not method) lets you add your own types to C<push_write>.
531     Whenever the given C<type> is used, C<push_write> will invoke the code
532     reference with the handle object and the remaining arguments.
533 root 1.29
534 root 1.30 The code reference is supposed to return a single octet string that will
535     be appended to the write buffer.
536 root 1.29
537 root 1.30 Note that this is a function, and all types registered this way will be
538     global, so try to use unique names.
539 root 1.29
540 root 1.30 =cut
541 root 1.29
542 root 1.8 #############################################################################
543    
544 root 1.9 =back
545    
546     =head2 READ QUEUE
547    
548     AnyEvent::Handle manages two queues per handle, one for writing and one
549     for reading.
550    
551     The read queue is more complex than the write queue. It can be used in two
552     ways, the "simple" way, using only C<on_read> and the "complex" way, using
553     a queue.
554    
555     In the simple case, you just install an C<on_read> callback and whenever
556     new data arrives, it will be called. You can then remove some data (if
557     enough is there) from the read buffer (C<< $handle->rbuf >>) if you want
558     or not.
559    
560     In the more complex case, you want to queue multiple callbacks. In this
561     case, AnyEvent::Handle will call the first queued callback each time new
562     data arrives and removes it when it has done its job (see C<push_read>,
563     below).
564    
565     This way you can, for example, push three line-reads, followed by reading
566     a chunk of data, and AnyEvent::Handle will execute them in order.
567    
568     Example 1: EPP protocol parser. EPP sends 4 byte length info, followed by
569     the specified number of bytes which give an XML datagram.
570    
571     # in the default state, expect some header bytes
572     $handle->on_read (sub {
573     # some data is here, now queue the length-header-read (4 octets)
574     shift->unshift_read_chunk (4, sub {
575     # header arrived, decode
576     my $len = unpack "N", $_[1];
577    
578     # now read the payload
579     shift->unshift_read_chunk ($len, sub {
580     my $xml = $_[1];
581     # handle xml
582     });
583     });
584     });
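
The same length-prefixed framing can be sketched against a plain string
buffer, which may help when testing a parser without a socket. The function
name C<extract_records> is hypothetical, and the sketch assumes, as the
description above does, that the 4-byte big-endian length counts only the
payload.

```perl
use strict;
use warnings;

# Hypothetical stand-alone parser: pull complete "4-byte length + payload"
# records out of a buffer, leaving any incomplete tail in place.
sub extract_records {
    my ($bufref) = @_;
    my @records;

    while (length ($$bufref) >= 4) {
        my $len = unpack "N", $$bufref;
        last if length ($$bufref) < 4 + $len;   # payload not complete yet

        substr $$bufref, 0, 4, "";              # drop the header
        push @records, substr $$bufref, 0, $len, "";
    }

    return @records;
}

# one complete record followed by a record whose payload is still missing
my $buf = pack ("N", 5) . "<a/>!" . pack ("N", 9) . "<b>";
my @got = extract_records (\$buf);
print scalar (@got), " record(s), ", length ($buf), " byte(s) left\n";
```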
585    
586     Example 2: Implement a client for a protocol that replies either with
587     "OK" and another line or "ERROR" for one request, and 64 bytes for the
588     second request. Due to the availability of a full queue, we can just
589     pipeline sending both requests and manipulate the queue as necessary in
590     the callbacks:
591    
592     # request one
593     $handle->push_write ("request 1\015\012");
594    
595     # we expect "ERROR" or "OK" as response, so push a line read
596     $handle->push_read_line (sub {
597     # if we got an "OK", we have to _prepend_ another line,
598     # so it will be read before the second request reads its 64 bytes
599     # which are already in the queue when this callback is called
600     # we don't do this in case we got an error
601     if ($_[1] eq "OK") {
602     $_[0]->unshift_read_line (sub {
603     my $response = $_[1];
604     ...
605     });
606     }
607     });
608    
609     # request two
610     $handle->push_write ("request 2\015\012");
611    
612     # simply read 64 bytes, always
613     $handle->push_read_chunk (64, sub {
614     my $response = $_[1];
615     ...
616     });
617    
618     =over 4
619    
620 root 1.10 =cut
621    
622 root 1.8 sub _drain_rbuf {
623     my ($self) = @_;
624 elmex 1.1
625 root 1.17 if (
626     defined $self->{rbuf_max}
627     && $self->{rbuf_max} < length $self->{rbuf}
628     ) {
629 root 1.37 $! = &Errno::ENOSPC;
630     $self->error;
631 root 1.17 }
632    
633 root 1.11 return if $self->{in_drain};
634 root 1.8 local $self->{in_drain} = 1;
635 elmex 1.1
636 root 1.8 while (my $len = length $self->{rbuf}) {
637     no strict 'refs';
638 root 1.38 if (my $cb = shift @{ $self->{_queue} }) {
639 root 1.29 unless ($cb->($self)) {
640 root 1.38 if ($self->{_eof}) {
641 root 1.10 # no progress can be made (not enough data and no data forthcoming)
642 root 1.37 $! = &Errno::EPIPE;
643     $self->error;
644 root 1.10 }
645    
646 root 1.38 unshift @{ $self->{_queue} }, $cb;
647 root 1.8 return;
648     }
649     } elsif ($self->{on_read}) {
650     $self->{on_read}($self);
651    
652     if (
653 root 1.38 $self->{_eof} # if no further data will arrive
654 root 1.8 && $len == length $self->{rbuf} # and no data has been consumed
655 root 1.38 && !@{ $self->{_queue} } # and the queue is still empty
656 root 1.8 && $self->{on_read} # and we still want to read data
657     ) {
658     # then no progress can be made
659 root 1.37 $! = &Errno::EPIPE;
660     $self->error;
661 elmex 1.1 }
662 root 1.8 } else {
663     # read side becomes idle
664 root 1.38 delete $self->{_rw};
665 root 1.8 return;
666     }
667     }
668    
669 root 1.38 if ($self->{_eof}) {
670 root 1.8 $self->_shutdown;
671 root 1.16 $self->{on_eof}($self)
672     if $self->{on_eof};
673 root 1.8 }
674 elmex 1.1 }
675    
676 root 1.8 =item $handle->on_read ($cb)
677 elmex 1.1
678 root 1.8 This replaces the currently set C<on_read> callback, or clears it (when
679     the new callback is C<undef>). See the description of C<on_read> in the
680     constructor.
681 elmex 1.1
682 root 1.8 =cut
683    
684     sub on_read {
685     my ($self, $cb) = @_;
686 elmex 1.1
687 root 1.8 $self->{on_read} = $cb;
688 elmex 1.1 }
689    
690 root 1.8 =item $handle->rbuf
691    
692     Returns the read buffer (as a modifiable lvalue).
693 elmex 1.1
694 root 1.8 You can access the read buffer directly as the C<< ->{rbuf} >> member, if
695     you want.
696 elmex 1.1
697 root 1.8 NOTE: The read buffer should only be used or modified if the C<on_read>,
698     C<push_read> or C<unshift_read> methods are used. The other read methods
699     automatically manage the read buffer.
700 elmex 1.1
701     =cut
702    
703 elmex 1.2 sub rbuf : lvalue {
704 root 1.8 $_[0]{rbuf}
705 elmex 1.2 }
706 elmex 1.1
707 root 1.8 =item $handle->push_read ($cb)
708    
709     =item $handle->unshift_read ($cb)
710    
711     Append the given callback to the end of the queue (C<push_read>) or
712     prepend it (C<unshift_read>).
713    
714     The callback is called each time some additional read data arrives.
715 elmex 1.1
716 elmex 1.20 It must check whether enough data is in the read buffer already.
717 elmex 1.1
718 root 1.8 If not enough data is available, it must return the empty list or a false
719     value, in which case it will be called repeatedly until enough data is
720     available (or an error condition is detected).
721    
722     If enough data was available, then the callback must remove all data it is
723     interested in (which can be none at all) and return a true value. After returning
724     true, it will be removed from the queue.
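
The callback contract can be illustrated with a toy queue that does not
involve any I/O. Everything here (C<$rbuf>, C<@queue>, C<drain>) is a
hypothetical stand-in for the module's internals, reduced to the "return false
to stay queued, return true to be removed" rule.

```perl
use strict;
use warnings;

my $rbuf = "";
my @queue;

# Hypothetical mini read queue: call the first callback; stop when it
# returns false (it needs more data), drop it when it returns true.
sub drain {
    while (length $rbuf and @queue) {
        last unless $queue[0]->();
        shift @queue;
    }
}

my @seen;

# a callback that wants exactly three bytes
push @queue, sub {
    return unless length ($rbuf) >= 3;     # not enough data: stay queued
    push @seen, substr $rbuf, 0, 3, "";    # consume the data ...
    1                                      # ... and signal completion
};

for my $chunk ("ab", "cd") {               # data arrives in two pieces
    $rbuf .= $chunk;
    drain ();
}

print "seen: @seen, left over: $rbuf\n";   # seen: abc, left over: d
```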
725 elmex 1.1
726     =cut
727    
728 root 1.30 our %RH;
729    
730     sub register_read_type($$) {
731     $RH{$_[0]} = $_[1];
732     }
733    
734 root 1.8 sub push_read {
735 root 1.28 my $self = shift;
736     my $cb = pop;
737    
738     if (@_) {
739     my $type = shift;
740    
741     $cb = ($RH{$type} or Carp::croak "unsupported type passed to AnyEvent::Handle::push_read")
742     ->($self, $cb, @_);
743     }
744 elmex 1.1
745 root 1.38 push @{ $self->{_queue} }, $cb;
746 root 1.8 $self->_drain_rbuf;
747 elmex 1.1 }
748    
749 root 1.8 sub unshift_read {
750 root 1.28 my $self = shift;
751     my $cb = pop;
752    
753     if (@_) {
754     my $type = shift;
755    
756     $cb = ($RH{$type} or Carp::croak "unsupported type passed to AnyEvent::Handle::unshift_read")
757     ->($self, $cb, @_);
758     }
759    
760 root 1.8
761 root 1.38 unshift @{ $self->{_queue} }, $cb;
762 root 1.8 $self->_drain_rbuf;
763     }
764 elmex 1.1
765 root 1.28 =item $handle->push_read (type => @args, $cb)
766 elmex 1.1
767 root 1.28 =item $handle->unshift_read (type => @args, $cb)
768 elmex 1.1
769 root 1.28 Instead of providing a callback that parses the data itself you can choose
770     between a number of predefined parsing formats, for chunks of data, lines
771     etc.
772 elmex 1.1
773 root 1.30 Predefined types are (if you have ideas for additional types, feel free to
774     drop by and tell us):
775 root 1.28
776     =over 4
777    
778 root 1.40 =item chunk => $octets, $cb->($handle, $data)
779 root 1.28
780     Invoke the callback only once C<$octets> bytes have been read. Pass the
781     data read to the callback. The callback will never be called with less
782     data.
783    
784     Example: read 2 bytes.
785    
786     $handle->push_read (chunk => 2, sub {
787     warn "yay ", unpack "H*", $_[1];
788     });
789 elmex 1.1
790     =cut
791    
792 root 1.28 register_read_type chunk => sub {
793     my ($self, $cb, $len) = @_;
794 elmex 1.1
795 root 1.8 sub {
796     $len <= length $_[0]{rbuf} or return;
797 elmex 1.12 $cb->($_[0], substr $_[0]{rbuf}, 0, $len, "");
798 root 1.8 1
799     }
800 root 1.28 };
801 root 1.8
802 root 1.28 # compatibility with older API
803 root 1.8 sub push_read_chunk {
804 root 1.28 $_[0]->push_read (chunk => $_[1], $_[2]);
805 root 1.8 }
806 elmex 1.1
807 root 1.8 sub unshift_read_chunk {
808 root 1.28 $_[0]->unshift_read (chunk => $_[1], $_[2]);
809 elmex 1.1 }
810    
811 root 1.40 =item line => [$eol, ]$cb->($handle, $line, $eol)
812 elmex 1.1
813 root 1.8 The callback will be called only once a full line (including the end of
814     line marker, C<$eol>) has been read. This line (excluding the end of line
815     marker) will be passed to the callback as second argument (C<$line>), and
816     the end of line marker as the third argument (C<$eol>).
817 elmex 1.1
818 root 1.8 The end of line marker, C<$eol>, can be either a string, in which case it
819     will be interpreted as a fixed record end marker, or it can be a regex
820     object (e.g. created by C<qr>), in which case it is interpreted as a
821     regular expression.
822 elmex 1.1
823 root 1.8 The end of line marker argument C<$eol> is optional, if it is missing (NOT
824     undef), then C<qr|\015?\012|> is used (which is good for most internet
825     protocols).
826 elmex 1.1
827 root 1.8 Partial lines at the end of the stream will never be returned, as they are
828     not marked by the end of line marker.
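
How the default end of line marker splits a buffer can be seen with a plain
string. This sketch only reuses the documented default C<qr|\015?\012|>; the
buffer contents and variable names are made up for the example.

```perl
use strict;
use warnings;

my $eol  = qr|\015?\012|;
my $rbuf = "HELO example\015\012QUIT\012partial";

# repeatedly strip one complete line (and its end of line marker)
my @lines;
while ($rbuf =~ s/^(.*?)($eol)//s) {
    push @lines, $1;
}

print join ("|", @lines), "\n";    # HELO example|QUIT
print "left in buffer: $rbuf\n";   # left in buffer: partial
```

The trailing C<partial> stays in the buffer because no end of line marker
follows it.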
829 elmex 1.1
830 root 1.8 =cut
831 elmex 1.1
832 root 1.28 register_read_type line => sub {
833     my ($self, $cb, $eol) = @_;
834 elmex 1.1
835 root 1.28 $eol = qr|(\015?\012)| if @_ < 3;
836 root 1.14 $eol = quotemeta $eol unless ref $eol;
837     $eol = qr|^(.*?)($eol)|s;
838 elmex 1.1
839 root 1.8 sub {
840     $_[0]{rbuf} =~ s/$eol// or return;
841 elmex 1.1
842 elmex 1.12 $cb->($_[0], $1, $2);
843 root 1.8 1
844     }
845 root 1.28 };
846 elmex 1.1
847 root 1.28 # compatibility with older API
848 root 1.8 sub push_read_line {
849 root 1.28 my $self = shift;
850     $self->push_read (line => @_);
851 root 1.10 }
852    
853     sub unshift_read_line {
854 root 1.28 my $self = shift;
855     $self->unshift_read (line => @_);
856 root 1.10 }
857    
858 root 1.40 =item netstring => $cb->($handle, $string)
859 root 1.29
860     A netstring (http://cr.yp.to/proto/netstrings.txt, this is not an endorsement).
861    
862     Throws an error with C<$!> set to EBADMSG on format violations.
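
A stand-alone decoder makes the format violations easier to see. The function
name C<decode_netstring> is hypothetical, and where the read type raises
C<EBADMSG> this sketch simply dies.

```perl
use strict;
use warnings;

# Hypothetical stand-alone decoder. Returns the payload and removes it from
# the buffer, returns nothing if more data is needed, dies on bad framing.
sub decode_netstring {
    my ($bufref) = @_;

    unless ($$bufref =~ /^(0|[1-9][0-9]*):/) {
        die "malformed netstring\n" if $$bufref =~ /[^0-9]/;
        return;                                      # length prefix incomplete
    }

    my ($len, $hdr) = ($1, $+[0]);
    return if length ($$bufref) < $hdr + $len + 1;   # payload or "," missing

    my $string = substr $$bufref, $hdr, $len;
    substr ($$bufref, $hdr + $len, 1) eq ","
        or die "malformed netstring\n";

    substr $$bufref, 0, $hdr + $len + 1, "";
    return $string;
}

my $buf = "5:hello,3:ab";
print decode_netstring (\$buf), "\n";                # hello

my $next = decode_netstring (\$buf);
print defined $next ? "got $next" : "need more data", "\n";
```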
863    
864     =cut
865    
866     register_read_type netstring => sub {
867     my ($self, $cb) = @_;
868    
869     sub {
870     unless ($_[0]{rbuf} =~ s/^(0|[1-9][0-9]*)://) {
871     if ($_[0]{rbuf} =~ /[^0-9]/) {
872     $! = &Errno::EBADMSG;
873     $self->error;
874     }
875     return;
876     }
877    
878     my $len = $1;
879    
880     $self->unshift_read (chunk => $len, sub {
881     my $string = $_[1];
882     $_[0]->unshift_read (chunk => 1, sub {
883     if ($_[1] eq ",") {
884     $cb->($_[0], $string);
885     } else {
886     $! = &Errno::EBADMSG;
887     $self->error;
888     }
889     });
890     });
891    
892     1
893     }
894     };
895    
896 root 1.40 =item regex => $accept[, $reject[, $skip]], $cb->($handle, $data)
897 root 1.36
898     Makes a regex match against the regex object C<$accept> and returns
899     everything up to and including the match.
900    
901     Example: read a single line terminated by '\n'.
902    
903     $handle->push_read (regex => qr<\n>, sub { ... });
904    
905     If C<$reject> is given and not undef, then it determines when the data is
906     to be rejected: it is matched against the data when the C<$accept> regex
907     does not match and generates an C<EBADMSG> error when it matches. This is
908     useful to quickly reject wrong data (to avoid waiting for a timeout or a
909     receive buffer overflow).
910    
911     Example: expect a single decimal number followed by whitespace, reject
912     anything else (note the use of an anchor).
913    
914     $handle->push_read (regex => qr<^[0-9]+\s>, qr<[^0-9]>, sub { ... });
915    
916     If C<$skip> is given and not C<undef>, then it will be matched against
917     the receive buffer when neither C<$accept> nor C<$reject> match,
918     and everything preceding and including the match will be accepted
919     unconditionally. This is useful to skip large amounts of data that you
920     know cannot be matched, so that the C<$accept> or C<$reject> regex do not
921     have to start matching from the beginning. This is purely an optimisation
922     and is usually worthwhile only when you expect more than a few kilobytes.
923    
924     Example: expect a http header, which ends at C<\015\012\015\012>. Since we
925     expect the header to be very large (it isn't in practise, but...), we use
926     a skip regex to skip initial portions. The skip regex is tricky in that
927     it only accepts something not ending in either \015 or \012, as these are
928     required for the accept regex.
929    
930     $handle->push_read (regex =>
931     qr<\015\012\015\012>,
932     undef, # no reject
933     qr<^.*[^\015\012]>,
934     sub { ... });
935    
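The order in which C<$accept>, C<$reject> and C<$skip> are tried can be illustrated without a handle. The following is only a sketch: C<try_regex_read> is a hypothetical stand-alone helper, not part of this module, and it operates on a plain scalar that stands in for the receive buffer. It reuses the http header regexes from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::Handle): applies the
# accept/reject/skip steps to a plain scalar buffer.
# $rbuf and $data are scalar references.
sub try_regex_read {
    my ($rbuf, $data, $accept, $reject, $skip) = @_;

    # accept: hand over everything up to and including the match
    if ($$rbuf =~ $accept) {
        $$data .= substr $$rbuf, 0, $+[0], "";
        return "accept";
    }

    # reject: the data can never match, so report an error
    return "reject" if $reject && $$rbuf =~ $reject;

    # skip: move data that cannot be part of a match out of the buffer
    if ($skip && $$rbuf =~ $skip) {
        $$data .= substr $$rbuf, 0, $+[0], "";
    }

    return "wait";
}

my ($rbuf, $data) = ("", "");
my @status;

# the header arrives in two pieces, split in the middle of a line ending
for my $chunk ("GET / HTTP/1.0\015\012Host: x\015", "\012\015\012body") {
    $rbuf .= $chunk;
    push @status, try_regex_read (
        \$rbuf, \$data,
        qr<\015\012\015\012>,   # accept
        undef,                  # no reject
        qr<^.*[^\015\012]>,     # skip
    );
}
```

After the first chunk the helper has to wait, but the skip regex has already moved the request line out of the buffer. After the second chunk the accept regex matches and C<$data> holds the complete header, while the trailing C<body> stays in the buffer.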
936     =cut
937    
938     register_read_type regex => sub {
939     my ($self, $cb, $accept, $reject, $skip) = @_;
940    
941     my $data;
942     my $rbuf = \$self->{rbuf};
943    
944     sub {
945     # accept
946     if ($$rbuf =~ $accept) {
947     $data .= substr $$rbuf, 0, $+[0], "";
948     $cb->($self, $data);
949     return 1;
950     }
951    
952     # reject
953     if ($reject && $$rbuf =~ $reject) {
954     $! = &Errno::EBADMSG;
955     $self->error;
956     }
957    
958     # skip
959     if ($skip && $$rbuf =~ $skip) {
960     $data .= substr $$rbuf, 0, $+[0], "";
961     }
962    
963     ()
964     }
965     };
966    
967 root 1.40 =item json => $cb->($handle, $hash_or_arrayref)
968    
969     Reads a JSON object or array, decodes it and passes it to the callback.
970    
971     If a C<json> object was passed to the constructor, then that will be used
972     for the final decode, otherwise it will create a JSON coder expecting UTF-8.
973    
974     This read type uses the incremental parser available with JSON version
975     2.09 (and JSON::XS version 2.2) and above. You have to provide a
976     dependency on your own: this module will load the JSON module, but
977     AnyEvent does not depend on it itself.
978    
979     Since JSON texts are fully self-delimiting, the C<json> read and write
980 root 1.41 types are an ideal simple RPC protocol: just exchange JSON datagrams. See
981     the C<json> write type description, above, for an actual example.
982 root 1.40
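To illustrate why self-delimiting JSON texts need no extra framing, here is a small sketch of the incremental parser on its own. It is an assumption of this example that C<JSON::PP> is used in place of C<JSON> (it offers the same C<incr_parse> interface and ships with recent perls); the chunk boundaries are made up.

```perl
use strict;
use warnings;
use JSON::PP;   # assumption: stands in for the JSON module here

my $json = JSON::PP->new->utf8;
my @decoded;

# two chunks, as they might arrive from separate reads
for my $chunk ('{"method":"pi', 'ng","id":1}') {
    # in scalar context incr_parse returns a decoded value once one
    # complete JSON text has been seen, and undef while it is incomplete
    my $ref = $json->incr_parse ($chunk);
    push @decoded, $ref if defined $ref;
}
```

The first call returns nothing because the object is still incomplete; the second call completes it, so exactly one hash reference ends up in C<@decoded>.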
983     =cut
984    
985     register_read_type json => sub {
986     my ($self, $cb, $accept, $reject, $skip) = @_;
987    
988     require JSON;
989    
990     my $data;
991     my $rbuf = \$self->{rbuf};
992    
993 root 1.41 my $json = $self->{json} ||= JSON->new->utf8;
994 root 1.40
995     sub {
996     my $ref = $json->incr_parse ($self->{rbuf});
997    
998     if ($ref) {
999     $self->{rbuf} = $json->incr_text;
1000     $json->incr_text = "";
1001     $cb->($self, $ref);
1002    
1003     1
1004     } else {
1005     $self->{rbuf} = "";
1006     ()
1007     }
1008     }
1009     };
1010    
1011 root 1.28 =back
1012    
1013 root 1.40 =item AnyEvent::Handle::register_read_type type => $coderef->($handle, $cb, @args)
1014 root 1.30
1015     This function (not method) lets you add your own types to C<push_read>.
1016    
1017     Whenever the given C<type> is used, C<push_read> will invoke the code
1018     reference with the handle object, the callback and the remaining
1019     arguments.
1020    
1021     The code reference is supposed to return a callback (usually a closure)
1022     that works as a plain read callback (see C<< ->push_read ($cb) >>).
1023    
1024     It should invoke the passed callback when it is done reading (remember to
1025 root 1.40 pass C<$handle> as first argument as all other callbacks do that).
1026 root 1.30
1027     Note that this is a function, and all types registered this way will be
1028     global, so try to use unique names.
1029    
1030     For examples, see the source of this module (F<perldoc -m AnyEvent::Handle>,
1031     search for C<register_read_type>).
1032    
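As a rough sketch of such a type, the following registers a hypothetical C<lenprefix> read type (one length byte followed by that many bytes of payload). The type name and wire format are invented for illustration. So that the sketch runs without an event loop, the closure is also exercised against a plain hash that merely stands in for the handle object and its C<rbuf> member.

```perl
use strict;
use warnings;

# Hypothetical read type "lenprefix": one length byte, then the payload.
my $lenprefix = sub {
    my ($handle, $cb) = @_;

    # The returned closure works like a plain read callback: it returns
    # true once a complete message was consumed, false to wait for more.
    sub {
        my $rbuf = \$_[0]{rbuf};

        return unless length $$rbuf;
        my $len = ord $$rbuf;                       # value of the first byte
        return unless length ($$rbuf) >= 1 + $len;  # payload incomplete

        substr $$rbuf, 0, 1, "";                    # drop the length byte
        $cb->($_[0], substr ($$rbuf, 0, $len, "")); # handle goes first
        1
    }
};

# register the type globally, but only if AnyEvent::Handle is installed
AnyEvent::Handle::register_read_type (lenprefix => $lenprefix)
    if eval { require AnyEvent::Handle; 1 };

# exercise the closure with a stand-in for the handle (assumption: only
# the rbuf member is needed by this particular type)
my $fake = { rbuf => "\x05hel" };
my @got;
my $reader = $lenprefix->($fake, sub { push @got, $_[1] });

my $done1 = $reader->($fake);   # payload incomplete, nothing delivered
$fake->{rbuf} .= "lo!";
my $done2 = $reader->($fake);   # delivers the payload, keeps the rest
```

With a real handle the type would be used as C<< $handle->push_read (lenprefix => sub { my ($handle, $data) = @_; ... }) >>.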
1033 root 1.10 =item $handle->stop_read
1034    
1035     =item $handle->start_read
1036    
1037 root 1.18 In rare cases you actually do not want to read anything from the
1038 root 1.10 socket. In this case you can call C<stop_read>. Neither C<on_read> nor
1039 root 1.22 any queued callbacks will be executed then. To start reading again, call
1040 root 1.10 C<start_read>.
1041    
1042     =cut
1043    
1044     sub stop_read {
1045     my ($self) = @_;
1046 elmex 1.1
1047 root 1.38 delete $self->{_rw};
1048 root 1.8 }
1049 elmex 1.1
1050 root 1.10 sub start_read {
1051     my ($self) = @_;
1052    
1053 root 1.38 unless ($self->{_rw} || $self->{_eof}) {
1054 root 1.10 Scalar::Util::weaken $self;
1055    
1056 root 1.38 $self->{_rw} = AnyEvent->io (fh => $self->{fh}, poll => "r", cb => sub {
1057 root 1.17 my $rbuf = $self->{filter_r} ? \my $buf : \$self->{rbuf};
1058     my $len = sysread $self->{fh}, $$rbuf, $self->{read_size} || 8192, length $$rbuf;
1059 root 1.10
1060     if ($len > 0) {
1061 root 1.43 $self->{_activity} = time;
1062    
1063 root 1.17 $self->{filter_r}
1064 root 1.18 ? $self->{filter_r}->($self, $rbuf)
1065 root 1.17 : $self->_drain_rbuf;
1066 root 1.10
1067     } elsif (defined $len) {
1068 root 1.38 delete $self->{_rw};
1069 root 1.43 delete $self->{_ww};
1070     delete $self->{_tw};
1071 root 1.38 $self->{_eof} = 1;
1072 root 1.17 $self->_drain_rbuf;
1073 root 1.10
1074 root 1.42 } elsif ($! != EAGAIN && $! != EINTR && $! != WSAEWOULDBLOCK) {
1075 root 1.10 return $self->error;
1076     }
1077     });
1078     }
1079 elmex 1.1 }
1080    
1081 root 1.19 sub _dotls {
1082     my ($self) = @_;
1083    
1084 root 1.38 if (length $self->{_tls_wbuf}) {
1085     while ((my $len = Net::SSLeay::write ($self->{tls}, $self->{_tls_wbuf})) > 0) {
1086     substr $self->{_tls_wbuf}, 0, $len, "";
1087 root 1.22 }
1088 root 1.19 }
1089    
1090 root 1.38 if (defined (my $buf = Net::SSLeay::BIO_read ($self->{_wbio}))) {
1091 root 1.19 $self->{wbuf} .= $buf;
1092     $self->_drain_wbuf;
1093     }
1094    
1095 root 1.23 while (defined (my $buf = Net::SSLeay::read ($self->{tls}))) {
1096     $self->{rbuf} .= $buf;
1097     $self->_drain_rbuf;
1098     }
1099    
1100 root 1.24 my $err = Net::SSLeay::get_error ($self->{tls}, -1);
1101    
1102     if ($err != Net::SSLeay::ERROR_WANT_READ ()) {
1103 root 1.23 if ($err == Net::SSLeay::ERROR_SYSCALL ()) {
1104     $self->error;
1105     } elsif ($err == Net::SSLeay::ERROR_SSL ()) {
1106     $! = &Errno::EIO;
1107     $self->error;
1108 root 1.19 }
1109 root 1.23
1110     # all others are fine for our purposes
1111 root 1.19 }
1112     }
1113    
1114 root 1.25 =item $handle->starttls ($tls[, $tls_ctx])
1115    
1116     Instead of starting TLS negotiation immediately when the AnyEvent::Handle
1117     object is created, you can also do that at a later time by calling
1118     C<starttls>.
1119    
1120     The first argument is the same as the C<tls> constructor argument (either
1121     C<"connect">, C<"accept"> or an existing Net::SSLeay object).
1122    
1123     The second argument is the optional C<Net::SSLeay::CTX> object that is
1124     used when AnyEvent::Handle has to create its own TLS connection object.
1125    
1126 root 1.38 The TLS connection object will end up in C<< $handle->{tls} >> after this
1127     call and can be used or changed to your liking. Note that the handshake
1128     might have already started when this function returns.
1129    
1130 root 1.25 =cut
1131    
1132 root 1.19 # TODO: maybe document...
1133     sub starttls {
1134     my ($self, $ssl, $ctx) = @_;
1135    
1136 root 1.25 $self->stoptls;
1137    
1138 root 1.19 if ($ssl eq "accept") {
1139     $ssl = Net::SSLeay::new ($ctx || TLS_CTX ());
1140     Net::SSLeay::set_accept_state ($ssl);
1141     } elsif ($ssl eq "connect") {
1142     $ssl = Net::SSLeay::new ($ctx || TLS_CTX ());
1143     Net::SSLeay::set_connect_state ($ssl);
1144     }
1145    
1146     $self->{tls} = $ssl;
1147    
1148 root 1.21 # basically, this is deep magic (because SSL_read should have the same issues)
1149     # but the openssl maintainers basically said: "trust us, it just works".
1150     # (unfortunately, we have to hardcode constants because the abysmally misdesigned
1151     # and mismaintained ssleay-module doesn't even offer them).
1152 root 1.27 # http://www.mail-archive.com/openssl-dev@openssl.org/msg22420.html
1153 root 1.21 Net::SSLeay::CTX_set_mode ($self->{tls},
1154 root 1.34 (eval { local $SIG{__DIE__}; Net::SSLeay::MODE_ENABLE_PARTIAL_WRITE () } || 1)
1155     | (eval { local $SIG{__DIE__}; Net::SSLeay::MODE_ACCEPT_MOVING_WRITE_BUFFER () } || 2));
1156 root 1.21
1157 root 1.38 $self->{_rbio} = Net::SSLeay::BIO_new (Net::SSLeay::BIO_s_mem ());
1158     $self->{_wbio} = Net::SSLeay::BIO_new (Net::SSLeay::BIO_s_mem ());
1159 root 1.19
1160 root 1.38 Net::SSLeay::set_bio ($ssl, $self->{_rbio}, $self->{_wbio});
1161 root 1.19
1162     $self->{filter_w} = sub {
1163 root 1.38 $_[0]{_tls_wbuf} .= ${$_[1]};
1164 root 1.19 &_dotls;
1165     };
1166     $self->{filter_r} = sub {
1167 root 1.38 Net::SSLeay::BIO_write ($_[0]{_rbio}, ${$_[1]});
1168 root 1.19 &_dotls;
1169     };
1170     }
1171    
1172 root 1.25 =item $handle->stoptls
1173    
1174     Destroys the SSL connection, if any. Partial read or write data will be
1175     lost.
1176    
1177     =cut
1178    
1179     sub stoptls {
1180     my ($self) = @_;
1181    
1182     Net::SSLeay::free (delete $self->{tls}) if $self->{tls};
1183 root 1.38
1184     delete $self->{_rbio};
1185     delete $self->{_wbio};
1186     delete $self->{_tls_wbuf};
1187 root 1.25 delete $self->{filter_r};
1188     delete $self->{filter_w};
1189     }
1190    
1191 root 1.19 sub DESTROY {
1192     my $self = shift;
1193    
1194 root 1.25 $self->stoptls;
1195 root 1.19 }
1196    
1197     =item AnyEvent::Handle::TLS_CTX
1198    
1199     This function creates and returns the Net::SSLeay::CTX object used by
1200     default for TLS mode.
1201    
1202     The context is created like this:
1203    
1204     Net::SSLeay::load_error_strings;
1205     Net::SSLeay::SSLeay_add_ssl_algorithms;
1206     Net::SSLeay::randomize;
1207    
1208     my $CTX = Net::SSLeay::CTX_new;
1209    
1210     Net::SSLeay::CTX_set_options $CTX, Net::SSLeay::OP_ALL;
1211    
1212     =cut
1213    
1214     our $TLS_CTX;
1215    
1216     sub TLS_CTX() {
1217     $TLS_CTX || do {
1218     require Net::SSLeay;
1219    
1220     Net::SSLeay::load_error_strings ();
1221     Net::SSLeay::SSLeay_add_ssl_algorithms ();
1222     Net::SSLeay::randomize ();
1223    
1224     $TLS_CTX = Net::SSLeay::CTX_new ();
1225    
1226     Net::SSLeay::CTX_set_options ($TLS_CTX, Net::SSLeay::OP_ALL ());
1227    
1228     $TLS_CTX
1229     }
1230     }
1231    
1232 elmex 1.1 =back
1233    
1234 root 1.38 =head1 SUBCLASSING AnyEvent::Handle
1235    
1236     In many cases, you might want to subclass AnyEvent::Handle.
1237    
1238     To make this easier, a given version of AnyEvent::Handle uses these
1239     conventions:
1240    
1241     =over 4
1242    
1243     =item * all constructor arguments become object members.
1244    
1245     At least initially, when you pass a C<tls>-argument to the constructor it
1246     will end up in C<< $handle->{tls} >>. Those members might be changed or
1247     mutated later on (for example C<tls> will hold the TLS connection object).
1248    
1249     =item * other object member names are prefixed with an C<_>.
1250    
1251     All object members not explicitly documented (internal use) are prefixed
1252     with an underscore character, so the remaining non-C<_>-namespace is free
1253     for use for subclasses.
1254    
1255     =item * all members not documented here and not prefixed with an underscore
1256     are free to use in subclasses.
1257    
1258     Of course, new versions of AnyEvent::Handle may introduce more "public"
1259     member variables, but that's just life; at least it is documented.
1260    
1261     =back
1262    
1263 elmex 1.1 =head1 AUTHOR
1264    
1265 root 1.8 Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>.
1266 elmex 1.1
1267     =cut
1268    
1269     1; # End of AnyEvent::Handle