/cvs/AnyEvent/lib/AnyEvent/Handle.pm
Revision: 1.142
Committed: Mon Jul 6 20:24:47 2009 UTC (14 years, 11 months ago) by root
Branch: MAIN
Changes since 1.141: +74 -17 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 elmex 1.1 package AnyEvent::Handle;
2    
3 elmex 1.6 no warnings;
4 root 1.79 use strict qw(subs vars);
5 elmex 1.1
6 root 1.8 use AnyEvent ();
7 root 1.42 use AnyEvent::Util qw(WSAEWOULDBLOCK);
8 root 1.8 use Scalar::Util ();
9     use Carp ();
10     use Fcntl ();
11 root 1.43 use Errno qw(EAGAIN EINTR);
12 elmex 1.1
13     =head1 NAME
14    
15 root 1.22 AnyEvent::Handle - non-blocking I/O on file handles via AnyEvent
16 elmex 1.1
17     =cut
18    
19 root 1.136 our $VERSION = 4.452;
20 elmex 1.1
21     =head1 SYNOPSIS
22    
23     use AnyEvent;
24     use AnyEvent::Handle;
25    
26     my $cv = AnyEvent->condvar;
27    
28 root 1.31 my $handle =
29 elmex 1.2 AnyEvent::Handle->new (
30     fh => \*STDIN,
31     on_eof => sub {
32 root 1.107 $cv->send;
33 elmex 1.2 },
34     );
35    
36 root 1.31 # send some request line
37     $handle->push_write ("getinfo\015\012");
38    
39     # read the response line
40     $handle->push_read (line => sub {
41     my ($handle, $line) = @_;
42     warn "read line <$line>\n";
43     $cv->send;
44     });
45    
46     $cv->recv;
47 elmex 1.1
48     =head1 DESCRIPTION
49    
50 root 1.8 This module is a helper module to make it easier to do event-based I/O on
51 elmex 1.13 filehandles. For utility functions for doing non-blocking connects and accepts
52     on sockets see L<AnyEvent::Util>.
53 root 1.8
54 root 1.84 The L<AnyEvent::Intro> tutorial contains some well-documented
55     AnyEvent::Handle examples.
56    
57 root 1.8 In the following, when the documentation refers to "bytes" then this
58     means characters. As sysread and syswrite are used for all I/O, their
59     treatment of characters applies to this module as well.
60 elmex 1.1
61 root 1.8 All callbacks will be invoked with the handle object as their first
62     argument.
63 elmex 1.1
64     =head1 METHODS
65    
66     =over 4
67    
68 root 1.131 =item $handle = B<new> AnyEvent::Handle fh => $filehandle, key => value...
69 elmex 1.1
70 root 1.131 The constructor supports these arguments (all as C<< key => value >> pairs).
71 elmex 1.1
72     =over 4
73    
74 root 1.8 =item fh => $filehandle [MANDATORY]
75 elmex 1.1
76     The filehandle this L<AnyEvent::Handle> object will operate on.
77    
78 root 1.83 NOTE: The filehandle will be set to non-blocking mode (using
79     C<AnyEvent::Util::fh_nonblocking>) by the constructor and needs to stay in
80     that mode.
81 root 1.8
82 root 1.40 =item on_eof => $cb->($handle)
83 root 1.10
84 root 1.74 Set the callback to be called when an end-of-file condition is detected,
85 root 1.52 i.e. in the case of a socket, when the other side has closed the
86     connection cleanly.
87 root 1.8
88 root 1.82 For sockets, this just means that the other side has stopped sending data;
89 root 1.101 you can still try to write data, and, in fact, one can return from the EOF
90 root 1.82 callback and continue writing data, as only the read part has been shut
91     down.
92    
93 root 1.101 While not mandatory, it is I<highly> recommended to set an EOF callback,
94 root 1.16 otherwise you might end up with a closed socket while you are still
95     waiting for data.
96    
97 root 1.80 If an EOF condition has been detected but no C<on_eof> callback has been
98     set, then a fatal error will be raised with C<$!> set to C<0>.
99    
100 root 1.133 =item on_error => $cb->($handle, $fatal, $message)
101 root 1.10
102 root 1.52 This is the error callback, which is called when, well, some error
103     occurred, such as not being able to resolve the hostname, failure to
104     connect or a read error.
105    
106     Some errors are fatal (which is indicated by C<$fatal> being true). On
107 root 1.82 fatal errors the handle object will be shut down and will not be usable
108 root 1.88 (but you are free to look at the current C<< ->rbuf >>). Examples of fatal
109 root 1.82 errors are an EOF condition with active (but unsatisfiable) read watchers
110     (C<EPIPE>) or I/O errors.
111    
112 root 1.133 AnyEvent::Handle tries to find an appropriate error code for you to check
113     against, but in some cases (TLS errors), this does not work well. It is
114     recommended to always output the C<$message> argument in human-readable
115     error messages (it's usually the same as C<"$!">).
116    
117 root 1.82 Non-fatal errors can be retried by simply returning, but it is recommended
118     to simply ignore this parameter and instead abandon the handle object
119     when this callback is invoked. Examples of non-fatal errors are timeouts
120     (C<ETIMEDOUT>) or badly-formatted data (C<EBADMSG>).
121 root 1.8
122 root 1.10 On callback entrance, the value of C<$!> contains the operating system
123 root 1.133 error code (or C<ENOSPC>, C<EPIPE>, C<ETIMEDOUT>, C<EBADMSG> or
124     C<EPROTO>).
125 root 1.8
126 root 1.10 While not mandatory, it is I<highly> recommended to set this callback, as
127     you will not be notified of errors otherwise. The default simply calls
128 root 1.52 C<croak>.
129 root 1.8
130 root 1.40 =item on_read => $cb->($handle)
131 root 1.8
132     This sets the default read callback, which is called when data arrives
133 root 1.61 and no read request is in the queue (unlike read queue callbacks, this
134     callback will only be called when at least one octet of data is in the
135     read buffer).
136 root 1.8
137     To access (and remove data from) the read buffer, use the C<< ->rbuf >>
138 root 1.139 method or access the C<< $handle->{rbuf} >> member directly. Note that you
139 root 1.117 must not enlarge or modify the read buffer, you can only remove data at
140     the beginning from it.
141 root 1.8
142     When an EOF condition is detected then AnyEvent::Handle will first try to
143     feed all the remaining data to the queued callbacks and C<on_read> before
144     calling the C<on_eof> callback. If no progress can be made, then a fatal
145     error will be raised (with C<$!> set to C<EPIPE>).
146 elmex 1.1
147 root 1.40 =item on_drain => $cb->($handle)
148 elmex 1.1
149 root 1.8 This sets the callback that is called when the write buffer becomes empty
150     (or when the callback is set and the buffer is empty already).
151 elmex 1.1
152 root 1.8 To append to the write buffer, use the C<< ->push_write >> method.
153 elmex 1.2
154 root 1.69 This callback is useful when you don't want to put all of your write data
155     into the queue at once, for example, when you want to write the contents
156     of some file to the socket you might not want to read the whole file into
157     memory and push it into the queue, but instead only read more data from
158     the file when the write queue becomes empty.
159    
160 root 1.43 =item timeout => $fractional_seconds
161    
162     If non-zero, then this enables an "inactivity" timeout: whenever this many
163     seconds pass without a successful read or write on the underlying file
164     handle, the C<on_timeout> callback will be invoked (and if that one is
165 root 1.88 missing, a non-fatal C<ETIMEDOUT> error will be raised).
166 root 1.43
167     Note that timeout processing is also active when you currently do not have
168     any outstanding read or write requests: If you plan to keep the connection
169     idle then you should disable the timeout temporarily or ignore the timeout
170 root 1.88 in the C<on_timeout> callback, in which case AnyEvent::Handle will simply
171     restart the timeout.
172 root 1.43
173     Zero (the default) disables this timeout.
174    
175     =item on_timeout => $cb->($handle)
176    
177     Called whenever the inactivity timeout passes. If you return from this
178     callback, then the timeout will be reset as if some activity had happened,
179     so this condition is not fatal in any way.
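
The timeout bookkeeping can be sketched in plain Perl. This is only an illustration of the arithmetic the C<_timeout> routine in this file performs (last activity plus timeout minus the current time); the helper name C<timeout_remaining> is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::Handle): seconds until the
# inactivity timeout would fire. A result <= 0 means it is already due.
sub timeout_remaining {
    my ($last_activity, $timeout, $now) = @_;
    return $last_activity + $timeout - $now;
}

printf "%.1f\n", timeout_remaining (100, 30, 110);   # 20.0
printf "%.1f\n", timeout_remaining (100, 30, 130);   # 0.0
```

When the result is zero or negative, the module resets the activity timestamp and invokes C<on_timeout> (or raises C<ETIMEDOUT> if no such callback is set).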
180    
181 root 1.8 =item rbuf_max => <bytes>
182 elmex 1.2
183 root 1.8 If defined, then a fatal error will be raised (with C<$!> set to C<ENOSPC>)
184     when the read buffer ever (strictly) exceeds this size. This is useful to
185 root 1.88 avoid some forms of denial-of-service attacks.
186 elmex 1.2
187 root 1.8 For example, a server accepting connections from untrusted sources should
188     be configured to accept only so-and-so much data that it cannot act on
189     (for example, when expecting a line, an attacker could send an unlimited
190     amount of data without a callback ever being called as long as the line
191     isn't finished).
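
As a rough sketch of the "strictly exceeds" rule, the check below mirrors the comparison C<_drain_rbuf> makes in this file. The helper name C<rbuf_overflow> is an assumption made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::Handle): true only when a limit
# is set and the buffer is strictly larger than it.
sub rbuf_overflow {
    my ($rbuf_max, $rbuf) = @_;
    return defined $rbuf_max && $rbuf_max < length ($rbuf);
}

printf "%s\n", rbuf_overflow (8, "12345678")    ? "overflow" : "ok";   # ok (exactly at the limit)
printf "%s\n", rbuf_overflow (8, "123456789")   ? "overflow" : "ok";   # overflow
printf "%s\n", rbuf_overflow (undef, "x" x 100) ? "overflow" : "ok";   # ok (no limit set)
```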
192 elmex 1.2
193 root 1.70 =item autocork => <boolean>
194    
195     When disabled (the default), then C<push_write> will try to immediately
196 root 1.88 write the data to the handle, if possible. This avoids having to register
197     a write watcher and wait for the next event loop iteration, but can
198     be inefficient if you write multiple small chunks (on the wire, this
199     disadvantage is usually avoided by your kernel's nagle algorithm, see
200     C<no_delay>, but this option can save costly syscalls).
201 root 1.70
202     When enabled, then writes will always be queued till the next event loop
203     iteration. This is efficient when you do many small writes per iteration,
204 root 1.88 but less efficient when you do a single write only per iteration (or when
205     the write buffer often is full). It also increases write latency.
206 root 1.70
207     =item no_delay => <boolean>
208    
209     When doing small writes on sockets, your operating system kernel might
210     wait a bit for more data before actually sending it out. This is called
211     the Nagle algorithm, and usually it is beneficial.
212    
213 root 1.88 In some situations you want as low a delay as possible, which can be
214     accomplished by setting this option to a true value.
215 root 1.70
216 root 1.88 The default is your operating system's default behaviour (most likely
217     enabled), this option explicitly enables or disables it, if possible.
218 root 1.70
219 root 1.8 =item read_size => <bytes>
220 elmex 1.2
221 root 1.88 The default read block size (the amount of bytes this module will
222     try to read during each loop iteration, which affects memory
223     requirements). Default: C<8192>.
224 root 1.8
225     =item low_water_mark => <bytes>
226    
227     Sets the amount of bytes (default: C<0>) that make up an "empty" write
228     buffer: If the write buffer reaches this size or gets even smaller it is
229     considered empty.
230 elmex 1.2
231 root 1.88 Sometimes it can be beneficial (for performance reasons) to add data to
232     the write buffer before it is fully drained, but this is a rare case, as
233     the operating system kernel usually buffers data as well, so the default
234     is good in almost all cases.
235    
236 root 1.62 =item linger => <seconds>
237    
238     If non-zero (default: C<3600>), then the destructor of the
239 root 1.88 AnyEvent::Handle object will check whether there is still outstanding
240     write data and will install a watcher that will write this data to the
241     socket. No errors will be reported (this mostly matches how the operating
242     system treats outstanding data at socket close time).
243 root 1.62
244 root 1.88 This will not work for partial TLS data that could not be encoded
245 root 1.93 yet. This data will be lost. Calling the C<stoptls> method in time might
246     help.
247 root 1.62
248 root 1.133 =item peername => $string
249    
250 root 1.134 A string used to identify the remote site - usually the DNS hostname
251     (I<not> IDN!) used to create the connection, rarely the IP address.
252 root 1.131
253 root 1.133 Apart from being useful in error messages, this string is also used in TLS
254 root 1.138 peername verification (see C<verify_peername> in L<AnyEvent::TLS>).
255 root 1.131
256 root 1.19 =item tls => "accept" | "connect" | Net::SSLeay::SSL object
257    
258 root 1.85 When this parameter is given, it enables TLS (SSL) mode, that means
259 root 1.88 AnyEvent will start a TLS handshake as soon as the connection has been
260     established and will transparently encrypt/decrypt data afterwards.
261 root 1.19
262 root 1.133 All TLS protocol errors will be signalled as C<EPROTO>, with an
263     appropriate error message.
264    
265 root 1.26 TLS mode requires Net::SSLeay to be installed (it will be loaded
266 root 1.88 automatically when you try to create a TLS handle): this module doesn't
267     have a dependency on that module, so if your module requires it, you have
268     to add the dependency yourself.
269 root 1.26
270 root 1.85 Unlike TCP, TLS has a server and client side: for the TLS server side, use
271     C<accept>, and for the TLS client side of a connection, use C<connect>
272     mode.
273 root 1.19
274     You can also provide your own TLS connection object, but you have
275     to make sure that you call either C<Net::SSLeay::set_connect_state>
276     or C<Net::SSLeay::set_accept_state> on it before you pass it to
277 root 1.131 AnyEvent::Handle. Also, this module will take ownership of this connection
278     object.
279    
280     At some future point, AnyEvent::Handle might switch to another TLS
281     implementation, then the option to use your own session object will go
282     away.
283 root 1.19
284 root 1.109 B<IMPORTANT:> since Net::SSLeay "objects" are really only integers,
285     passing in the wrong integer will lead to a certain crash. This most often
286     happens when one uses a stylish C<< tls => 1 >> and is surprised about the
287     segmentation fault.
288    
289 root 1.88 See the C<< ->starttls >> method for when you need to start TLS negotiation later.
290 root 1.26
291 root 1.131 =item tls_ctx => $anyevent_tls
292 root 1.19
293 root 1.131 Use the given C<AnyEvent::TLS> object to create the new TLS connection
294 root 1.19 (unless a connection object was specified directly). If this parameter is
295     missing, then AnyEvent::Handle will use C<AnyEvent::Handle::TLS_CTX>.
296    
297 root 1.131 Instead of an object, you can also specify a hash reference with C<< key
298     => value >> pairs. Those will be passed to L<AnyEvent::TLS> to create a
299     new TLS context object.
300    
301 root 1.142 =item on_starttls => $cb->($handle, $success)
302    
303     This callback will be invoked when the TLS/SSL handshake has finished. If
304     C<$success> is true, then the TLS handshake succeeded, otherwise it failed
305     (C<on_stoptls> will not be called in this case).
306    
307     The session in C<< $handle->{tls} >> can still be examined in this
308     callback, even when the handshake was not successful.
309    
310     =item on_stoptls => $cb->($handle)
311    
312     When a SSLv3/TLS shutdown/close notify/EOF is detected and this callback is
313     set, then it will be invoked after freeing the TLS session. If it is not,
314     then a TLS shutdown condition will be treated like a normal EOF condition
315     on the handle.
316    
317     The session in C<< $handle->{tls} >> can still be examined in this
318     callback.
319    
320     This callback will only be called on TLS shutdowns, not when the
321     underlying handle signals EOF.
322    
323 root 1.40 =item json => JSON or JSON::XS object
324    
325     This is the json coder object used by the C<json> read and write types.
326    
327 root 1.41 If you don't supply it, then AnyEvent::Handle will create and use a
328 root 1.86 suitable one (on demand), which will write and expect UTF-8 encoded JSON
329     texts.
330 root 1.40
331     Note that you are responsible to depend on the JSON module if you want to
332     use this functionality, as AnyEvent does not have a dependency itself.
333    
334 elmex 1.1 =back
335    
336     =cut
337    
338     sub new {
339 root 1.8 my $class = shift;
340     my $self = bless { @_ }, $class;
341    
342     $self->{fh} or Carp::croak "mandatory argument fh is missing";
343    
344     AnyEvent::Util::fh_nonblocking $self->{fh}, 1;
345 elmex 1.1
346 root 1.131 $self->{_activity} = AnyEvent->now;
347     $self->_timeout;
348    
349     $self->no_delay (delete $self->{no_delay}) if exists $self->{no_delay};
350    
351 root 1.94 $self->starttls (delete $self->{tls}, delete $self->{tls_ctx})
352     if $self->{tls};
353 root 1.19
354 root 1.70 $self->on_drain (delete $self->{on_drain}) if exists $self->{on_drain};
355 root 1.10
356 root 1.66 $self->start_read
357 root 1.67 if $self->{on_read};
358 root 1.66
359 root 1.131 $self->{fh} && $self
360 root 1.8 }
361 elmex 1.2
362 root 1.8 sub _shutdown {
363     my ($self) = @_;
364 elmex 1.2
365 elmex 1.132 delete @$self{qw(_tw _rw _ww fh wbuf on_read _queue)};
366 root 1.131 $self->{_eof} = 1; # tell starttls et al. to stop trying
367 root 1.52
368 root 1.92 &_freetls;
369 root 1.8 }
370    
371 root 1.52 sub _error {
372 root 1.133 my ($self, $errno, $fatal, $message) = @_;
373 root 1.8
374 root 1.52 $self->_shutdown
375     if $fatal;
376 elmex 1.1
377 root 1.52 $! = $errno;
378 root 1.133 $message ||= "$!";
379 root 1.37
380 root 1.52 if ($self->{on_error}) {
381 root 1.133 $self->{on_error}($self, $fatal, $message);
382 root 1.100 } elsif ($self->{fh}) {
383 root 1.133 Carp::croak "AnyEvent::Handle uncaught error: $message";
384 root 1.52 }
385 elmex 1.1 }
386    
387 root 1.8 =item $fh = $handle->fh
388 elmex 1.1
389 root 1.88 This method returns the file handle used to create the L<AnyEvent::Handle> object.
390 elmex 1.1
391     =cut
392    
393 root 1.38 sub fh { $_[0]{fh} }
394 elmex 1.1
395 root 1.8 =item $handle->on_error ($cb)
396 elmex 1.1
397 root 1.8 Replace the current C<on_error> callback (see the C<on_error> constructor argument).
398 elmex 1.1
399 root 1.8 =cut
400    
401     sub on_error {
402     $_[0]{on_error} = $_[1];
403     }
404    
405     =item $handle->on_eof ($cb)
406    
407     Replace the current C<on_eof> callback (see the C<on_eof> constructor argument).
408 elmex 1.1
409     =cut
410    
411 root 1.8 sub on_eof {
412     $_[0]{on_eof} = $_[1];
413     }
414    
415 root 1.43 =item $handle->on_timeout ($cb)
416    
417 root 1.88 Replaces the current C<on_timeout> callback, or disables the callback (but
418     not the timeout) if C<$cb> = C<undef>. See the C<timeout> constructor
419     argument and method.
420 root 1.43
421     =cut
422    
423     sub on_timeout {
424     $_[0]{on_timeout} = $_[1];
425     }
426    
427 root 1.70 =item $handle->autocork ($boolean)
428    
429     Enables or disables the current autocork behaviour (see C<autocork>
430 root 1.105 constructor argument). Changes will only take effect on the next write.
431 root 1.70
432     =cut
433    
434 root 1.105 sub autocork {
435     $_[0]{autocork} = $_[1];
436     }
437    
438 root 1.70 =item $handle->no_delay ($boolean)
439    
440     Enables or disables the C<no_delay> setting (see constructor argument of
441     the same name for details).
442    
443     =cut
444    
445     sub no_delay {
446     $_[0]{no_delay} = $_[1];
447    
448     eval {
449     local $SIG{__DIE__};
450     setsockopt $_[0]{fh}, &Socket::IPPROTO_TCP, &Socket::TCP_NODELAY, int $_[1];
451     };
452     }
453    
454 root 1.142 =item $handle->on_starttls ($cb)
455    
456     Replace the current C<on_starttls> callback (see the C<on_starttls> constructor argument).
457    
458     =cut
459    
460     sub on_starttls {
461     $_[0]{on_starttls} = $_[1];
462     }
463    
464     =item $handle->on_stoptls ($cb)
465    
466     Replace the current C<on_stoptls> callback (see the C<on_stoptls> constructor argument).
467    
468     =cut
469    
470     sub on_stoptls {
471     $_[0]{on_stoptls} = $_[1];
472     }
473    
474 root 1.43 #############################################################################
475    
476     =item $handle->timeout ($seconds)
477    
478     Configures (or disables) the inactivity timeout.
479    
480     =cut
481    
482     sub timeout {
483     my ($self, $timeout) = @_;
484    
485     $self->{timeout} = $timeout;
486     $self->_timeout;
487     }
488    
489     # reset the timeout watcher, as necessary
490     # also check for time-outs
491     sub _timeout {
492     my ($self) = @_;
493    
494     if ($self->{timeout}) {
495 root 1.44 my $NOW = AnyEvent->now;
496 root 1.43
497     # when would the timeout trigger?
498     my $after = $self->{_activity} + $self->{timeout} - $NOW;
499    
500     # now or in the past already?
501     if ($after <= 0) {
502     $self->{_activity} = $NOW;
503    
504     if ($self->{on_timeout}) {
505 root 1.48 $self->{on_timeout}($self);
506 root 1.43 } else {
507 root 1.52 $self->_error (&Errno::ETIMEDOUT);
508 root 1.43 }
509    
510 root 1.56 # callback could have changed timeout value, optimise
511 root 1.43 return unless $self->{timeout};
512    
513     # calculate new after
514     $after = $self->{timeout};
515     }
516    
517     Scalar::Util::weaken $self;
518 root 1.56 return unless $self; # ->error could have destroyed $self
519 root 1.43
520     $self->{_tw} ||= AnyEvent->timer (after => $after, cb => sub {
521     delete $self->{_tw};
522     $self->_timeout;
523     });
524     } else {
525     delete $self->{_tw};
526     }
527     }
528    
529 root 1.9 #############################################################################
530    
531     =back
532    
533     =head2 WRITE QUEUE
534    
535     AnyEvent::Handle manages two queues per handle, one for writing and one
536     for reading.
537    
538     The write queue is very simple: you can add data to its end, and
539     AnyEvent::Handle will automatically try to get rid of it for you.
540    
541 elmex 1.20 When data could be written and the write buffer is shorter than the low
542 root 1.9 water mark, the C<on_drain> callback will be invoked.
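
The drain condition can be sketched as a small predicate. It follows the comparison used by C<on_drain> and C<_drain_wbuf> in this file, which also counts pending TLS data and treats a buffer exactly at the low water mark as drained. The helper name C<write_buffer_drained> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::Handle): the write side counts
# as drained when the plain and TLS write buffers together are no longer
# than the low water mark (default 0).
sub write_buffer_drained {
    my ($low_water_mark, $wbuf, $tls_wbuf) = @_;
    my $pending = length ($wbuf // "") + length ($tls_wbuf // "");
    return ($low_water_mark // 0) >= $pending;
}

printf "%s\n", write_buffer_drained (0, "", undef)   ? "drained" : "pending";   # drained
printf "%s\n", write_buffer_drained (0, "abc", "")   ? "drained" : "pending";   # pending
printf "%s\n", write_buffer_drained (4, "abc", "")   ? "drained" : "pending";   # drained
```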
543    
544     =over 4
545    
546 root 1.8 =item $handle->on_drain ($cb)
547    
548     Sets the C<on_drain> callback or clears it (see the description of
549     C<on_drain> in the constructor).
550    
551     =cut
552    
553     sub on_drain {
554 elmex 1.1 my ($self, $cb) = @_;
555    
556 root 1.8 $self->{on_drain} = $cb;
557    
558     $cb->($self)
559 root 1.93 if $cb && $self->{low_water_mark} >= (length $self->{wbuf}) + (length $self->{_tls_wbuf});
560 root 1.8 }
561    
562     =item $handle->push_write ($data)
563    
564     Queues the given scalar to be written. You can push as much data as you
565     want (only limited by the available memory), as C<AnyEvent::Handle>
566     buffers it independently of the kernel.
567    
568     =cut
569    
570 root 1.17 sub _drain_wbuf {
571     my ($self) = @_;
572 root 1.8
573 root 1.38 if (!$self->{_ww} && length $self->{wbuf}) {
574 root 1.35
575 root 1.8 Scalar::Util::weaken $self;
576 root 1.35
577 root 1.8 my $cb = sub {
578     my $len = syswrite $self->{fh}, $self->{wbuf};
579    
580 root 1.29 if (defined $len) {
581 root 1.8 substr $self->{wbuf}, 0, $len, "";
582    
583 root 1.44 $self->{_activity} = AnyEvent->now;
584 root 1.43
585 root 1.8 $self->{on_drain}($self)
586 root 1.93 if $self->{low_water_mark} >= (length $self->{wbuf}) + (length $self->{_tls_wbuf})
587 root 1.8 && $self->{on_drain};
588    
589 root 1.38 delete $self->{_ww} unless length $self->{wbuf};
590 root 1.42 } elsif ($! != EAGAIN && $! != EINTR && $! != WSAEWOULDBLOCK) {
591 root 1.52 $self->_error ($!, 1);
592 elmex 1.1 }
593 root 1.8 };
594    
595 root 1.35 # try to write data immediately
596 root 1.70 $cb->() unless $self->{autocork};
597 root 1.8
598 root 1.35 # if still data left in wbuf, we need to poll
599 root 1.38 $self->{_ww} = AnyEvent->io (fh => $self->{fh}, poll => "w", cb => $cb)
600 root 1.35 if length $self->{wbuf};
601 root 1.8 };
602     }
603    
604 root 1.30 our %WH;
605    
606     sub register_write_type($$) {
607     $WH{$_[0]} = $_[1];
608     }
609    
610 root 1.17 sub push_write {
611     my $self = shift;
612    
613 root 1.29 if (@_ > 1) {
614     my $type = shift;
615    
616     @_ = ($WH{$type} or Carp::croak "unsupported type passed to AnyEvent::Handle::push_write")
617     ->($self, @_);
618     }
619    
620 root 1.93 if ($self->{tls}) {
621     $self->{_tls_wbuf} .= $_[0];
622 root 1.97
623 root 1.93 &_dotls ($self);
624 root 1.17 } else {
625     $self->{wbuf} .= $_[0];
626     $self->_drain_wbuf;
627     }
628     }
629    
630 root 1.29 =item $handle->push_write (type => @args)
631    
632     Instead of formatting your data yourself, you can also let this module do
633     the job by specifying a type and type-specific arguments.
634    
635 root 1.30 Predefined types are (if you have ideas for additional types, feel free to
636     drop by and tell us):
637 root 1.29
638     =over 4
639    
640     =item netstring => $string
641    
642     Formats the given value as netstring
643     (http://cr.yp.to/proto/netstrings.txt, this is not a recommendation to use them).
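
A minimal sketch of the netstring wire format in plain Perl. The encoder uses the same expression as the write type registered below; the decoder C<netstring_decode> is a hypothetical helper added only to show the round trip.

```perl
use strict;
use warnings;

# Encode as "<length>:<data>,".
sub netstring_encode {
    my ($string) = @_;
    return length ($string) . ":$string,";
}

# Hypothetical decoder: removes one complete netstring from the front of the
# buffer passed as first argument and returns its payload, or returns undef
# if the buffer does not hold a complete, well-formed netstring yet.
sub netstring_decode {
    return undef unless $_[0] =~ /^(\d+):/;
    my ($len, $hdr) = ($1, length ($1) + 1);
    return undef if length ($_[0]) < $hdr + $len + 1;
    return undef unless substr ($_[0], $hdr + $len, 1) eq ",";
    my $payload = substr ($_[0], $hdr, $len);
    substr ($_[0], 0, $hdr + $len + 1, "");
    return $payload;
}

my $wire = netstring_encode ("hello");
print "$wire\n";                           # 5:hello,
print netstring_decode ($wire), "\n";      # hello
```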
644    
645     =cut
646    
647     register_write_type netstring => sub {
648     my ($self, $string) = @_;
649    
650 root 1.96 (length $string) . ":$string,"
651 root 1.29 };
652    
653 root 1.61 =item packstring => $format, $data
654    
655     An octet string prefixed with an encoded length. The encoding C<$format>
656     uses the same format as a Perl C<pack> format, but must specify a single
657     integer only (only one of C<cCsSlLqQiInNvVjJw> is allowed, plus an
658     optional C<!>, C<< < >> or C<< > >> modifier).
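
To illustrate what ends up on the wire, here is a small sketch using the C<N> format (32-bit big-endian length). It only uses Perl's built-in C<pack> and C<unpack> and does not involve a handle.

```perl
use strict;
use warnings;

# "N/a*" packs the length of the string as a 32-bit big-endian integer,
# followed by the string itself.
my $wire = pack "N/a*", "hello";
print unpack ("H*", $wire), "\n";   # 0000000568656c6c6f

# The same template reverses the operation.
my ($payload) = unpack "N/a*", $wire;
print "$payload\n";                 # hello
```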
659    
660     =cut
661    
662     register_write_type packstring => sub {
663     my ($self, $format, $string) = @_;
664    
665 root 1.65 pack "$format/a*", $string
666 root 1.61 };
667    
668 root 1.39 =item json => $array_or_hashref
669    
670 root 1.40 Encodes the given hash or array reference into a JSON object. Unless you
671     provide your own JSON object, this means it will be encoded to JSON text
672     in UTF-8.
673    
674     JSON objects (and arrays) are self-delimiting, so you can write JSON at
675     one end of a handle and read them at the other end without using any
676     additional framing.
677    
678 root 1.41 The generated JSON text is guaranteed not to contain any newlines: While
679     this module doesn't need delimiters after or between JSON texts to be
680     able to read them, many other languages depend on that.
681    
682     A simple RPC protocol that interoperates easily with others is to send
683     JSON arrays (or objects, although arrays are usually the better choice as
684     they mimic how function argument passing works) and a newline after each
685     JSON text:
686    
687     $handle->push_write (json => ["method", "arg1", "arg2"]); # whatever
688     $handle->push_write ("\012");
689    
690     An AnyEvent::Handle receiver would simply use the C<json> read type and
691     rely on the fact that the newline will be skipped as leading whitespace:
692    
693     $handle->push_read (json => sub { my $array = $_[1]; ... });
694    
695     Other languages could read single lines terminated by a newline and pass
696     this line into their JSON decoder of choice.
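
The line-oriented receiving side can be sketched without a handle. This example assumes the core module L<JSON::PP> as a stand-in for the JSON module that this file loads; the framing (one JSON text followed by a newline) is the one described above.

```perl
use strict;
use warnings;
use JSON::PP ();   # assumption: stand-in for the JSON module used by this file

my $coder = JSON::PP->new->utf8;

# sender side: one JSON text, then a newline
my $wire = $coder->encode (["method", "arg1", "arg2"]) . "\012";

# receiver side: split into lines and decode each line on its own
for my $line (split /\012/, $wire) {
    my $request = $coder->decode ($line);
    print join (" ", @$request), "\n";   # method arg1 arg2
}
```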
697    
698 root 1.40 =cut
699    
700     register_write_type json => sub {
701     my ($self, $ref) = @_;
702    
703     require JSON;
704    
705     $self->{json} ? $self->{json}->encode ($ref)
706     : JSON::encode_json ($ref)
707     };
708    
709 root 1.63 =item storable => $reference
710    
711     Freezes the given reference using L<Storable> and writes it to the
712     handle. Uses the C<nfreeze> format.
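
A short round-trip sketch of this framing, using the same C<pack "w/a*"> template as the write type registered below (a BER-compressed length followed by the frozen data). The decoding half is an illustration only.

```perl
use strict;
use warnings;
use Storable ();

# sender side: frozen data prefixed with its BER-compressed length
my $wire = pack "w/a*", Storable::nfreeze ({ answer => 42 });

# receiver side: strip the length prefix, then thaw
my ($frozen) = unpack "w/a*", $wire;
my $data = Storable::thaw ($frozen);
print "$data->{answer}\n";   # 42
```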
713    
714     =cut
715    
716     register_write_type storable => sub {
717     my ($self, $ref) = @_;
718    
719     require Storable;
720    
721 root 1.65 pack "w/a*", Storable::nfreeze ($ref)
722 root 1.63 };
723    
724 root 1.53 =back
725    
726 root 1.133 =item $handle->push_shutdown
727    
728     Sometimes you know you want to close the socket after writing your data
729     before it was actually written. One way to do that is to replace your
730 root 1.142 C<on_drain> handler by a callback that shuts down the socket (and set
731     C<low_water_mark> to C<0>). This method is a shorthand for just that, and
732     replaces the C<on_drain> callback with:
733 root 1.133
734     sub { shutdown $_[0]{fh}, 1 } # for push_shutdown
735    
736     This simply shuts down the write side and signals an EOF condition to
737     the peer.
738    
739     You can rely on the normal read queue and C<on_eof> handling
740     afterwards. This is the cleanest way to close a connection.
741    
742     =cut
743    
744     sub push_shutdown {
745 root 1.142 my ($self) = @_;
746    
747     delete $self->{low_water_mark};
748     $self->on_drain (sub { shutdown $_[0]{fh}, 1 });
749 root 1.133 }
750    
751 root 1.40 =item AnyEvent::Handle::register_write_type type => $coderef->($handle, @args)
752 root 1.30
753     This function (not method) lets you add your own types to C<push_write>.
754     Whenever the given C<type> is used, C<push_write> will invoke the code
755     reference with the handle object and the remaining arguments.
756 root 1.29
757 root 1.30 The code reference is supposed to return a single octet string that will
758     be appended to the write buffer.
759 root 1.29
760 root 1.30 Note that this is a function, and all types registered this way will be
761     global, so try to use unique names.
762 root 1.29
763 root 1.30 =cut
764 root 1.29
765 root 1.8 #############################################################################
766    
767 root 1.9 =back
768    
769     =head2 READ QUEUE
770    
771     AnyEvent::Handle manages two queues per handle, one for writing and one
772     for reading.
773    
774     The read queue is more complex than the write queue. It can be used in two
775     ways, the "simple" way, using only C<on_read> and the "complex" way, using
776     a queue.
777    
778     In the simple case, you just install an C<on_read> callback and whenever
779     new data arrives, it will be called. You can then remove some data (if
780 root 1.69 enough is there) from the read buffer (C<< $handle->rbuf >>). Or you can
781     leave the data there if you want to accumulate more (e.g. when only a
782     partial message has been received so far).
783 root 1.9
784     In the more complex case, you want to queue multiple callbacks. In this
785     case, AnyEvent::Handle will call the first queued callback each time new
786 root 1.61 data arrives (also the first time it is queued) and removes it when it has
787     done its job (see C<push_read>, below).
788 root 1.9
789     This way you can, for example, push three line-reads, followed by reading
790     a chunk of data, and AnyEvent::Handle will execute them in order.
791    
792     Example 1: EPP protocol parser. EPP sends 4 byte length info, followed by
793     the specified number of bytes which give an XML datagram.
794    
795     # in the default state, expect some header bytes
796     $handle->on_read (sub {
797     # some data is here, now queue the length-header-read (4 octets)
798 root 1.52 shift->unshift_read (chunk => 4, sub {
799 root 1.9 # header arrived, decode
800     my $len = unpack "N", $_[1];
801    
802     # now read the payload
803 root 1.52 shift->unshift_read (chunk => $len, sub {
804 root 1.9 my $xml = $_[1];
805     # handle xml
806     });
807     });
808     });
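
The same framing can be tried without an event loop. The sketch below follows the simplified description above (a 4-octet length followed by that many octets) and shows how partial data stays in the buffer until the frame is complete. The helper name C<extract_frames> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: removes all complete frames from the front of the
# buffer passed as first argument and returns their payloads.
sub extract_frames {
    my @frames;
    while (length ($_[0]) >= 4) {
        my $len = unpack "N", $_[0];
        last if length ($_[0]) < 4 + $len;   # payload incomplete, wait for more
        push @frames, substr ($_[0], 4, $len);
        substr ($_[0], 0, 4 + $len, "");
    }
    return @frames;
}

my $stream = pack ("N/a*", "<hello/>") . pack ("N/a*", "<bye/>");

# feed the stream in two arbitrary pieces, as a socket might deliver it
my $rbuf = "";
for my $chunk (substr ($stream, 0, 6), substr ($stream, 6)) {
    $rbuf .= $chunk;
    print "frame: $_\n" for extract_frames ($rbuf);
}
```

The first piece produces no output because only part of the first payload has arrived; the second piece yields both frames.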
809    
810 root 1.69 Example 2: Implement a client for a protocol that replies either with "OK"
811     and another line or "ERROR" for the first request that is sent, and 64
812     bytes for the second request. Due to the availability of a queue, we can
813     just pipeline sending both requests and manipulate the queue as necessary
814     in the callbacks.
815    
816     When the first callback is called and sees an "OK" response, it will
817     C<unshift> another line-read. This line-read will be queued I<before> the
818     64-byte chunk callback.
819 root 1.9
820 root 1.69 # request one, returns either "OK + extra line" or "ERROR"
821 root 1.9 $handle->push_write ("request 1\015\012");
822    
823     # we expect "ERROR" or "OK" as response, so push a line read
824 root 1.52 $handle->push_read (line => sub {
825 root 1.9 # if we got an "OK", we have to _prepend_ another line,
826     # so it will be read before the second request reads its 64 bytes
827     # which are already in the queue when this callback is called
828     # we don't do this in case we got an error
829     if ($_[1] eq "OK") {
830 root 1.52 $_[0]->unshift_read (line => sub {
831 root 1.9 my $response = $_[1];
832     ...
833     });
834     }
835     });
836    
837 root 1.69 # request two, simply returns 64 octets
838 root 1.9 $handle->push_write ("request 2\015\012");
839    
840     # simply read 64 bytes, always
841 root 1.52 $handle->push_read (chunk => 64, sub {
842 root 1.9 my $response = $_[1];
843     ...
844     });
845    
846     =over 4
847    
848 root 1.10 =cut
849    
850 root 1.8 sub _drain_rbuf {
851     my ($self) = @_;
852 elmex 1.1
853 root 1.59 local $self->{_in_drain} = 1;
854    
855 root 1.17 if (
856     defined $self->{rbuf_max}
857     && $self->{rbuf_max} < length $self->{rbuf}
858     ) {
859 root 1.82 $self->_error (&Errno::ENOSPC, 1), return;
860 root 1.17 }
861    
862 root 1.59 while () {
863 root 1.117 # we need to use a separate tls read buffer, as we must not receive data while
864     # we are draining the buffer, and this can only happen with TLS.
865 root 1.116 $self->{rbuf} .= delete $self->{_tls_rbuf} if exists $self->{_tls_rbuf};
866 root 1.115
867 root 1.59 my $len = length $self->{rbuf};
868 elmex 1.1
869 root 1.38 if (my $cb = shift @{ $self->{_queue} }) {
870 root 1.29 unless ($cb->($self)) {
871 root 1.38 if ($self->{_eof}) {
872 root 1.10 # no progress can be made (not enough data and no data forthcoming)
873 root 1.82 $self->_error (&Errno::EPIPE, 1), return;
874 root 1.10 }
875    
876 root 1.38 unshift @{ $self->{_queue} }, $cb;
877 root 1.55 last;
878 root 1.8 }
879     } elsif ($self->{on_read}) {
880 root 1.61 last unless $len;
881    
882 root 1.8 $self->{on_read}($self);
883    
884     if (
885 root 1.55 $len == length $self->{rbuf} # if no data has been consumed
886     && !@{ $self->{_queue} } # and the queue is still empty
887     && $self->{on_read} # but we still have on_read
888 root 1.8 ) {
889 root 1.55 # no further data will arrive
890     # so no progress can be made
891 root 1.82 $self->_error (&Errno::EPIPE, 1), return
892 root 1.55 if $self->{_eof};
893    
894     last; # more data might arrive
895 elmex 1.1 }
896 root 1.8 } else {
897     # read side becomes idle
898 root 1.93 delete $self->{_rw} unless $self->{tls};
899 root 1.55 last;
900 root 1.8 }
901     }
902    
903 root 1.80 if ($self->{_eof}) {
904     if ($self->{on_eof}) {
905     $self->{on_eof}($self)
906     } else {
907 root 1.140 $self->_error (0, 1, "Unexpected end-of-file");
908 root 1.80 }
909     }
910 root 1.55
911     # may need to restart read watcher
912     unless ($self->{_rw}) {
913     $self->start_read
914     if $self->{on_read} || @{ $self->{_queue} };
915     }
916 elmex 1.1 }
917    
918 root 1.8 =item $handle->on_read ($cb)
919 elmex 1.1
920 root 1.8 This replaces the currently set C<on_read> callback, or clears it (when
921     the new callback is C<undef>). See the description of C<on_read> in the
922     constructor.
923 elmex 1.1
924 root 1.8 =cut
925    
926     sub on_read {
927     my ($self, $cb) = @_;
928 elmex 1.1
929 root 1.8 $self->{on_read} = $cb;
930 root 1.59 $self->_drain_rbuf if $cb && !$self->{_in_drain};
931 elmex 1.1 }
932    
933 root 1.8 =item $handle->rbuf
934    
935     Returns the read buffer (as a modifiable lvalue).
936 elmex 1.1
937 root 1.117 You can access the read buffer directly as the C<< ->{rbuf} >>
938     member, if you want. However, the only operation allowed on the
939     read buffer (apart from looking at it) is removing data from its
940     beginning. Otherwise modifying or appending to it is not allowed and will
941     lead to hard-to-track-down bugs.
942 elmex 1.1
943 root 1.8 NOTE: The read buffer should only be used or modified if the C<on_read>,
944     C<push_read> or C<unshift_read> methods are used. The other read methods
945     automatically manage the read buffer.
946 elmex 1.1
947     =cut
948    
949 elmex 1.2 sub rbuf : lvalue {
950 root 1.8 $_[0]{rbuf}
951 elmex 1.2 }
952 elmex 1.1
953 root 1.8 =item $handle->push_read ($cb)
954    
955     =item $handle->unshift_read ($cb)
956    
957     Append the given callback to the end of the queue (C<push_read>) or
958     prepend it (C<unshift_read>).
959    
960     The callback is called each time some additional read data arrives.
961 elmex 1.1
962 elmex 1.20 It must check whether enough data is in the read buffer already.
963 elmex 1.1
964 root 1.8 If not enough data is available, it must return the empty list or a false
965     value, in which case it will be called repeatedly until enough data is
966     available (or an error condition is detected).
967    
968     If enough data was available, then the callback must remove all data it is
969     interested in (which can be none at all) and return a true value. After returning
970     true, it will be removed from the queue.
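
The contract above can be tried out without an event loop. The following is
only a sketch: C<$fake_handle> is a hypothetical stand-in (a plain hash with
an C<rbuf> member) and the loop merely imitates data arriving in pieces; it
is not how AnyEvent::Handle itself drives the queue.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a handle: only the rbuf member is modelled.
my $fake_handle = { rbuf => "" };
my @records;

# A raw read callback that wants one 3-byte record.
my $cb = sub {
   my ($handle) = @_;

   # not enough data yet: return false and wait to be called again
   return unless length ($handle->{rbuf}) >= 3;

   # enough data: remove it from the front of the buffer and return true
   push @records, substr ($handle->{rbuf}, 0, 3, "");
   1
};

# Imitate data trickling in, one piece per "read event".
my $done;
for my $piece ("a", "b", "cd") {
   $fake_handle->{rbuf} .= $piece;
   $done = $cb->($fake_handle) and last;
}
```

The callback only ever removes data from the beginning of the buffer, so
the surplus byte stays there for whatever is queued next.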
971 elmex 1.1
972     =cut
973    
974 root 1.30 our %RH;
975    
976     sub register_read_type($$) {
977     $RH{$_[0]} = $_[1];
978     }
979    
980 root 1.8 sub push_read {
981 root 1.28 my $self = shift;
982     my $cb = pop;
983    
984     if (@_) {
985     my $type = shift;
986    
987     $cb = ($RH{$type} or Carp::croak "unsupported type passed to AnyEvent::Handle::push_read")
988     ->($self, $cb, @_);
989     }
990 elmex 1.1
991 root 1.38 push @{ $self->{_queue} }, $cb;
992 root 1.59 $self->_drain_rbuf unless $self->{_in_drain};
993 elmex 1.1 }
994    
995 root 1.8 sub unshift_read {
996 root 1.28 my $self = shift;
997     my $cb = pop;
998    
999     if (@_) {
1000     my $type = shift;
1001    
1002     $cb = ($RH{$type} or Carp::croak "unsupported type passed to AnyEvent::Handle::unshift_read")
1003     ->($self, $cb, @_);
1004     }
1005    
1006 root 1.8
1007 root 1.38 unshift @{ $self->{_queue} }, $cb;
1008 root 1.59 $self->_drain_rbuf unless $self->{_in_drain};
1009 root 1.8 }
1010 elmex 1.1
1011 root 1.28 =item $handle->push_read (type => @args, $cb)
1012 elmex 1.1
1013 root 1.28 =item $handle->unshift_read (type => @args, $cb)
1014 elmex 1.1
1015 root 1.28 Instead of providing a callback that parses the data itself you can choose
1016     between a number of predefined parsing formats, for chunks of data, lines
1017     etc.
1018 elmex 1.1
1019 root 1.30 Predefined types are (if you have ideas for additional types, feel free to
1020     drop by and tell us):
1021 root 1.28
1022     =over 4
1023    
1024 root 1.40 =item chunk => $octets, $cb->($handle, $data)
1025 root 1.28
1026     Invoke the callback only once C<$octets> bytes have been read. Pass the
1027     data read to the callback. The callback will never be called with less
1028     data.
1029    
1030     Example: read 2 bytes.
1031    
1032     $handle->push_read (chunk => 2, sub {
1033     warn "yay ", unpack "H*", $_[1];
1034     });
1035 elmex 1.1
1036     =cut
1037    
1038 root 1.28 register_read_type chunk => sub {
1039     my ($self, $cb, $len) = @_;
1040 elmex 1.1
1041 root 1.8 sub {
1042     $len <= length $_[0]{rbuf} or return;
1043 elmex 1.12 $cb->($_[0], substr $_[0]{rbuf}, 0, $len, "");
1044 root 1.8 1
1045     }
1046 root 1.28 };
1047 root 1.8
1048 root 1.40 =item line => [$eol, ]$cb->($handle, $line, $eol)
1049 elmex 1.1
1050 root 1.8 The callback will be called only once a full line (including the end of
1051     line marker, C<$eol>) has been read. This line (excluding the end of line
1052     marker) will be passed to the callback as second argument (C<$line>), and
1053     the end of line marker as the third argument (C<$eol>).
1054 elmex 1.1
1055 root 1.8 The end of line marker, C<$eol>, can be either a string, in which case it
1056     will be interpreted as a fixed record end marker, or it can be a regex
1057     object (e.g. created by C<qr>), in which case it is interpreted as a
1058     regular expression.
1059 elmex 1.1
1060 root 1.8 The end of line marker argument C<$eol> is optional; if it is missing (NOT
1061     undef), then C<qr|\015?\012|> is used (which is good for most internet
1062     protocols).
1063 elmex 1.1
1064 root 1.8 Partial lines at the end of the stream will never be returned, as they are
1065     not marked by the end of line marker.
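
How a plain-string C<$eol> differs from a regex object can be seen in
isolation. This sketch mirrors the quoting step of the generic code path;
C<make_line_matcher> is a hypothetical helper name used only here, and the
buffer is an ordinary scalar rather than a handle's read buffer.

```perl
use strict;
use warnings;

# Build a matcher for "line, then end-of-line marker".
sub make_line_matcher {
   my ($eol) = @_;

   # plain strings are matched literally, regex objects are used as-is
   $eol = quotemeta $eol unless ref $eol;
   qr|^(.*?)($eol)|s
}

my $buf = "a.b|rest";
my $re  = make_line_matcher ("|");   # a literal "|", not regex alternation

# remove the first record from the buffer, keeping line and marker
my ($line, $eol) = $buf =~ s/$re// ? ($1, $2) : ();
```

Without the quoting step, the string C<"|"> would be read as an empty
alternation and match an empty line at once.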
1066 elmex 1.1
1067 root 1.8 =cut
1068 elmex 1.1
1069 root 1.28 register_read_type line => sub {
1070     my ($self, $cb, $eol) = @_;
1071 elmex 1.1
1072 root 1.76 if (@_ < 3) {
1073     # this is more than twice as fast as the generic code below
1074     sub {
1075     $_[0]{rbuf} =~ s/^([^\015\012]*)(\015?\012)// or return;
1076 elmex 1.1
1077 root 1.76 $cb->($_[0], $1, $2);
1078     1
1079     }
1080     } else {
1081     $eol = quotemeta $eol unless ref $eol;
1082     $eol = qr|^(.*?)($eol)|s;
1083    
1084     sub {
1085     $_[0]{rbuf} =~ s/$eol// or return;
1086 elmex 1.1
1087 root 1.76 $cb->($_[0], $1, $2);
1088     1
1089     }
1090 root 1.8 }
1091 root 1.28 };
1092 elmex 1.1
1093 root 1.40 =item regex => $accept[, $reject[, $skip]], $cb->($handle, $data)
1094 root 1.36
1095     Makes a regex match against the regex object C<$accept> and returns
1096     everything up to and including the match.
1097    
1098     Example: read a single line terminated by '\n'.
1099    
1100     $handle->push_read (regex => qr<\n>, sub { ... });
1101    
1102     If C<$reject> is given and not undef, then it determines when the data is
1103     to be rejected: it is matched against the data when the C<$accept> regex
1104     does not match and generates an C<EBADMSG> error when it matches. This is
1105     useful to quickly reject wrong data (to avoid waiting for a timeout or a
1106     receive buffer overflow).
1107    
1108     Example: expect a single decimal number followed by whitespace, reject
1109     anything else (note the use of an anchor).
1110    
1111     $handle->push_read (regex => qr<^[0-9]+\s>, qr<[^0-9]>, sub { ... });
1112    
1113     If C<$skip> is given and not C<undef>, then it will be matched against
1114     the receive buffer when neither C<$accept> nor C<$reject> match,
1115     and everything preceding and including the match will be accepted
1116     unconditionally. This is useful to skip large amounts of data that you
1117     know cannot be matched, so that the C<$accept> or C<$reject> regex do not
1118     have to start matching from the beginning. This is purely an optimisation
1119     and is usually worth it only when you expect more than a few kilobytes.
1120    
1121     Example: expect a http header, which ends at C<\015\012\015\012>. Since we
1122     expect the header to be very large (it isn't in practice, but...), we use
1123     a skip regex to skip initial portions. The skip regex is tricky in that
1124     it only accepts something not ending in either \015 or \012, as these are
1125     required for the accept regex.
1126    
1127     $handle->push_read (regex =>
1128     qr<\015\012\015\012>,
1129     undef, # no reject
1130     qr<^.*[^\015\012]>,
1131     sub { ... });
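
Why the skip regex must stop short of a trailing \015 or \012 becomes
visible when the terminator is split across two reads. The following sketch
replays the accept and skip steps on a plain scalar; the two input pieces
and the header contents are made up for illustration.

```perl
use strict;
use warnings;

my $accept = qr<\015\012\015\012>;
my $skip   = qr<^.*[^\015\012]>;

my ($rbuf, $data) = ("", "");
my $header;

# Feed the header in two pieces, splitting the terminator between them.
for my $piece ("Host: example\015\012X: y\015", "\012\015\012BODY") {
   $rbuf .= $piece;

   if ($rbuf =~ $accept) {
      # everything up to and including the match belongs to the header
      $data  .= substr $rbuf, 0, $+[0], "";
      $header = $data;
      last;
   }

   # no match yet: set aside data that cannot be part of the terminator
   $data .= substr $rbuf, 0, $+[0], "" if $rbuf =~ $skip;
}
```

After the first piece only C<Host: example> is set aside; the trailing
\015\012 stays in the buffer, so the terminator is still found once the
second piece arrives.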
1132    
1133     =cut
1134    
1135     register_read_type regex => sub {
1136     my ($self, $cb, $accept, $reject, $skip) = @_;
1137    
1138     my $data;
1139     my $rbuf = \$self->{rbuf};
1140    
1141     sub {
1142     # accept
1143     if ($$rbuf =~ $accept) {
1144     $data .= substr $$rbuf, 0, $+[0], "";
1145     $cb->($self, $data);
1146     return 1;
1147     }
1148    
1149     # reject
1150     if ($reject && $$rbuf =~ $reject) {
1151 root 1.52 $self->_error (&Errno::EBADMSG);
1152 root 1.36 }
1153    
1154     # skip
1155     if ($skip && $$rbuf =~ $skip) {
1156     $data .= substr $$rbuf, 0, $+[0], "";
1157     }
1158    
1159     ()
1160     }
1161     };
1162    
1163 root 1.61 =item netstring => $cb->($handle, $string)
1164    
1165     A netstring (http://cr.yp.to/proto/netstrings.txt, this is not an endorsement).
1166    
1167     Throws an error with C<$!> set to EBADMSG on format violations.
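
For reference, a netstring is a decimal length, a colon, the payload and a
trailing comma. The sketch below parses that framing from a plain scalar.
C<take_netstring> is a hypothetical helper, not part of this module, and
unlike the read type it leaves the buffer untouched until a whole netstring
is available and reports format violations by dying.

```perl
use strict;
use warnings;

# Take one netstring off the front of $$bufref.
# Returns the payload, or undef if more data is needed; dies on bad input.
sub take_netstring {
   my ($bufref) = @_;

   unless ($$bufref =~ /^(0|[1-9][0-9]*):/) {
      # anything but digits before the colon is a format violation
      die "bad netstring\n" if $$bufref =~ /[^0-9]/;
      return;
   }

   my ($len, $hdr) = ($1, $+[0]);

   # wait until payload and trailing comma have arrived
   return if length ($$bufref) < $hdr + $len + 1;

   die "bad netstring\n"
      unless "," eq substr ($$bufref, $hdr + $len, 1);

   my $payload = substr ($$bufref, $hdr, $len);
   substr ($$bufref, 0, $hdr + $len + 1, "");
   $payload
}

my $buf   = "5:hello,3:abc";
my $first = take_netstring (\$buf);   # complete netstring
my $next  = take_netstring (\$buf);   # "3:abc" still lacks its comma
```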
1168    
1169     =cut
1170    
1171     register_read_type netstring => sub {
1172     my ($self, $cb) = @_;
1173    
1174     sub {
1175     unless ($_[0]{rbuf} =~ s/^(0|[1-9][0-9]*)://) {
1176     if ($_[0]{rbuf} =~ /[^0-9]/) {
1177     $self->_error (&Errno::EBADMSG);
1178     }
1179     return;
1180     }
1181    
1182     my $len = $1;
1183    
1184     $self->unshift_read (chunk => $len, sub {
1185     my $string = $_[1];
1186     $_[0]->unshift_read (chunk => 1, sub {
1187     if ($_[1] eq ",") {
1188     $cb->($_[0], $string);
1189     } else {
1190     $self->_error (&Errno::EBADMSG);
1191     }
1192     });
1193     });
1194    
1195     1
1196     }
1197     };
1198    
1199     =item packstring => $format, $cb->($handle, $string)
1200    
1201     An octet string prefixed with an encoded length. The encoding C<$format>
1202     uses the same format as a Perl C<pack> format, but must specify a single
1203     integer only (only one of C<cCsSlLqQiInNvVjJw> is allowed, plus an
1204     optional C<!>, C<< < >> or C<< > >> modifier).
1205    
1206 root 1.96 For example, DNS over TCP uses a prefix of C<n> (2 octet network order),
1207     EPP uses a prefix of C<N> (4 octets).
1208 root 1.61
1209     Example: read a block of data prefixed by its length in BER-encoded
1210     format (very efficient).
1211    
1212     $handle->push_read (packstring => "w", sub {
1213     my ($handle, $data) = @_;
1214     });
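
The length prefix can also be inspected by hand. This sketch decodes a
BER-prefixed block from a plain scalar and re-packs the decoded length to
find out how many octets the prefix occupied, which is the approach the
read type takes; the payload and the trailing C<"tail"> are made up.

```perl
use strict;
use warnings;

my $payload = "x" x 300;
my $buf     = pack ("w/a*", $payload) . "tail";

# Decode the length; unpack dies on a truncated BER integer, hence the eval.
my $len = eval { unpack "w", $buf };

# Re-pack the length to learn the size of the prefix itself.
my $prefix = length (pack "w", $len);

# Split off the payload, then remove prefix and payload from the buffer.
my $data = substr ($buf, $prefix, $len);
substr ($buf, 0, $prefix + $len, "");
```

A length of 300 needs two octets in BER encoding, so the prefix size cannot
be assumed to be constant.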
1215    
1216     =cut
1217    
1218     register_read_type packstring => sub {
1219     my ($self, $cb, $format) = @_;
1220    
1221     sub {
1222     # when we can use 5.10 we can use ".", but for 5.8 we use the re-pack method
1223 root 1.76 defined (my $len = eval { unpack $format, $_[0]{rbuf} })
1224 root 1.61 or return;
1225    
1226 root 1.77 $format = length pack $format, $len;
1227 root 1.61
1228 root 1.77 # bypass unshift if we already have the remaining chunk
1229     if ($format + $len <= length $_[0]{rbuf}) {
1230     my $data = substr $_[0]{rbuf}, $format, $len;
1231     substr $_[0]{rbuf}, 0, $format + $len, "";
1232     $cb->($_[0], $data);
1233     } else {
1234     # remove prefix
1235     substr $_[0]{rbuf}, 0, $format, "";
1236    
1237     # read remaining chunk
1238     $_[0]->unshift_read (chunk => $len, $cb);
1239     }
1240 root 1.61
1241     1
1242     }
1243     };
1244    
1245 root 1.40 =item json => $cb->($handle, $hash_or_arrayref)
1246    
1247 root 1.110 Reads a JSON object or array, decodes it and passes it to the
1248     callback. When a parse error occurs, an C<EBADMSG> error will be raised.
1249 root 1.40
1250     If a C<json> object was passed to the constructor, then that will be used
1251     for the final decode, otherwise it will create a JSON coder expecting UTF-8.
1252    
1253     This read type uses the incremental parser available with JSON version
1254     2.09 (and JSON::XS version 2.2) and above. You have to provide a
1255     dependency on your own: this module will load the JSON module, but
1256     AnyEvent does not depend on it itself.
1257    
1258     Since JSON texts are fully self-delimiting, the C<json> read and write
1259 root 1.41 types are an ideal simple RPC protocol: just exchange JSON datagrams. See
1260     the C<json> write type description, above, for an actual example.
1261 root 1.40
1262     =cut
1263    
1264     register_read_type json => sub {
1265 root 1.63 my ($self, $cb) = @_;
1266 root 1.40
1267 root 1.135 my $json = $self->{json} ||=
1268     eval { require JSON::XS; JSON::XS->new->utf8 }
1269     || do { require JSON; JSON->new->utf8 };
1270 root 1.40
1271     my $data;
1272     my $rbuf = \$self->{rbuf};
1273    
1274     sub {
1275 root 1.113 my $ref = eval { $json->incr_parse ($self->{rbuf}) };
1276 root 1.110
1277 root 1.113 if ($ref) {
1278     $self->{rbuf} = $json->incr_text;
1279     $json->incr_text = "";
1280     $cb->($self, $ref);
1281 root 1.110
1282     1
1283 root 1.113 } elsif ($@) {
1284 root 1.111 # error case
1285 root 1.110 $json->incr_skip;
1286 root 1.40
1287     $self->{rbuf} = $json->incr_text;
1288     $json->incr_text = "";
1289    
1290 root 1.110 $self->_error (&Errno::EBADMSG);
1291 root 1.114
1292 root 1.113 ()
1293     } else {
1294     $self->{rbuf} = "";
1295 root 1.114
1296 root 1.113 ()
1297     }
1298 root 1.40 }
1299     };
1300    
1301 root 1.63 =item storable => $cb->($handle, $ref)
1302    
1303     Deserialises a L<Storable> frozen representation as written by the
1304     C<storable> write type (BER-encoded length prefix followed by nfreeze'd
1305     data).
1306    
1307     Raises C<EBADMSG> error if the data could not be decoded.
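
The wire format described above can be produced and decoded with core
modules alone. This is a sketch of the format only, assuming the whole
record is already in memory; the hash contents are made up.

```perl
use strict;
use warnings;
use Storable ();

# BER-encoded length prefix followed by the nfreeze'd data
my $wire = pack "w/a*", Storable::nfreeze ({ answer => 42 });

# strip the length prefix again, then thaw; thaw can die on corrupt input
my ($frozen) = unpack "w/a*", $wire;
my $ref      = eval { Storable::thaw ($frozen) };
```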
1308    
1309     =cut
1310    
1311     register_read_type storable => sub {
1312     my ($self, $cb) = @_;
1313    
1314     require Storable;
1315    
1316     sub {
1317     # when we can use 5.10 we can use ".", but for 5.8 we use the re-pack method
1318 root 1.76 defined (my $len = eval { unpack "w", $_[0]{rbuf} })
1319 root 1.63 or return;
1320    
1321 root 1.77 my $format = length pack "w", $len;
1322 root 1.63
1323 root 1.77 # bypass unshift if we already have the remaining chunk
1324     if ($format + $len <= length $_[0]{rbuf}) {
1325     my $data = substr $_[0]{rbuf}, $format, $len;
1326     substr $_[0]{rbuf}, 0, $format + $len, "";
1327     $cb->($_[0], Storable::thaw ($data));
1328     } else {
1329     # remove prefix
1330     substr $_[0]{rbuf}, 0, $format, "";
1331    
1332     # read remaining chunk
1333     $_[0]->unshift_read (chunk => $len, sub {
1334     if (my $ref = eval { Storable::thaw ($_[1]) }) {
1335     $cb->($_[0], $ref);
1336     } else {
1337     $self->_error (&Errno::EBADMSG);
1338     }
1339     });
1340     }
1341    
1342     1
1343 root 1.63 }
1344     };
1345    
1346 root 1.28 =back
1347    
1348 root 1.40 =item AnyEvent::Handle::register_read_type type => $coderef->($handle, $cb, @args)
1349 root 1.30
1350     This function (not method) lets you add your own types to C<push_read>.
1351    
1352     Whenever the given C<type> is used, C<push_read> will invoke the code
1353     reference with the handle object, the callback and the remaining
1354     arguments.
1355    
1356     The code reference is supposed to return a callback (usually a closure)
1357     that works as a plain read callback (see C<< ->push_read ($cb) >>).
1358    
1359     It should invoke the passed callback when it is done reading (remember to
1360 root 1.40 pass C<$handle> as first argument as all other callbacks do that).
1361 root 1.30
1362     Note that this is a function, and all types registered this way will be
1363     global, so try to use unique names.
1364    
1365     For examples, see the source of this module (F<perldoc -m AnyEvent::Handle>,
1366     search for C<register_read_type>).
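
A minimal sketch of a custom type follows. The type name C<word> is
hypothetical and chosen only for this illustration; it delivers one
whitespace-terminated word per callback invocation.

```perl
use strict;
use warnings;
use AnyEvent::Handle;

# "word" is a made-up type name; registered types are global.
AnyEvent::Handle::register_read_type (word => sub {
   my ($self, $cb) = @_;

   # the returned closure behaves like a plain push_read callback
   sub {
      # wait until a complete, whitespace-terminated word is buffered
      $_[0]{rbuf} =~ s/^\s*(\S+)\s// or return;

      $cb->($_[0], $1);
      1
   }
});
```

Once registered, it would be used like the predefined types, e.g.
C<< $handle->push_read (word => sub { my ($handle, $word) = @_; ... }) >>.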
1367    
1368 root 1.10 =item $handle->stop_read
1369    
1370     =item $handle->start_read
1371    
1372 root 1.18 In rare cases you actually do not want to read anything from the
1373 root 1.58 socket. In this case you can call C<stop_read>. Neither C<on_read> nor
1374 root 1.22 any queued callbacks will be executed then. To start reading again, call
1375 root 1.10 C<start_read>.
1376    
1377 root 1.56 Note that AnyEvent::Handle will automatically C<start_read> for you when
1378     you change the C<on_read> callback or push/unshift a read callback, and it
1379     will automatically C<stop_read> for you when neither C<on_read> is set nor
1380     there are any read requests in the queue.
1381    
1382 root 1.93 These methods will have no effect when in TLS mode (as TLS doesn't support
1383     half-duplex connections).
1384    
1385 root 1.10 =cut
1386    
1387     sub stop_read {
1388     my ($self) = @_;
1389 elmex 1.1
1390 root 1.93 delete $self->{_rw} unless $self->{tls};
1391 root 1.8 }
1392 elmex 1.1
1393 root 1.10 sub start_read {
1394     my ($self) = @_;
1395    
1396 root 1.38 unless ($self->{_rw} || $self->{_eof}) {
1397 root 1.10 Scalar::Util::weaken $self;
1398    
1399 root 1.38 $self->{_rw} = AnyEvent->io (fh => $self->{fh}, poll => "r", cb => sub {
1400 root 1.93 my $rbuf = \($self->{tls} ? my $buf : $self->{rbuf});
1401 root 1.17 my $len = sysread $self->{fh}, $$rbuf, $self->{read_size} || 8192, length $$rbuf;
1402 root 1.10
1403     if ($len > 0) {
1404 root 1.44 $self->{_activity} = AnyEvent->now;
1405 root 1.43
1406 root 1.93 if ($self->{tls}) {
1407     Net::SSLeay::BIO_write ($self->{_rbio}, $$rbuf);
1408 root 1.97
1409 root 1.93 &_dotls ($self);
1410     } else {
1411     $self->_drain_rbuf unless $self->{_in_drain};
1412     }
1413 root 1.10
1414     } elsif (defined $len) {
1415 root 1.38 delete $self->{_rw};
1416     $self->{_eof} = 1;
1417 root 1.59 $self->_drain_rbuf unless $self->{_in_drain};
1418 root 1.10
1419 root 1.42 } elsif ($! != EAGAIN && $! != EINTR && $! != WSAEWOULDBLOCK) {
1420 root 1.52 return $self->_error ($!, 1);
1421 root 1.10 }
1422     });
1423     }
1424 elmex 1.1 }
1425    
1426 root 1.133 our $ERROR_SYSCALL;
1427     our $ERROR_WANT_READ;
1428    
1429     sub _tls_error {
1430     my ($self, $err) = @_;
1431    
1432     return $self->_error ($!, 1)
1433     if $err == Net::SSLeay::ERROR_SYSCALL ();
1434    
1435 root 1.137 $err = Net::SSLeay::ERR_error_string (Net::SSLeay::ERR_get_error ());
1436    
1437     # reduce error string to look less scary
1438     $err =~ s/^error:[0-9a-fA-F]{8}:[^:]+:([^:]+):/\L$1: /;
1439    
1440     $self->_error (&Errno::EPROTO, 1, $err);
1441 root 1.133 }
1442    
1443 root 1.97 # poll the write BIO and send the data if applicable
1444 root 1.133 # also decode read data if possible
1445     # this is basically our TLS state machine
1446     # more efficient implementations are possible with openssl,
1447     # but not with the buggy and incomplete Net::SSLeay.
1448 root 1.19 sub _dotls {
1449     my ($self) = @_;
1450    
1451 root 1.97 my $tmp;
1452 root 1.56
1453 root 1.38 if (length $self->{_tls_wbuf}) {
1454 root 1.97 while (($tmp = Net::SSLeay::write ($self->{tls}, $self->{_tls_wbuf})) > 0) {
1455     substr $self->{_tls_wbuf}, 0, $tmp, "";
1456 root 1.22 }
1457 root 1.133
1458     $tmp = Net::SSLeay::get_error ($self->{tls}, $tmp);
1459     return $self->_tls_error ($tmp)
1460     if $tmp != $ERROR_WANT_READ
1461 root 1.142 && ($tmp != $ERROR_SYSCALL || $!);
1462 root 1.19 }
1463    
1464 root 1.97 while (defined ($tmp = Net::SSLeay::read ($self->{tls}))) {
1465     unless (length $tmp) {
1466 root 1.92 &_freetls;
1467 root 1.142 if ($self->{on_stoptls}) {
1468     $self->{on_stoptls}($self);
1469     return;
1470     } else {
1471     # let's treat SSL-eof as we treat normal EOF
1472     delete $self->{_rw};
1473     $self->{_eof} = 1;
1474     }
1475 root 1.56 }
1476 root 1.91
1477 root 1.116 $self->{_tls_rbuf} .= $tmp;
1478 root 1.91 $self->_drain_rbuf unless $self->{_in_drain};
1479 root 1.92 $self->{tls} or return; # tls session might have gone away in callback
1480 root 1.23 }
1481    
1482 root 1.97 $tmp = Net::SSLeay::get_error ($self->{tls}, -1);
1483 root 1.133 return $self->_tls_error ($tmp)
1484     if $tmp != $ERROR_WANT_READ
1485 root 1.142 && ($tmp != $ERROR_SYSCALL || $!);
1486 root 1.91
1487 root 1.97 while (length ($tmp = Net::SSLeay::BIO_read ($self->{_wbio}))) {
1488     $self->{wbuf} .= $tmp;
1489 root 1.91 $self->_drain_wbuf;
1490     }
1491 root 1.142
1492     $self->{_on_starttls}
1493     and Net::SSLeay::state ($self->{tls}) == Net::SSLeay::ST_OK ()
1494     and (delete $self->{_on_starttls})->($self, 1);
1495 root 1.19 }
1496    
1497 root 1.25 =item $handle->starttls ($tls[, $tls_ctx])
1498    
1499     Instead of starting TLS negotiation immediately when the AnyEvent::Handle
1500     object is created, you can also do that at a later time by calling
1501     C<starttls>.
1502    
1503     The first argument is the same as the C<tls> constructor argument (either
1504     C<"connect">, C<"accept"> or an existing Net::SSLeay object).
1505    
1506 root 1.131 The second argument is the optional C<AnyEvent::TLS> object that is used
1507     when AnyEvent::Handle has to create its own TLS connection object, or
1508     a hash reference with C<< key => value >> pairs that will be used to
1509     construct a new context.
1510    
1511     The TLS connection object will end up in C<< $handle->{tls} >>, the TLS
1512     context in C<< $handle->{tls_ctx} >> after this call and can be used or
1513     changed to your liking. Note that the handshake might have already started
1514     when this function returns.
1515 root 1.38
1516 root 1.92 It is an error to start a TLS handshake more than once per
1517     AnyEvent::Handle object (this is due to bugs in OpenSSL).
1518    
1519 root 1.25 =cut
1520    
1521 root 1.137 our %TLS_CACHE; #TODO not yet documented, should we?
1522    
1523 root 1.19 sub starttls {
1524     my ($self, $ssl, $ctx) = @_;
1525    
1526 root 1.94 require Net::SSLeay;
1527    
1528 root 1.102 Carp::croak "it is an error to call starttls more than once on an AnyEvent::Handle object"
1529 root 1.92 if $self->{tls};
1530 root 1.131
1531 root 1.142 $ERROR_SYSCALL = Net::SSLeay::ERROR_SYSCALL ();
1532     $ERROR_WANT_READ = Net::SSLeay::ERROR_WANT_READ ();
1533 root 1.133
1534 root 1.131 $ctx ||= $self->{tls_ctx};
1535    
1536     if ("HASH" eq ref $ctx) {
1537     require AnyEvent::TLS;
1538    
1539     local $Carp::CarpLevel = 1; # skip ourselves when creating a new context
1540 root 1.137
1541     if ($ctx->{cache}) {
1542     my $key = $ctx+0;
1543     $ctx = $TLS_CACHE{$key} ||= new AnyEvent::TLS %$ctx;
1544     } else {
1545     $ctx = new AnyEvent::TLS %$ctx;
1546     }
1547 root 1.131 }
1548 root 1.92
1549 root 1.131 $self->{tls_ctx} = $ctx || TLS_CTX ();
1550 root 1.133 $self->{tls} = $ssl = $self->{tls_ctx}->_get_session ($ssl, $self, $self->{peername});
1551 root 1.19
1552 root 1.21 # basically, this is deep magic (because SSL_read should have the same issues)
1553     # but the openssl maintainers basically said: "trust us, it just works".
1554     # (unfortunately, we have to hardcode constants because the abysmally misdesigned
1555     # and mismaintained ssleay-module doesn't even offer them).
1556 root 1.27 # http://www.mail-archive.com/openssl-dev@openssl.org/msg22420.html
1557 root 1.87 #
1558     # in short: this is a mess.
1559     #
1560 root 1.93 # note that we do not try to keep the length constant between writes as we are required to do.
1561 root 1.87 # we assume that most (but not all) of this insanity only applies to non-blocking cases,
1562 root 1.93 # and we drive openssl fully in blocking mode here. Or maybe we don't - openssl seems to
1563     # have identity issues in that area.
1564 root 1.131 # Net::SSLeay::CTX_set_mode ($ssl,
1565     # (eval { local $SIG{__DIE__}; Net::SSLeay::MODE_ENABLE_PARTIAL_WRITE () } || 1)
1566     # | (eval { local $SIG{__DIE__}; Net::SSLeay::MODE_ACCEPT_MOVING_WRITE_BUFFER () } || 2));
1567     Net::SSLeay::CTX_set_mode ($ssl, 1|2);
1568 root 1.21
1569 root 1.38 $self->{_rbio} = Net::SSLeay::BIO_new (Net::SSLeay::BIO_s_mem ());
1570     $self->{_wbio} = Net::SSLeay::BIO_new (Net::SSLeay::BIO_s_mem ());
1571 root 1.19
1572 root 1.38 Net::SSLeay::set_bio ($ssl, $self->{_rbio}, $self->{_wbio});
1573 root 1.19
1574 root 1.142 $self->{_on_starttls} = sub { $_[0]{on_starttls}(@_) }
1575     if exists $self->{on_starttls};
1576    
1577 root 1.93 &_dotls; # need to trigger the initial handshake
1578     $self->start_read; # make sure we actually do read
1579 root 1.19 }
1580    
1581 root 1.25 =item $handle->stoptls
1582    
1583 root 1.92 Shuts down the SSL connection - this makes a proper EOF handshake by
1584     sending a close notify to the other side, but since OpenSSL doesn't
1585     support non-blocking shut downs, it is not possible to re-use the stream
1586     afterwards.
1587 root 1.25
1588     =cut
1589    
1590     sub stoptls {
1591     my ($self) = @_;
1592    
1593 root 1.92 if ($self->{tls}) {
1594 root 1.94 Net::SSLeay::shutdown ($self->{tls});
1595 root 1.92
1596     &_dotls;
1597    
1598 root 1.142 # # we don't give a shit. no, we do, but we can't. no...#d#
1599     # # we, we... have to use openssl :/#d#
1600     # &_freetls;#d#
1601 root 1.92 }
1602     }
1603    
1604     sub _freetls {
1605     my ($self) = @_;
1606    
1607     return unless $self->{tls};
1608 root 1.38
1609 root 1.142 $self->{_on_starttls}
1610     and (delete $self->{_on_starttls})->($self, undef);
1611    
1612 root 1.131 $self->{tls_ctx}->_put_session (delete $self->{tls});
1613 root 1.92
1614 root 1.93 delete @$self{qw(_rbio _wbio _tls_wbuf)};
1615 root 1.25 }
1616    
1617 root 1.19 sub DESTROY {
1618 root 1.120 my ($self) = @_;
1619 root 1.19
1620 root 1.92 &_freetls;
1621 root 1.62
1622     my $linger = exists $self->{linger} ? $self->{linger} : 3600;
1623    
1624     if ($linger && length $self->{wbuf}) {
1625     my $fh = delete $self->{fh};
1626     my $wbuf = delete $self->{wbuf};
1627    
1628     my @linger;
1629    
1630     push @linger, AnyEvent->io (fh => $fh, poll => "w", cb => sub {
1631     my $len = syswrite $fh, $wbuf, length $wbuf;
1632    
1633     if ($len > 0) {
1634     substr $wbuf, 0, $len, "";
1635     } else {
1636     @linger = (); # end
1637     }
1638     });
1639     push @linger, AnyEvent->timer (after => $linger, cb => sub {
1640     @linger = ();
1641     });
1642     }
1643 root 1.19 }
1644    
1645 root 1.99 =item $handle->destroy
1646    
1647 root 1.101 Shuts down the handle object as much as possible - this call ensures that
1648 root 1.141 no further callbacks will be invoked and as many resources as possible
1649     will be freed. You must not call any methods on the object afterwards.
1650 root 1.99
1651 root 1.101 Normally, you can just "forget" any references to an AnyEvent::Handle
1652     object and it will simply shut down. This works in fatal error and EOF
1653     callbacks, as well as code outside. It does I<NOT> work in a read or write
1654     callback: when you want to destroy the AnyEvent::Handle object from
1655     within such a callback, you I<MUST> call C<< ->destroy >> explicitly in
1656     that case.
1657    
1658 root 1.99 The handle might still linger in the background and write out remaining
1659     data, as specified by the C<linger> option, however.
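
A sketch of destroying a handle from within a read callback follows. It
assumes a local socket pair created with
C<AnyEvent::Util::portable_socketpair> so that it is self-contained; the
variable names and the data written are made up.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use AnyEvent::Util ();

my ($r, $w) = AnyEvent::Util::portable_socketpair;
my $cv      = AnyEvent->condvar;
my $calls   = 0;

my $handle; $handle = AnyEvent::Handle->new (
   fh      => $r,
   on_read => sub {
      my ($h) = @_;

      $calls++;

      # inside a read callback, dropping the reference is not enough
      $h->destroy;
      undef $handle;

      $cv->send;
   },
);

# make some data arrive, then wait for the callback
syswrite $w, "quit\n";
$cv->recv;
```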
1660    
1661     =cut
1662    
1663     sub destroy {
1664     my ($self) = @_;
1665    
1666     $self->DESTROY;
1667     %$self = ();
1668     }
1669    
1670 root 1.19 =item AnyEvent::Handle::TLS_CTX
1671    
1672 root 1.131 This function creates and returns the AnyEvent::TLS object used by default
1673     for TLS mode.
1674 root 1.19
1675 root 1.131 The context is created by calling L<AnyEvent::TLS> without any arguments.
1676 root 1.19
1677     =cut
1678    
1679     our $TLS_CTX;
1680    
1681     sub TLS_CTX() {
1682 root 1.131 $TLS_CTX ||= do {
1683     require AnyEvent::TLS;
1684 root 1.19
1685 root 1.131 new AnyEvent::TLS
1686 root 1.19 }
1687     }
1688    
1689 elmex 1.1 =back
1690    
1691 root 1.95
1692     =head1 NONFREQUENTLY ASKED QUESTIONS
1693    
1694     =over 4
1695    
1696 root 1.101 =item I C<undef> the AnyEvent::Handle reference inside my callback and
1697     still get further invocations!
1698    
1699     That's because AnyEvent::Handle keeps a reference to itself when handling
1700     read or write callbacks.
1701    
1702     It is only safe to "forget" the reference inside EOF or error callbacks;
1703     from within all other callbacks, you need to explicitly call the C<<
1704     ->destroy >> method.
1705    
1706     =item I get different callback invocations in TLS mode/Why can't I pause
1707     reading?
1708    
1709     Unlike, say, TCP, TLS connections do not consist of two independent
1710     communication channels, one for each direction. Or put differently, the
1711     read and write directions are not independent of each other: you cannot
1712     write data unless you are also prepared to read, and vice versa.
1713    
1714     This can mean that, in TLS mode, you might get C<on_error> or C<on_eof>
1715     callback invocations when you are not expecting any read data - the reason
1716     is that AnyEvent::Handle always reads in TLS mode.
1717    
1718     During the connection, you have to make sure that you always have a
1719     non-empty read-queue, or an C<on_read> watcher. At the end of the
1720     connection (or when you no longer want to use it) you can call the
1721     C<destroy> method.
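
A rough sketch of this pattern, assuming C<$fh> is an already connected
socket (not shown here) and that this side acts as the TLS client:

   my $handle; $handle = AnyEvent::Handle->new (
      fh       => $fh,
      tls      => "connect",
      on_read  => sub { },   # keep a read watcher installed at all times
      on_error => sub {
         my ($hdl, $fatal, $message) = @_;
         warn "connection ended: $message\n";
         $hdl->destroy;       # done with the connection
         undef $handle;
      },
   );

The do-nothing C<on_read> merely lets incoming data accumulate in
C<< $handle->{rbuf} >>; a real program would consume it there or queue
read requests instead.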
1722    
1723 root 1.95 =item How do I read data until the other side closes the connection?
1724    
1725 root 1.96 If you just want to read your data into a perl scalar, the easiest way
1726     to achieve this is to set an C<on_read> callback that does nothing and
1727     clear the C<on_eof> callback. In the C<on_error> callback, the data
1728     will then be in C<$_[0]{rbuf}>:
1729 root 1.95
1730     $handle->on_read (sub { });
1731     $handle->on_eof (undef);
1732     $handle->on_error (sub {
1733     my $data = delete $_[0]{rbuf};
1734     undef $handle;
1735     });
1736    
1737     The reason to use C<on_error> is that TCP connections, due to latencies
1738     and packet loss, might get closed quite violently with an error, when in
1739     fact, all data has been received.
1740    
1741 root 1.101 It is usually better to use acknowledgements when transferring data,
1742 root 1.95 to make sure the other side hasn't just died and you got the data
1743     intact. This is also one reason why so many internet protocols have an
1744     explicit QUIT command.
1745    
1746 root 1.96 =item I don't want to destroy the handle too early - how do I wait until
1747     all data has been written?
1748 root 1.95
1749     After writing your last bits of data, set the C<on_drain> callback
1750     and destroy the handle in there - with the default setting of
1751     C<low_water_mark> this will be called precisely when all data has been
1752     written to the socket:
1753    
1754     $handle->push_write (...);
1755     $handle->on_drain (sub {
1756     warn "all data submitted to the kernel\n";
1757     undef $handle;
1758     });
1759    
1760     =back
1761    
1762    
1763 root 1.38 =head1 SUBCLASSING AnyEvent::Handle
1764    
1765     In many cases, you might want to subclass AnyEvent::Handle.
1766    
1767     To make this easier, a given version of AnyEvent::Handle uses these
1768     conventions:
1769    
1770     =over 4
1771    
1772     =item * all constructor arguments become object members.
1773    
1774     At least initially, when you pass a C<tls>-argument to the constructor it
1775 root 1.75 will end up in C<< $handle->{tls} >>. Those members might be changed or
1776 root 1.38 mutated later on (for example C<tls> will hold the TLS connection object).
1777    
1778     =item * other object member names are prefixed with an C<_>.
1779    
1780     All object members not explicitly documented (internal use) are prefixed
1781     with an underscore character, so the remaining non-C<_>-namespace is free
1782     for use for subclasses.
1783    
1784     =item * all members not documented here and not prefixed with an underscore
1785     are free to use in subclasses.
1786    
1787     Of course, new versions of AnyEvent::Handle may introduce more "public"
1788     member variables, but that's just life. At least it is documented.
1789    
1790     =back
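
As an illustration of these conventions, a subclass might look like the
following sketch. The package name C<My::LineHandle> and the
C<greeting> argument are invented for this example and are not part of
AnyEvent::Handle:

   package My::LineHandle;

   use base "AnyEvent::Handle";

   sub new {
      my ($class, %arg) = @_;

      # constructor arguments, including our own, become object members
      my $self = $class->SUPER::new (%arg);

      # no leading underscore, so the name is free for subclass use
      $self->{greeting} = "hello" unless defined $self->{greeting};

      $self
   }

   sub greet {
      my ($self) = @_;

      $self->push_write ("$self->{greeting}\015\012");
   }

As noted above, a future version of AnyEvent::Handle could introduce a
documented member with the same name, so a distinctive name is the safer
choice in real code.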
1791    
1792 elmex 1.1 =head1 AUTHOR
1793    
1794 root 1.8 Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>.
1795 elmex 1.1
1796     =cut
1797    
1798     1; # End of AnyEvent::Handle