ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/AnyEvent-HTTP/HTTP.pm
(Generate patch)

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.57 by root, Mon Sep 6 06:31:32 2010 UTC vs.
Revision 1.62 by root, Thu Dec 30 04:30:24 2010 UTC

43 43
44use Errno (); 44use Errno ();
45 45
46use AnyEvent 5.0 (); 46use AnyEvent 5.0 ();
47use AnyEvent::Util (); 47use AnyEvent::Util ();
48use AnyEvent::Socket ();
49use AnyEvent::Handle (); 48use AnyEvent::Handle ();
50 49
51use base Exporter::; 50use base Exporter::;
52 51
53our $VERSION = '1.46'; 52our $VERSION = '1.46';
92must be an absolute http or https URL. 91must be an absolute http or https URL.
93 92
94When called in void context, nothing is returned. In other contexts, 93When called in void context, nothing is returned. In other contexts,
95C<http_request> returns a "cancellation guard" - you have to keep the 94C<http_request> returns a "cancellation guard" - you have to keep the
96object at least alive until the callback get called. If the object gets 95object at least alive until the callback get called. If the object gets
97destroyed before the callbakc is called, the request will be cancelled. 96destroyed before the callback is called, the request will be cancelled.
98 97
99The callback will be called with the response body data as first argument 98The callback will be called with the response body data as first argument
100(or C<undef> if an error occured), and a hash-ref with response headers as 99(or C<undef> if an error occured), and a hash-ref with response headers as
101second argument. 100second argument.
102 101
148Whether to recurse requests or not, e.g. on redirects, authentication 147Whether to recurse requests or not, e.g. on redirects, authentication
149retries and so on, and how often to do so. 148retries and so on, and how often to do so.
150 149
151=item headers => hashref 150=item headers => hashref
152 151
153The request headers to use. Currently, C<http_request> may provide its 152The request headers to use, with the header name (I<MUST be in lowercase>)
154own C<Host:>, C<Content-Length:>, C<Connection:> and C<Cookie:> headers 153as key and header value as hash value.
155and will provide defaults for C<User-Agent:> and C<Referer:> (this can be 154
155Currently, http_request> may provide its own C<host>, C<content-length>,
156C<connection> and C<cookie> headers and will provide defaults for
157C<user-agent> and C<referer> (this can be suppressed by using a value of
156suppressed by using C<undef> for these headers in which case they won't be 158C<undef> for these headers in which case they won't be sent at all).
157sent at all).
158 159
159=item timeout => $seconds 160=item timeout => $seconds
160 161
161The time-out to use for various stages - each connect attempt will reset 162The time-out to use for various stages - each connect attempt will reset
162the timeout, as will read or write activity, i.e. this is not an overall 163the timeout, as will read or write activity, i.e. this is not an overall
211connect (for exmaple, to bind it on a given IP address). This parameter 212connect (for exmaple, to bind it on a given IP address). This parameter
212overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect> 213overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect>
213and behaves exactly the same way (e.g. it has to provide a 214and behaves exactly the same way (e.g. it has to provide a
214timeout). See the description for the C<$prepare_cb> argument of 215timeout). See the description for the C<$prepare_cb> argument of
215C<AnyEvent::Socket::tcp_connect> for details. 216C<AnyEvent::Socket::tcp_connect> for details.
217
218=item tcp_connect => $callback->($host, $service, $connect_cb, $prepare_cb)
219
220In even rarer cases you want total control over how AnyEvent::HTTP
221establishes connections. Normally it uses L<AnyEvent::Socket::tcp_connect>
222to do this, but you can provide your own C<tcp_connect> function -
223obviously, it has to follow the same calling conventions, except that it
224may always return a connection guard object.
225
226There are probably lots of weird uses for this function, starting from
227tracing the hosts C<http_request> actually tries to connect, to (inexact
228but fast) host => IP address caching or even socks protocol support.
216 229
217=item on_header => $callback->($headers) 230=item on_header => $callback->($headers)
218 231
219When specified, this callback will be called with the header hash as soon 232When specified, this callback will be called with the header hash as soon
220as headers have been successfully received from the remote server (not on 233as headers have been successfully received from the remote server (not on
454 _get_slot $uhost, sub { 467 _get_slot $uhost, sub {
455 $state{slot_guard} = shift; 468 $state{slot_guard} = shift;
456 469
457 return unless $state{connect_guard}; 470 return unless $state{connect_guard};
458 471
459 $state{connect_guard} = AnyEvent::Socket::tcp_connect $rhost, $rport, sub { 472 my $tcp_connect = $arg{tcp_connect}
473 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
474
475 $state{connect_guard} = $tcp_connect->(
476 $rhost,
477 $rport,
478 sub {
460 $state{fh} = shift 479 $state{fh} = shift
461 or do { 480 or do {
462 my $err = "$!"; 481 my $err = "$!";
463 %state = (); 482 %state = ();
464 return $cb->(undef, { Status => 599, Reason => $err, @pseudo }); 483 return $cb->(undef, { Status => 599, Reason => $err, @pseudo });
465 }; 484 };
466 485
467 pop; # free memory, save a tree 486 pop; # free memory, save a tree
468 487
469 return unless delete $state{connect_guard}; 488 return unless delete $state{connect_guard};
470 489
471 # get handle 490 # get handle
472 $state{handle} = new AnyEvent::Handle 491 $state{handle} = new AnyEvent::Handle
473 fh => $state{fh}, 492 fh => $state{fh},
474 peername => $rhost, 493 peername => $rhost,
475 tls_ctx => $arg{tls_ctx}, 494 tls_ctx => $arg{tls_ctx},
476 # these need to be reconfigured on keepalive handles 495 # these need to be reconfigured on keepalive handles
477 timeout => $timeout, 496 timeout => $timeout,
478 on_error => sub { 497 on_error => sub {
479 %state = (); 498 %state = ();
480 $cb->(undef, { Status => 599, Reason => $_[2], @pseudo }); 499 $cb->(undef, { Status => 599, Reason => $_[2], @pseudo });
481 }, 500 },
482 on_eof => sub { 501 on_eof => sub {
483 %state = (); 502 %state = ();
484 $cb->(undef, { Status => 599, Reason => "Unexpected end-of-file", @pseudo }); 503 $cb->(undef, { Status => 599, Reason => "Unexpected end-of-file", @pseudo });
485 }, 504 },
486 ; 505 ;
487 506
488 # limit the number of persistent connections 507 # limit the number of persistent connections
489 # keepalive not yet supported 508 # keepalive not yet supported
490# if ($KA_COUNT{$_[1]} < $MAX_PERSISTENT_PER_HOST) { 509# if ($KA_COUNT{$_[1]} < $MAX_PERSISTENT_PER_HOST) {
491# ++$KA_COUNT{$_[1]}; 510# ++$KA_COUNT{$_[1]};
492# $state{handle}{ka_count_guard} = AnyEvent::Util::guard { 511# $state{handle}{ka_count_guard} = AnyEvent::Util::guard {
493# --$KA_COUNT{$_[1]} 512# --$KA_COUNT{$_[1]}
494# }; 513# };
495# $hdr{connection} = "keep-alive"; 514# $hdr{connection} = "keep-alive";
496# } else { 515# } else {
497 delete $hdr{connection}; 516 delete $hdr{connection};
498# } 517# }
499 518
500 $state{handle}->starttls ("connect") if $rscheme eq "https"; 519 $state{handle}->starttls ("connect") if $rscheme eq "https";
501 520
502 # handle actual, non-tunneled, request 521 # handle actual, non-tunneled, request
503 my $handle_actual_request = sub { 522 my $handle_actual_request = sub {
504 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls}; 523 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
505 524
506 # send request 525 # send request
507 $state{handle}->push_write ( 526 $state{handle}->push_write (
508 "$method $rpath HTTP/1.0\015\012" 527 "$method $rpath HTTP/1.0\015\012"
509 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr) 528 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
510 . "\015\012" 529 . "\015\012"
511 . (delete $arg{body}) 530 . (delete $arg{body})
512 ); 531 );
513 532
514 # return if error occured during push_write() 533 # return if error occured during push_write()
515 return unless %state; 534 return unless %state;
516 535
517 %hdr = (); # reduce memory usage, save a kitten, also make it possible to re-use 536 %hdr = (); # reduce memory usage, save a kitten, also make it possible to re-use
518 537
519 # status line and headers 538 # status line and headers
520 $state{handle}->push_read (line => $qr_nlnl, sub { 539 $state{handle}->push_read (line => $qr_nlnl, sub {
521 for ("$_[1]") { 540 for ("$_[1]") {
522 y/\015//d; # weed out any \015, as they show up in the weirdest of places. 541 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
523 542
524 /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )? \015?\012/igxc 543 /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )? \015?\012/igxc
525 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Invalid server response", @pseudo })); 544 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Invalid server response", @pseudo }));
526 545
527 push @pseudo, 546 push @pseudo,
528 HTTPVersion => $1, 547 HTTPVersion => $1,
529 Status => $2, 548 Status => $2,
530 Reason => $3, 549 Reason => $3,
531 ; 550 ;
532 551
533 # things seen, not parsed: 552 # things seen, not parsed:
534 # p3pP="NON CUR OTPi OUR NOR UNI" 553 # p3pP="NON CUR OTPi OUR NOR UNI"
535 554
536 $hdr{lc $1} .= ",$2" 555 $hdr{lc $1} .= ",$2"
537 while /\G 556 while /\G
538 ([^:\000-\037]*): 557 ([^:\000-\037]*):
539 [\011\040]* 558 [\011\040]*
540 ((?: [^\012]+ | \012[\011\040] )*) 559 ((?: [^\012]+ | \012[\011\040] )*)
541 \012 560 \012
542 /gxc; 561 /gxc;
543 562
544 /\G$/ 563 /\G$/
545 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Garbled response headers", @pseudo })); 564 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Garbled response headers", @pseudo }));
546 }
547
548 # remove the "," prefix we added to all headers above
549 substr $_, 0, 1, ""
550 for values %hdr;
551
552 # patch in all pseudo headers
553 %hdr = (%hdr, @pseudo);
554
555 # redirect handling
556 # microsoft and other shitheads don't give a shit for following standards,
557 # try to support some common forms of broken Location headers.
558 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
559 $hdr{location} =~ s/^\.\/+//;
560
561 my $url = "$rscheme://$uhost:$uport";
562
563 unless ($hdr{location} =~ s/^\///) {
564 $url .= $upath;
565 $url =~ s/\/[^\/]*$//;
566 } 565 }
567 566
567 # remove the "," prefix we added to all headers above
568 substr $_, 0, 1, ""
569 for values %hdr;
570
571 # patch in all pseudo headers
572 %hdr = (%hdr, @pseudo);
573
574 # redirect handling
575 # microsoft and other shitheads don't give a shit for following standards,
576 # try to support some common forms of broken Location headers.
577 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
578 $hdr{location} =~ s/^\.\/+//;
579
580 my $url = "$rscheme://$uhost:$uport";
581
582 unless ($hdr{location} =~ s/^\///) {
583 $url .= $upath;
584 $url =~ s/\/[^\/]*$//;
585 }
586
568 $hdr{location} = "$url/$hdr{location}"; 587 $hdr{location} = "$url/$hdr{location}";
569 }
570
571 my $redirect;
572
573 if ($recurse) {
574 my $status = $hdr{Status};
575
576 # industry standard is to redirect POST as GET for
577 # 301, 302 and 303, in contrast to http/1.0 and 1.1.
578 # also, the UA should ask the user for 301 and 307 and POST,
579 # industry standard seems to be to simply follow.
580 # we go with the industry standard.
581 if ($status == 301 or $status == 302 or $status == 303) {
582 # HTTP/1.1 is unclear on how to mutate the method
583 $method = "GET" unless $method eq "HEAD";
584 $redirect = 1;
585 } elsif ($status == 307) {
586 $redirect = 1;
587 } 588 }
588 }
589 589
590 my $finish = sub { 590 my $redirect;
591 $state{handle}->destroy if $state{handle};
592 %state = ();
593 591
594 # set-cookie processing 592 if ($recurse) {
595 if ($arg{cookie_jar}) { 593 my $status = $hdr{Status};
596 for ($_[1]{"set-cookie"}) {
597 # parse NAME=VALUE
598 my @kv;
599 594
600 while (/\G\s* ([^=;,[:space:]]+) \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) )/gcxs) { 595 # industry standard is to redirect POST as GET for
596 # 301, 302 and 303, in contrast to http/1.0 and 1.1.
597 # also, the UA should ask the user for 301 and 307 and POST,
598 # industry standard seems to be to simply follow.
599 # we go with the industry standard.
600 if ($status == 301 or $status == 302 or $status == 303) {
601 # HTTP/1.1 is unclear on how to mutate the method
602 $method = "GET" unless $method eq "HEAD";
601 my $name = $1; 603 $redirect = 1;
602 my $value = $3; 604 } elsif ($status == 307) {
603
604 unless ($value) {
605 $value = $2;
606 $value =~ s/\\(.)/$1/gs;
607 }
608
609 push @kv, $name => $value;
610
611 last unless /\G\s*;/gc;
612 }
613
614 last unless @kv;
615
616 my $name = shift @kv;
617 my %kv = (value => shift @kv, @kv);
618
619 my $cdom;
620 my $cpath = (delete $kv{path}) || "/";
621
622 if (exists $kv{domain}) {
623 $cdom = delete $kv{domain};
624
625 $cdom =~ s/^\.?/./; # make sure it starts with a "."
626
627 next if $cdom =~ /\.$/;
628
629 # this is not rfc-like and not netscape-like. go figure.
630 my $ndots = $cdom =~ y/.//;
631 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2);
632 } else {
633 $cdom = $uhost;
634 }
635
636 # store it 605 $redirect = 1;
637 $arg{cookie_jar}{version} = 1;
638 $arg{cookie_jar}{$cdom}{$cpath}{$name} = \%kv;
639
640 redo if /\G\s*,/gc;
641 } 606 }
642 } 607 }
643 608
609 my $finish = sub {
610 $state{handle}->destroy if $state{handle};
611 %state = ();
612
613 # set-cookie processing
614 if ($arg{cookie_jar}) {
615 for ($_[1]{"set-cookie"}) {
616 # parse NAME=VALUE
617 my @kv;
618
619 while (/\G\s* ([^=;,[:space:]]+) \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) )/gcxs) {
620 my $name = $1;
621 my $value = $3;
622
623 unless ($value) {
624 $value = $2;
625 $value =~ s/\\(.)/$1/gs;
626 }
627
628 push @kv, $name => $value;
629
630 last unless /\G\s*;/gc;
631 }
632
633 last unless @kv;
634
635 my $name = shift @kv;
636 my %kv = (value => shift @kv, @kv);
637
638 my $cdom;
639 my $cpath = (delete $kv{path}) || "/";
640
641 if (exists $kv{domain}) {
642 $cdom = delete $kv{domain};
643
644 $cdom =~ s/^\.?/./; # make sure it starts with a "."
645
646 next if $cdom =~ /\.$/;
647
648 # this is not rfc-like and not netscape-like. go figure.
649 my $ndots = $cdom =~ y/.//;
650 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2);
651 } else {
652 $cdom = $uhost;
653 }
654
655 # store it
656 $arg{cookie_jar}{version} = 1;
657 $arg{cookie_jar}{$cdom}{$cpath}{$name} = \%kv;
658
659 redo if /\G\s*,/gc;
660 }
661 }
662
644 if ($redirect && exists $hdr{location}) { 663 if ($redirect && exists $hdr{location}) {
645 # we ignore any errors, as it is very common to receive 664 # we ignore any errors, as it is very common to receive
646 # Content-Length != 0 but no actual body 665 # Content-Length != 0 but no actual body
647 # we also access %hdr, as $_[1] might be an erro 666 # we also access %hdr, as $_[1] might be an erro
648 http_request ( 667 http_request (
649 $method => $hdr{location}, 668 $method => $hdr{location},
650 %arg, 669 %arg,
651 recurse => $recurse - 1, 670 recurse => $recurse - 1,
652 Redirect => \@_, 671 Redirect => \@_,
653 $cb); 672 $cb);
673 } else {
674 $cb->($_[0], $_[1]);
675 }
676 };
677
678 my $len = $hdr{"content-length"};
679
680 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
681 $finish->(undef, { Status => 598, Reason => "Request cancelled by on_header", @pseudo });
682 } elsif (
683 $hdr{Status} =~ /^(?:1..|[23]04)$/
684 or $method eq "HEAD"
685 or (defined $len && !$len)
686 ) {
687 # no body
688 $finish->("", \%hdr);
654 } else { 689 } else {
655 $cb->($_[0], $_[1]);
656 }
657 };
658
659 my $len = $hdr{"content-length"};
660
661 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
662 $finish->(undef, { Status => 598, Reason => "Request cancelled by on_header", @pseudo });
663 } elsif (
664 $hdr{Status} =~ /^(?:1..|[23]04)$/
665 or $method eq "HEAD"
666 or (defined $len && !$len)
667 ) {
668 # no body
669 $finish->("", \%hdr);
670 } else {
671 # body handling, four different code paths 690 # body handling, four different code paths
672 # for want_body_handle, on_body (2x), normal (2x) 691 # for want_body_handle, on_body (2x), normal (2x)
673 # we might read too much here, but it does not matter yet (no pers. connections) 692 # we might read too much here, but it does not matter yet (no pers. connections)
674 if (!$redirect && $arg{want_body_handle}) { 693 if (!$redirect && $arg{want_body_handle}) {
675 $_[0]->on_eof (undef); 694 $_[0]->on_eof (undef);
676 $_[0]->on_error (undef); 695 $_[0]->on_error (undef);
677 $_[0]->on_read (undef); 696 $_[0]->on_read (undef);
678 697
679 $finish->(delete $state{handle}, \%hdr); 698 $finish->(delete $state{handle}, \%hdr);
680 699
681 } elsif ($arg{on_body}) { 700 } elsif ($arg{on_body}) {
682 $_[0]->on_error (sub { $finish->(undef, { Status => 599, Reason => $_[2], @pseudo }) }); 701 $_[0]->on_error (sub { $finish->(undef, { Status => 599, Reason => $_[2], @pseudo }) });
683 if ($len) { 702 if ($len) {
703 $_[0]->on_eof (undef);
704 $_[0]->on_read (sub {
705 $len -= length $_[0]{rbuf};
706
707 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
708 or $finish->(undef, { Status => 598, Reason => "Request cancelled by on_body", @pseudo });
709
710 $len > 0
711 or $finish->("", \%hdr);
712 });
713 } else {
714 $_[0]->on_eof (sub {
715 $finish->("", \%hdr);
716 });
717 $_[0]->on_read (sub {
718 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
719 or $finish->(undef, { Status => 598, Reason => "Request cancelled by on_body", @pseudo });
720 });
721 }
722 } else {
684 $_[0]->on_eof (undef); 723 $_[0]->on_eof (undef);
685 $_[0]->on_read (sub {
686 $len -= length $_[0]{rbuf};
687 724
688 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
689 or $finish->(undef, { Status => 598, Reason => "Request cancelled by on_body", @pseudo });
690
691 $len > 0
692 or $finish->("", \%hdr);
693 });
694 } else {
695 $_[0]->on_eof (sub {
696 $finish->("", \%hdr);
697 });
698 $_[0]->on_read (sub {
699 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
700 or $finish->(undef, { Status => 598, Reason => "Request cancelled by on_body", @pseudo });
701 });
702 }
703 } else {
704 $_[0]->on_eof (undef);
705
706 if ($len) { 725 if ($len) {
707 $_[0]->on_error (sub { $finish->(undef, { Status => 599, Reason => $_[2], @pseudo }) }); 726 $_[0]->on_error (sub { $finish->(undef, { Status => 599, Reason => $_[2], @pseudo }) });
708 $_[0]->on_read (sub { 727 $_[0]->on_read (sub {
709 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), \%hdr) 728 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), \%hdr)
710 if $len <= length $_[0]{rbuf}; 729 if $len <= length $_[0]{rbuf};
711 }); 730 });
712 } else { 731 } else {
713 $_[0]->on_error (sub { 732 $_[0]->on_error (sub {
714 ($! == Errno::EPIPE || !$!) 733 ($! == Errno::EPIPE || !$!)
715 ? $finish->(delete $_[0]{rbuf}, \%hdr) 734 ? $finish->(delete $_[0]{rbuf}, \%hdr)
716 : $finish->(undef, { Status => 599, Reason => $_[2], @pseudo }); 735 : $finish->(undef, { Status => 599, Reason => $_[2], @pseudo });
717 }); 736 });
718 $_[0]->on_read (sub { }); 737 $_[0]->on_read (sub { });
738 }
719 } 739 }
720 } 740 }
721 } 741 });
722 }); 742 };
723 };
724 743
725 # now handle proxy-CONNECT method 744 # now handle proxy-CONNECT method
726 if ($proxy && $uscheme eq "https") { 745 if ($proxy && $uscheme eq "https") {
727 # oh dear, we have to wrap it into a connect request 746 # oh dear, we have to wrap it into a connect request
728 747
729 # maybe re-use $uauthority with patched port? 748 # maybe re-use $uauthority with patched port?
730 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012Host: $uhost\015\012\015\012"); 749 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012Host: $uhost\015\012\015\012");
731 $state{handle}->push_read (line => $qr_nlnl, sub { 750 $state{handle}->push_read (line => $qr_nlnl, sub {
732 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix 751 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
733 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Invalid proxy connect response ($_[1])", @pseudo })); 752 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Invalid proxy connect response ($_[1])", @pseudo }));
734 753
735 if ($2 == 200) { 754 if ($2 == 200) {
736 $rpath = $upath; 755 $rpath = $upath;
737 &$handle_actual_request; 756 &$handle_actual_request;
738 } else { 757 } else {
739 %state = (); 758 %state = ();
740 $cb->(undef, { Status => $2, Reason => $3, @pseudo }); 759 $cb->(undef, { Status => $2, Reason => $3, @pseudo });
760 }
741 } 761 });
742 });
743 } else { 762 } else {
744 &$handle_actual_request; 763 &$handle_actual_request;
764 }
765
745 } 766 },
746
747 }, $arg{on_prepare} || sub { $timeout }; 767 $arg{on_prepare} || sub { $timeout }
768 );
748 }; 769 };
749 770
750 defined wantarray && AnyEvent::Util::guard { %state = () } 771 defined wantarray && AnyEvent::Util::guard { %state = () }
751} 772}
752 773
787string of the form C<http://host:port> (optionally C<https:...>), croaks 808string of the form C<http://host:port> (optionally C<https:...>), croaks
788otherwise. 809otherwise.
789 810
790To clear an already-set proxy, use C<undef>. 811To clear an already-set proxy, use C<undef>.
791 812
813=item $date = AnyEvent::HTTP::format_date $timestamp
814
815Takes a POSIX timestamp (seconds since the epoch) and formats it as a HTTP
816Date (RFC 2616).
817
818=item $timestamp = AnyEvent::HTTP::parse_date $date
819
820Takes a HTTP Date (RFC 2616) and returns the corresponding POSIX
821timestamp, or C<undef> if the date cannot be parsed.
822
792=item $AnyEvent::HTTP::MAX_RECURSE 823=item $AnyEvent::HTTP::MAX_RECURSE
793 824
794The default value for the C<recurse> request parameter (default: C<10>). 825The default value for the C<recurse> request parameter (default: C<10>).
795 826
796=item $AnyEvent::HTTP::USERAGENT 827=item $AnyEvent::HTTP::USERAGENT
814connections. This number of can be useful for load-leveling. 845connections. This number of can be useful for load-leveling.
815 846
816=back 847=back
817 848
818=cut 849=cut
850
851our @month = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
852our @weekday = qw(Sun Mon Tue Wed Thu Fri Sat);
853
854sub format_date($) {
855 my ($time) = @_;
856
857 # RFC 822/1123 format
858 my ($S, $M, $H, $mday, $mon, $year, $wday, $yday, undef) = gmtime $time;
859
860 sprintf "%s, %02d %s %04d %02d:%02d:%02d GMT",
861 $weekday[$wday], $mday, $month[$mon], $year + 1900,
862 $H, $M, $S;
863}
864
865sub parse_date($) {
866 my ($date) = @_;
867
868 my ($d, $m, $y, $H, $M, $S);
869
870 if ($date =~ /^[A-Z][a-z][a-z], ([0-9][0-9]) ([A-Z][a-z][a-z]) ([0-9][0-9][0-9][0-9]) ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) GMT$/) {
871 # RFC 822/1123, required by RFC 2616
872 ($d, $m, $y, $H, $M, $S) = ($1, $2, $3, $4, $5, $6);
873
874 } elsif ($date =~ /^[A-Z][a-z]+, ([0-9][0-9])-([A-Z][a-z][a-z])-([0-9][0-9]) ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) GMT$/) {
875 # RFC 850
876 ($d, $m, $y, $H, $M, $S) = ($1, $2, $3 < 69 ? $3 + 2000 : $3 + 1900, $4, $5, $6);
877
878 } elsif ($date =~ /^[A-Z][a-z][a-z] ([A-Z][a-z][a-z]) ([0-9 ][0-9]) ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) ([0-9][0-9][0-9][0-9])$/) {
879 # ISO C's asctime
880 ($d, $m, $y, $H, $M, $S) = ($2, $1, $6, $3, $4, $5);
881 }
882 # other formats fail in the loop below
883
884 for (0..11) {
885 if ($m eq $month[$_]) {
886 require Time::Local;
887 return Time::Local::timegm ($S, $M, $H, $d, $_, $y);
888 }
889 }
890
891 undef
892}
819 893
820sub set_proxy($) { 894sub set_proxy($) {
821 if (length $_[0]) { 895 if (length $_[0]) {
822 $_[0] =~ m%^(https?):// ([^:/]+) (?: : (\d*) )?%ix 896 $_[0] =~ m%^(https?):// ([^:/]+) (?: : (\d*) )?%ix
823 or Carp::croak "$_[0]: invalid proxy URL"; 897 or Carp::croak "$_[0]: invalid proxy URL";
830# initialise proxy from environment 904# initialise proxy from environment
831eval { 905eval {
832 set_proxy $ENV{http_proxy}; 906 set_proxy $ENV{http_proxy};
833}; 907};
834 908
909=head2 SOCKS PROXIES
910
911Socks proxies are not directly supported by AnyEvent::HTTP. You can
912compile your perl to support socks, or use an external program such as
913F<socksify> (dante) or F<tsocks> to make your program use a socks proxy
914transparently.
915
916Alternatively, for AnyEvent::HTTP only, you can use your own
917C<tcp_connect> function that does the proxy handshake - here is an example
918that works with socks4a proxies:
919
920 use Errno;
921 use AnyEvent::Util;
922 use AnyEvent::Socket;
923 use AnyEvent::Handle;
924
925 # host, port and username of/for your socks4a proxy
926 my $socks_host = "10.0.0.23";
927 my $socks_port = 9050;
928 my $socks_user = "";
929
930 sub socks4a_connect {
931 my ($host, $port, $connect_cb, $prepare_cb) = @_;
932
933 my $hdl = new AnyEvent::Handle
934 connect => [$socks_host, $socks_port],
935 on_prepare => sub { $prepare_cb->($_[0]{fh}) },
936 on_error => sub { $connect_cb->() },
937 ;
938
939 $hdl->push_write (pack "CCnNZ*Z*", 4, 1, $port, 1, $socks_user, $host);
940
941 $hdl->push_read (chunk => 8, sub {
942 my ($hdl, $chunk) = @_;
943 my ($status, $port, $ipn) = unpack "xCna4", $chunk;
944
945 if ($status == 0x5a) {
946 $connect_cb->($hdl->{fh}, (format_address $ipn) . ":$port");
947 } else {
948 $! = Errno::ENXIO; $connect_cb->();
949 }
950 });
951
952 $hdl
953 }
954
955Use C<socks4a_connect> instead of C<tcp_connect> when doing C<http_request>s,
956possibly after switching off other proxy types:
957
958 AnyEvent::HTTP::set_proxy undef; # usually you do not want other proxies
959
960 http_get 'http://www.google.com', tcp_connect => \&socks4a_connect, sub {
961 my ($data, $headers) = @_;
962 ...
963 };
964
835=head1 SEE ALSO 965=head1 SEE ALSO
836 966
837L<AnyEvent>. 967L<AnyEvent>.
838 968
839=head1 AUTHOR 969=head1 AUTHOR

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines