ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/AnyEvent-HTTP/HTTP.pm
(Generate patch)

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.79 by root, Sat Jan 1 20:01:07 2011 UTC vs.
Revision 1.89 by root, Mon Jan 3 00:23:25 2011 UTC

36 36
37=cut 37=cut
38 38
39package AnyEvent::HTTP; 39package AnyEvent::HTTP;
40 40
41use strict; 41use common::sense;
42no warnings;
43 42
44use Errno (); 43use Errno ();
45 44
46use AnyEvent 5.0 (); 45use AnyEvent 5.0 ();
47use AnyEvent::Util (); 46use AnyEvent::Util ();
58our $MAX_PERSISTENT = 8; 57our $MAX_PERSISTENT = 8;
59our $PERSISTENT_TIMEOUT = 2; 58our $PERSISTENT_TIMEOUT = 2;
60our $TIMEOUT = 300; 59our $TIMEOUT = 300;
61 60
62# changing these is evil 61# changing these is evil
63our $MAX_PERSISTENT_PER_HOST = 0; 62our $MAX_PERSISTENT_PER_HOST = 2;
64our $MAX_PER_HOST = 4; 63our $MAX_PER_HOST = 4;
65 64
66our $PROXY; 65our $PROXY;
67our $ACTIVE = 0; 66our $ACTIVE = 0;
68 67
196=item cookie_jar => $hash_ref 195=item cookie_jar => $hash_ref
197 196
198Passing this parameter enables (simplified) cookie-processing, loosely 197Passing this parameter enables (simplified) cookie-processing, loosely
199based on the original netscape specification. 198based on the original netscape specification.
200 199
201The C<$hash_ref> must be an (initially empty) hash reference which will 200The C<$hash_ref> must be an (initially empty) hash reference which
202get updated automatically. It is possible to save the cookie jar to 201will get updated automatically. It is possible to save the cookie jar
203persistent storage with something like JSON or Storable, but this is not 202to persistent storage with something like JSON or Storable - see the
204recommended, as session-only cookies might survive longer than expected. 203C<AnyEvent::HTTP::cookie_jar_expire> function if you wish to remove
204expired or session-only cookies, and also for documentation on the format
205of the cookie jar.
205 206
206Note that this cookie implementation is not meant to be complete. If 207Note that this cookie implementation is not meant to be complete. If
207you want complete cookie management you have to do that on your 208you want complete cookie management you have to do that on your
208own. C<cookie_jar> is meant as a quick fix to get some cookie-using sites 209own. C<cookie_jar> is meant as a quick fix to get most cookie-using sites
209working. Cookies are a privacy disaster, do not use them unless required 210working. Cookies are a privacy disaster, do not use them unless required
210to. 211to.
211 212
212When cookie processing is enabled, the C<Cookie:> and C<Set-Cookie:> 213When cookie processing is enabled, the C<Cookie:> and C<Set-Cookie:>
213headers will be set and handled by this module, otherwise they will be 214headers will be set and handled by this module, otherwise they will be
378 push @{ $CO_SLOT{$_[0]}[1] }, $_[1]; 379 push @{ $CO_SLOT{$_[0]}[1] }, $_[1];
379 380
380 _slot_schedule $_[0]; 381 _slot_schedule $_[0];
381} 382}
382 383
384#############################################################################
385
386# expire cookies
387sub cookie_jar_expire($;$) {
388 my ($jar, $session_end) = @_;
389
390 %$jar = () if $jar->{version} != 1;
391
392 my $anow = AE::now;
393
394 while (my ($chost, $paths) = each %$jar) {
395 next unless ref $paths;
396
397 while (my ($cpath, $cookies) = each %$paths) {
398 while (my ($cookie, $kv) = each %$cookies) {
399 if (exists $kv->{_expires}) {
400 delete $cookies->{$cookie}
401 if $anow > $kv->{_expires};
402 } elsif ($session_end) {
403 delete $cookies->{$cookie};
404 }
405 }
406
407 delete $paths->{$cpath}
408 unless %$cookies;
409 }
410
411 delete $jar->{$chost}
412 unless %$paths;
413 }
414}
415
383# extract cookies from jar 416# extract cookies from jar
384sub cookie_jar_extract($$$$) { 417sub cookie_jar_extract($$$$) {
385 my ($jar, $uscheme, $uhost, $upath) = @_; 418 my ($jar, $uscheme, $uhost, $upath) = @_;
386 419
387 %$jar = () if $jar->{version} != 1; 420 %$jar = () if $jar->{version} != 1;
403 next unless $cpath eq substr $upath, 0, length $cpath; 436 next unless $cpath eq substr $upath, 0, length $cpath;
404 437
405 while (my ($cookie, $kv) = each %$cookies) { 438 while (my ($cookie, $kv) = each %$cookies) {
406 next if $uscheme ne "https" && exists $kv->{secure}; 439 next if $uscheme ne "https" && exists $kv->{secure};
407 440
408 if (exists $kv->{expires}) { 441 if (exists $kv->{_expires} and AE::now > $kv->{_expires}) {
409 if (AE::now > parse_date ($kv->{expires})) {
410 delete $cookies->{$cookie}; 442 delete $cookies->{$cookie};
411 next; 443 next;
412 }
413 } 444 }
414 445
415 my $value = $kv->{value}; 446 my $value = $kv->{value};
416 447
417 if ($value =~ /[=;,[:space:]]/) { 448 if ($value =~ /[=;,[:space:]]/) {
426 457
427 \@cookies 458 \@cookies
428} 459}
429 460
430# parse set_cookie header into jar 461# parse set_cookie header into jar
431sub cookie_jar_set_cookie($$$) { 462sub cookie_jar_set_cookie($$$$) {
432 my ($jar, $set_cookie, $uhost) = @_; 463 my ($jar, $set_cookie, $uhost, $date) = @_;
464
465 my $anow = int AE::now;
466 my $snow; # server-now
433 467
434 for ($set_cookie) { 468 for ($set_cookie) {
435 # parse NAME=VALUE 469 # parse NAME=VALUE
436 my @kv; 470 my @kv;
437 471
440 while ( 474 while (
441 m{ 475 m{
442 \G\s* 476 \G\s*
443 (?: 477 (?:
444 expires \s*=\s* ([A-Z][a-z][a-z]+,\ [^,;]+) 478 expires \s*=\s* ([A-Z][a-z][a-z]+,\ [^,;]+)
445 | ([^=;,[:space:]]+) \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) ) 479 | ([^=;,[:space:]]+) (?: \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) ) )?
446 ) 480 )
447 }gcxsi 481 }gcxsi
448 ) { 482 ) {
449 my $name = $2; 483 my $name = $2;
450 my $value = $4; 484 my $value = $4;
451 485
452 unless (defined $name) { 486 if (defined $1) {
453 # expires 487 # expires
454 $name = "expires"; 488 $name = "expires";
455 $value = $1; 489 $value = $1;
456 } elsif (!defined $value) { 490 } elsif (defined $3) {
457 # quoted 491 # quoted
458 $value = $3; 492 $value = $3;
459 $value =~ s/\\(.)/$1/gs; 493 $value =~ s/\\(.)/$1/gs;
460 } 494 }
461 495
467 last unless @kv; 501 last unless @kv;
468 502
469 my $name = shift @kv; 503 my $name = shift @kv;
470 my %kv = (value => shift @kv, @kv); 504 my %kv = (value => shift @kv, @kv);
471 505
472 $kv{expires} ||= format_date (AE::now + $kv{"max-age"})
473 if exists $kv{"max-age"}; 506 if (exists $kv{"max-age"}) {
507 $kv{_expires} = $anow + delete $kv{"max-age"};
508 } elsif (exists $kv{expires}) {
509 $snow ||= parse_date ($date) || $anow;
510 $kv{_expires} = $anow + (parse_date (delete $kv{expires}) - $snow);
511 } else {
512 delete $kv{_expires};
513 }
474 514
475 my $cdom; 515 my $cdom;
476 my $cpath = (delete $kv{path}) || "/"; 516 my $cpath = (delete $kv{path}) || "/";
477 517
478 if (exists $kv{domain}) { 518 if (exists $kv{domain}) {
489 $cdom = $uhost; 529 $cdom = $uhost;
490 } 530 }
491 531
492 # store it 532 # store it
493 $jar->{version} = 1; 533 $jar->{version} = 1;
494 $jar->{$cdom}{$cpath}{$name} = \%kv; 534 $jar->{lc $cdom}{$cpath}{$name} = \%kv;
495 535
496 redo if /\G\s*,/gc; 536 redo if /\G\s*,/gc;
497 } 537 }
498} 538}
499 539
566 : return $cb->(undef, { @pseudo, Status => 599, Reason => "Only http and https URL schemes supported" }); 606 : return $cb->(undef, { @pseudo, Status => 599, Reason => "Only http and https URL schemes supported" });
567 607
568 $uauthority =~ /^(?: .*\@ )? ([^\@:]+) (?: : (\d+) )?$/x 608 $uauthority =~ /^(?: .*\@ )? ([^\@:]+) (?: : (\d+) )?$/x
569 or return $cb->(undef, { @pseudo, Status => 599, Reason => "Unparsable URL" }); 609 or return $cb->(undef, { @pseudo, Status => 599, Reason => "Unparsable URL" });
570 610
571 my $uhost = $1; 611 my $uhost = lc $1;
572 $uport = $2 if defined $2; 612 $uport = $2 if defined $2;
573 613
574 $hdr{host} = defined $2 ? "$uhost:$2" : "$uhost" 614 $hdr{host} = defined $2 ? "$uhost:$2" : "$uhost"
575 unless exists $hdr{host}; 615 unless exists $hdr{host};
576 616
595 $rscheme = "http" unless defined $rscheme; 635 $rscheme = "http" unless defined $rscheme;
596 636
597 # don't support https requests over https-proxy transport, 637 # don't support https requests over https-proxy transport,
598 # can't be done with tls as spec'ed, unless you double-encrypt. 638 # can't be done with tls as spec'ed, unless you double-encrypt.
599 $rscheme = "http" if $uscheme eq "https" && $rscheme eq "https"; 639 $rscheme = "http" if $uscheme eq "https" && $rscheme eq "https";
640
641 $rhost = lc $rhost;
642 $rscheme = lc $rscheme;
600 } else { 643 } else {
601 ($rhost, $rport, $rscheme, $rpath) = ($uhost, $uport, $uscheme, $upath); 644 ($rhost, $rport, $rscheme, $rpath) = ($uhost, $uport, $uscheme, $upath);
602 } 645 }
603 646
604 # leave out fragment and query string, just a heuristic 647 # leave out fragment and query string, just a heuristic
606 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"}; 649 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"};
607 650
608 $hdr{"content-length"} = length $arg{body} 651 $hdr{"content-length"} = length $arg{body}
609 if length $arg{body} || $method ne "GET"; 652 if length $arg{body} || $method ne "GET";
610 653
611 $hdr{connection} = "close TE"; #1.1 654 my $idempotent = $method =~ /^(?:GET|HEAD|PUT|DELETE|OPTIONS|TRACE)$/;
655
656 # default value for keepalive is true iff the request is for an idempotent method
657 my $keepalive = exists $arg{keepalive}
658 ? $arg{keepalive}*1
659 : $idempotent ? $PERSISTENT_TIMEOUT : 0;
660
661 $hdr{connection} = ($keepalive ? "" : "close ") . "Te"; #1.1
612 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1 662 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1
613 663
614 my %state = (connect_guard => 1); 664 my %state = (connect_guard => 1);
615 665
616 _get_slot $uhost, sub {
617 $state{slot_guard} = shift;
618
619 return unless $state{connect_guard};
620
621 my $ae_error = 595; # connecting 666 my $ae_error = 595; # connecting
622 667
623 my $connect_cb = sub { 668 # handle actual, non-tunneled, request
624 $state{fh} = shift 669 my $handle_actual_request = sub {
670 $ae_error = 596; # request phase
671
672 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
673
674 # send request
675 $state{handle}->push_write (
676 "$method $rpath HTTP/1.1\015\012"
677 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
678 . "\015\012"
679 . (delete $arg{body})
680 );
681
682 # return if error occured during push_write()
683 return unless %state;
684
685 # reduce memory usage, save a kitten, also re-use it for the response headers.
686 %hdr = ();
687
688 # status line and headers
689 $state{read_response} = sub {
690 for ("$_[1]") {
691 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
692
693 /^HTTP\/0*([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/gxci
694 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid server response" }));
695
696 # 100 Continue handling
697 # should not happen as we don't send expect: 100-continue,
698 # but we handle it just in case.
699 # since we send the request body regardless, if we get an error
700 # we are out of-sync, which we currently do NOT handle correctly.
701 return $state{handle}->push_read (line => $qr_nlnl, $state{read_response})
702 if $2 eq 100;
703
704 push @pseudo,
705 HTTPVersion => $1,
706 Status => $2,
707 Reason => $3,
625 or do { 708 ;
626 my $err = "$!"; 709
710 my $hdr = parse_hdr
711 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Garbled response headers" }));
712
713 %hdr = (%$hdr, @pseudo);
714 }
715
716 # redirect handling
717 # microsoft and other shitheads don't give a shit for following standards,
718 # try to support some common forms of broken Location headers.
719 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
720 $hdr{location} =~ s/^\.\/+//;
721
722 my $url = "$rscheme://$uhost:$uport";
723
724 unless ($hdr{location} =~ s/^\///) {
725 $url .= $upath;
726 $url =~ s/\/[^\/]*$//;
727 }
728
729 $hdr{location} = "$url/$hdr{location}";
730 }
731
732 my $redirect;
733
734 if ($recurse) {
735 my $status = $hdr{Status};
736
737 # industry standard is to redirect POST as GET for
738 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1.
739 # also, the UA should ask the user for 301 and 307 and POST,
740 # industry standard seems to be to simply follow.
741 # we go with the industry standard.
742 if ($status == 301 or $status == 302 or $status == 303) {
743 # HTTP/1.1 is unclear on how to mutate the method
744 $method = "GET" unless $method eq "HEAD";
745 $redirect = 1;
746 } elsif ($status == 307) {
747 $redirect = 1;
748 }
749 }
750
751 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive])
752 my $may_keep_alive = $_[3];
753
754 $state{handle}->destroy if $state{handle};
627 %state = (); 755 %state = ();
628 return $cb->(undef, { @pseudo, Status => $ae_error, Reason => $err }); 756
757 if (defined $_[1]) {
758 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1];
759 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2];
760 }
761
762 # set-cookie processing
763 if ($arg{cookie_jar}) {
764 cookie_jar_set_cookie $arg{cookie_jar}, $hdr{"set-cookie"}, $uhost, $hdr{date};
765 }
766
767 if ($redirect && exists $hdr{location}) {
768 # we ignore any errors, as it is very common to receive
769 # Content-Length != 0 but no actual body
770 # we also access %hdr, as $_[1] might be an erro
771 http_request (
772 $method => $hdr{location},
773 %arg,
774 recurse => $recurse - 1,
775 Redirect => [$_[0], \%hdr],
776 $cb);
777 } else {
778 $cb->($_[0], \%hdr);
779 }
780 };
781
782 $ae_error = 597; # body phase
783
784 my $len = $hdr{"content-length"};
785
786 # body handling, many different code paths
787 # - no body expected
788 # - want_body_handle
789 # - te chunked
790 # - 2x length known (with or without on_body)
791 # - 2x length not known (with or without on_body)
792 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
793 $finish->(undef, 598 => "Request cancelled by on_header");
794 } elsif (
795 $hdr{Status} =~ /^(?:1..|204|205|304)$/
796 or $method eq "HEAD"
797 or (defined $len && $len == 0) # == 0, not !, because "0 " is true
798 ) {
799 # no body
800 $finish->("", undef, undef, 1);
801
802 } elsif (!$redirect && $arg{want_body_handle}) {
803 $_[0]->on_eof (undef);
804 $_[0]->on_error (undef);
805 $_[0]->on_read (undef);
806
807 $finish->(delete $state{handle});
808
809 } elsif ($hdr{"transfer-encoding"} =~ /\bchunked\b/i) {
810 my $cl = 0;
811 my $body = undef;
812 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
813
814 $state{read_chunk} = sub {
815 $_[1] =~ /^([0-9a-fA-F]+)/
816 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
817
818 my $len = hex $1;
819
820 if ($len) {
821 $cl += $len;
822
823 $_[0]->push_read (chunk => $len, sub {
824 $on_body->($_[1], \%hdr)
825 or return $finish->(undef, 598 => "Request cancelled by on_body");
826
827 $_[0]->push_read (line => sub {
828 length $_[1]
829 and return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
830 $_[0]->push_read (line => $state{read_chunk});
831 });
832 });
833 } else {
834 $hdr{"content-length"} ||= $cl;
835
836 $_[0]->push_read (line => $qr_nlnl, sub {
837 if (length $_[1]) {
838 for ("$_[1]") {
839 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
840
841 my $hdr = parse_hdr
842 or return $finish->(undef, $ae_error => "Garbled response trailers");
843
844 %hdr = (%hdr, %$hdr);
845 }
846 }
847
848 $finish->($body, undef, undef, 1);
849 });
850 }
629 }; 851 };
630 852
853 $_[0]->push_read (line => $state{read_chunk});
854
855 } elsif ($arg{on_body}) {
856 if (defined $len) {
857 $_[0]->on_read (sub {
858 $len -= length $_[0]{rbuf};
859
860 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
861 or return $finish->(undef, 598 => "Request cancelled by on_body");
862
863 $len > 0
864 or $finish->("", undef, undef, 1);
865 });
866 } else {
867 $_[0]->on_eof (sub {
868 $finish->("");
869 });
870 $_[0]->on_read (sub {
871 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
872 or $finish->(undef, 598 => "Request cancelled by on_body");
873 });
874 }
875 } else {
876 $_[0]->on_eof (undef);
877
878 if (defined $len) {
879 $_[0]->on_read (sub {
880 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), undef, undef, 1)
881 if $len <= length $_[0]{rbuf};
882 });
883 } else {
884 $_[0]->on_error (sub {
885 ($! == Errno::EPIPE || !$!)
886 ? $finish->(delete $_[0]{rbuf})
887 : $finish->(undef, $ae_error => $_[2]);
888 });
889 $_[0]->on_read (sub { });
890 }
891 }
892 };
893
894 $state{handle}->push_read (line => $qr_nlnl, $state{read_response});
895 };
896
897 my $connect_cb = sub {
898 $state{fh} = shift
899 or do {
900 my $err = "$!";
901 %state = ();
902 return $cb->(undef, { @pseudo, Status => $ae_error, Reason => $err });
903 };
904
631 return unless delete $state{connect_guard}; 905 return unless delete $state{connect_guard};
632 906
633 # get handle 907 # get handle
634 $state{handle} = new AnyEvent::Handle 908 $state{handle} = new AnyEvent::Handle
635 fh => $state{fh}, 909 fh => $state{fh},
636 peername => $rhost, 910 peername => $rhost,
637 tls_ctx => $arg{tls_ctx}, 911 tls_ctx => $arg{tls_ctx},
638 # these need to be reconfigured on keepalive handles 912 # these need to be reconfigured on keepalive handles
639 timeout => $timeout, 913 timeout => $timeout,
640 on_error => sub { 914 on_error => sub {
641 %state = (); 915 %state = ();
642 $cb->(undef, { @pseudo, Status => $ae_error, Reason => $_[2] }); 916 $cb->(undef, { @pseudo, Status => $ae_error, Reason => $_[2] });
643 }, 917 },
644 on_eof => sub { 918 on_eof => sub {
645 %state = (); 919 %state = ();
646 $cb->(undef, { @pseudo, Status => $ae_error, Reason => "Unexpected end-of-file" }); 920 $cb->(undef, { @pseudo, Status => $ae_error, Reason => "Unexpected end-of-file" });
647 }, 921 },
648 ; 922 ;
649 923
650 # limit the number of persistent connections 924 # limit the number of persistent connections
651 # keepalive not yet supported 925 # keepalive not yet supported
652# if ($KA_COUNT{$_[1]} < $MAX_PERSISTENT_PER_HOST) { 926# if ($KA_COUNT{$_[1]} < $MAX_PERSISTENT_PER_HOST) {
653# ++$KA_COUNT{$_[1]}; 927# ++$KA_COUNT{$_[1]};
654# $state{handle}{ka_count_guard} = AnyEvent::Util::guard { 928# $state{handle}{ka_count_guard} = AnyEvent::Util::guard {
655# --$KA_COUNT{$_[1]} 929# --$KA_COUNT{$_[1]}
656# }; 930# };
657# $hdr{connection} = "keep-alive"; 931# $hdr{connection} = "keep-alive";
658# } 932# }
659 933
660 $state{handle}->starttls ("connect") if $rscheme eq "https"; 934 $state{handle}->starttls ("connect") if $rscheme eq "https";
661 935
662 # handle actual, non-tunneled, request
663 my $handle_actual_request = sub {
664 $ae_error = 596; # request phase
665
666 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
667
668 # send request
669 $state{handle}->push_write (
670 "$method $rpath HTTP/1.1\015\012"
671 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
672 . "\015\012"
673 . (delete $arg{body})
674 );
675
676 # return if error occured during push_write()
677 return unless %state;
678
679 %hdr = (); # reduce memory usage, save a kitten, also make it possible to re-use
680
681 # status line and headers
682 $state{read_response} = sub {
683 for ("$_[1]") {
684 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
685
686 /^HTTP\/0*([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/gxci
687 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid server response" }));
688
689 # 100 Continue handling
690 # should not happen as we don't send expect: 100-continue,
691 # but we handle it just in case.
692 # since we send the request body regardless, if we get an error
693 # we are out of-sync, which we currently do NOT handle correctly.
694 return $state{handle}->push_read (line => $qr_nlnl, $state{read_response})
695 if $2 eq 100;
696
697 push @pseudo,
698 HTTPVersion => $1,
699 Status => $2,
700 Reason => $3,
701 ;
702
703 my $hdr = parse_hdr
704 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Garbled response headers" }));
705
706 %hdr = (%$hdr, @pseudo);
707 }
708
709 # redirect handling
710 # microsoft and other shitheads don't give a shit for following standards,
711 # try to support some common forms of broken Location headers.
712 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
713 $hdr{location} =~ s/^\.\/+//;
714
715 my $url = "$rscheme://$uhost:$uport";
716
717 unless ($hdr{location} =~ s/^\///) {
718 $url .= $upath;
719 $url =~ s/\/[^\/]*$//;
720 }
721
722 $hdr{location} = "$url/$hdr{location}";
723 }
724
725 my $redirect;
726
727 if ($recurse) {
728 my $status = $hdr{Status};
729
730 # industry standard is to redirect POST as GET for
731 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1.
732 # also, the UA should ask the user for 301 and 307 and POST,
733 # industry standard seems to be to simply follow.
734 # we go with the industry standard.
735 if ($status == 301 or $status == 302 or $status == 303) {
736 # HTTP/1.1 is unclear on how to mutate the method
737 $method = "GET" unless $method eq "HEAD";
738 $redirect = 1;
739 } elsif ($status == 307) {
740 $redirect = 1;
741 }
742 }
743
744 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive])
745 my $may_keep_alive = $_[3];
746
747 $state{handle}->destroy if $state{handle};
748 %state = ();
749
750 if (defined $_[1]) {
751 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1];
752 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2];
753 }
754
755 # set-cookie processing
756 if ($arg{cookie_jar}) {
757 cookie_jar_set_cookie $arg{cookie_jar}, $hdr{"set-cookie"}, $uhost;
758 }
759
760 if ($redirect && exists $hdr{location}) {
761 # we ignore any errors, as it is very common to receive
762 # Content-Length != 0 but no actual body
763 # we also access %hdr, as $_[1] might be an erro
764 http_request (
765 $method => $hdr{location},
766 %arg,
767 recurse => $recurse - 1,
768 Redirect => [$_[0], \%hdr],
769 $cb);
770 } else {
771 $cb->($_[0], \%hdr);
772 }
773 };
774
775 $ae_error = 597; # body phase
776
777 my $len = $hdr{"content-length"};
778
779 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
780 $finish->(undef, 598 => "Request cancelled by on_header");
781 } elsif (
782 $hdr{Status} =~ /^(?:1..|204|205|304)$/
783 or $method eq "HEAD"
784 or (defined $len && !$len)
785 ) {
786 # no body
787 $finish->("", undef, undef, 1);
788 } else {
789 # body handling, many different code paths
790 # - no body expected
791 # - want_body_handle
792 # - te chunked
793 # - 2x length known (with or without on_body)
794 # - 2x length not known (with or without on_body)
795 if (!$redirect && $arg{want_body_handle}) {
796 $_[0]->on_eof (undef);
797 $_[0]->on_error (undef);
798 $_[0]->on_read (undef);
799
800 $finish->(delete $state{handle});
801
802 } elsif ($hdr{"transfer-encoding"} =~ /\bchunked\b/i) {
803 my $cl = 0;
804 my $body = undef;
805 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
806
807 my $read_chunk; $read_chunk = sub {
808 $_[1] =~ /^([0-9a-fA-F]+)/
809 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
810
811 my $len = hex $1;
812
813 if ($len) {
814 $cl += $len;
815
816 $_[0]->push_read (chunk => $len, sub {
817 $on_body->($_[1], \%hdr)
818 or return $finish->(undef, 598 => "Request cancelled by on_body");
819
820 $_[0]->push_read (line => sub {
821 length $_[1]
822 and return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
823 $_[0]->push_read (line => $read_chunk);
824 });
825 });
826 } else {
827 $hdr{"content-length"} ||= $cl;
828
829 $_[0]->push_read (line => $qr_nlnl, sub {
830 if (length $_[1]) {
831 for ("$_[1]") {
832 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
833
834 my $hdr = parse_hdr
835 or return $finish->(undef, $ae_error => "Garbled response trailers");
836
837 %hdr = (%hdr, %$hdr);
838 }
839 }
840
841 $finish->($body, undef, undef, 1);
842 });
843 }
844 };
845
846 $_[0]->push_read (line => $read_chunk);
847
848 } elsif ($arg{on_body}) {
849 if ($len) {
850 $_[0]->on_read (sub {
851 $len -= length $_[0]{rbuf};
852
853 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
854 or return $finish->(undef, 598 => "Request cancelled by on_body");
855
856 $len > 0
857 or $finish->("", undef, undef, 1);
858 });
859 } else {
860 $_[0]->on_eof (sub {
861 $finish->("");
862 });
863 $_[0]->on_read (sub {
864 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
865 or $finish->(undef, 598 => "Request cancelled by on_body");
866 });
867 }
868 } else {
869 $_[0]->on_eof (undef);
870
871 if ($len) {
872 $_[0]->on_read (sub {
873 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), undef, undef, 1)
874 if $len <= length $_[0]{rbuf};
875 });
876 } else {
877 $_[0]->on_error (sub {
878 ($! == Errno::EPIPE || !$!)
879 ? $finish->(delete $_[0]{rbuf})
880 : $finish->(undef, $ae_error => $_[2]);
881 });
882 $_[0]->on_read (sub { });
883 }
884 }
885 }
886 };
887
888 $state{handle}->push_read (line => $qr_nlnl, $state{read_response});
889 };
890
891 # now handle proxy-CONNECT method 936 # now handle proxy-CONNECT method
892 if ($proxy && $uscheme eq "https") { 937 if ($proxy && $uscheme eq "https") {
893 # oh dear, we have to wrap it into a connect request 938 # oh dear, we have to wrap it into a connect request
894 939
895 # maybe re-use $uauthority with patched port? 940 # maybe re-use $uauthority with patched port?
896 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012Host: $uhost\015\012\015\012"); 941 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012\015\012");
897 $state{handle}->push_read (line => $qr_nlnl, sub { 942 $state{handle}->push_read (line => $qr_nlnl, sub {
898 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix 943 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
899 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" })); 944 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" }));
900 945
901 if ($2 == 200) { 946 if ($2 == 200) {
902 $rpath = $upath; 947 $rpath = $upath;
903 &$handle_actual_request; 948 $handle_actual_request->();
904 } else { 949 } else {
905 %state = (); 950 %state = ();
906 $cb->(undef, { @pseudo, Status => $2, Reason => $3 }); 951 $cb->(undef, { @pseudo, Status => $2, Reason => $3 });
907 }
908 }); 952 }
909 } else {
910 &$handle_actual_request;
911 } 953 });
954 } else {
955 $handle_actual_request->();
912 }; 956 }
957 };
958
959 _get_slot $uhost, sub {
960 $state{slot_guard} = shift;
961
962 return unless $state{connect_guard};
913 963
914 my $tcp_connect = $arg{tcp_connect} 964 my $tcp_connect = $arg{tcp_connect}
915 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect }; 965 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
916 966
917 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout }); 967 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout });
918
919 }; 968 };
920 969
921 defined wantarray && AnyEvent::Util::guard { %state = () } 970 defined wantarray && AnyEvent::Util::guard { %state = () }
922} 971}
923 972
957Sets the default proxy server to use. The proxy-url must begin with a 1006Sets the default proxy server to use. The proxy-url must begin with a
958string of the form C<http://host:port> (optionally C<https:...>), croaks 1007string of the form C<http://host:port> (optionally C<https:...>), croaks
959otherwise. 1008otherwise.
960 1009
961To clear an already-set proxy, use C<undef>. 1010To clear an already-set proxy, use C<undef>.
1011
1012=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
1013
1014Remove all cookies from the cookie jar that have been expired. If
1015C<$session_end> is given and true, then additionally remove all session
1016cookies.
1017
1018You should call this function (with a true C<$session_end>) before you
1019save cookies to disk, and you should call this function after loading them
1020again. If you have a long-running program you can additonally call this
1021function from time to time.
1022
1023A cookie jar is initially an empty hash-reference that is managed by this
1024module. It's format is subject to change, but currently it is like this:
1025
1026The key C<version> has to contain C<1>, otherwise the hash gets
1027emptied. All other keys are hostnames or IP addresses pointing to
1028hash-references. The key for these inner hash references is the
1029server path for which this cookie is meant, and the values are again
1030hash-references. The keys of those hash-references is the cookie name, and
1031the value, you guessed it, is another hash-reference, this time with the
1032key-value pairs from the cookie, except for C<expires> and C<max-age>,
1033which have been replaced by a C<_expires> key that contains the cookie
1034expiry timestamp.
1035
1036Here is an example of a cookie jar with a single cookie, so you have a
1037chance of understanding the above paragraph:
1038
1039 {
1040 version => 1,
1041 "10.0.0.1" => {
1042 "/" => {
1043 "mythweb_id" => {
1044 _expires => 1293917923,
1045 value => "ooRung9dThee3ooyXooM1Ohm",
1046 },
1047 },
1048 },
1049 }
962 1050
963=item $date = AnyEvent::HTTP::format_date $timestamp 1051=item $date = AnyEvent::HTTP::format_date $timestamp
964 1052
965Takes a POSIX timestamp (seconds since the epoch) and formats it as a HTTP 1053Takes a POSIX timestamp (seconds since the epoch) and formats it as a HTTP
966Date (RFC 2616). 1054Date (RFC 2616).

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines