/cvs/AnyEvent-HTTP/HTTP.pm

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.75 by root, Sat Jan 1 00:08:51 2011 UTC vs.
Revision 1.82 by root, Sun Jan 2 04:50:40 2011 UTC

122 122
123If the server sends a header multiple times, then their contents will be 123If the server sends a header multiple times, then their contents will be
124joined together with a comma (C<,>), as per the HTTP spec. 124joined together with a comma (C<,>), as per the HTTP spec.
125 125
126If an internal error occurs, such as not being able to resolve a hostname, 126If an internal error occurs, such as not being able to resolve a hostname,
127then C<$data> will be C<undef>, C<< $headers->{Status} >> will be C<59x> 127then C<$data> will be C<undef>, C<< $headers->{Status} >> will be
128(usually C<599>) and the C<Reason> pseudo-header will contain an error 128C<590>-C<599> and the C<Reason> pseudo-header will contain an error
129message. 129message. Currently the following status codes are used:
130
131=over 4
132
133=item 595 - errors during connection establishment, proxy handshake.
134
135=item 596 - errors during TLS negotiation, request sending and header processing.
136
137=item 597 - errors during body receiving or processing.
138
139=item 598 - user aborted request via C<on_header> or C<on_body>.
140
141=item 599 - other, usually nonretryable, errors (garbled URL etc.).
142
143=back
130 144
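As an illustration of how a caller might act on the status codes listed above, here is a minimal sketch. The helper name C<classify_status> and the retry hints are assumptions made for this example and are not part of AnyEvent::HTTP; only the meanings of 595 to 599 are taken from the documentation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::HTTP): map the documented
# pseudo status codes to a coarse phase name and a retry hint.
# The retry hints are an assumption of this sketch, not module policy.
my %PHASE = (
    595 => [connect => 1],   # connection establishment, proxy handshake
    596 => [request => 1],   # TLS negotiation, request sending, header processing
    597 => [body    => 1],   # body receiving or processing
    598 => [aborted => 0],   # user aborted via on_header or on_body
    599 => [other   => 0],   # other, usually non-retryable errors
);

sub classify_status {
    my ($status) = @_;

    # anything outside 590-599 was sent by the server, not generated locally
    return (server => 0) unless $status =~ /^59[0-9]$/;

    # 59x codes without a documented meaning are treated like 599
    return @{ $PHASE{$status} || [other => 0] };
}

my ($phase, $retry) = classify_status(596);
print "phase=$phase retry=$retry\n";
```

Inside a request callback, such a helper would be fed C<< $hdr->{Status} >>; whether a retry is actually appropriate depends on the request and is left to the caller.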
131A typical callback might look like this: 145A typical callback might look like this:
132 146
133 sub { 147 sub {
134 my ($body, $hdr) = @_; 148 my ($body, $hdr) = @_;
182=item cookie_jar => $hash_ref 196=item cookie_jar => $hash_ref
183 197
184Passing this parameter enables (simplified) cookie-processing, loosely 198Passing this parameter enables (simplified) cookie-processing, loosely
185based on the original netscape specification. 199based on the original netscape specification.
186 200
187The C<$hash_ref> must be an (initially empty) hash reference which will 201The C<$hash_ref> must be an (initially empty) hash reference which
188get updated automatically. It is possible to save the cookie jar to 202will get updated automatically. It is possible to save the cookie jar
189persistent storage with something like JSON or Storable, but this is not 203to persistent storage with something like JSON or Storable - see the
190recommended, as session-only cookies might survive longer than expected. 204C<AnyEvent::HTTP::cookie_jar_expire> function if you wish to remove
205expired or session-only cookies, and also for documentation on the format
206of the cookie jar.
191 207
192Note that this cookie implementation is not meant to be complete. If 208Note that this cookie implementation is not meant to be complete. If
193you want complete cookie management you have to do that on your 209you want complete cookie management you have to do that on your
194own. C<cookie_jar> is meant as a quick fix to get some cookie-using sites 210own. C<cookie_jar> is meant as a quick fix to get most cookie-using sites
195working. Cookies are a privacy disaster, do not use them unless required 211working. Cookies are a privacy disaster, do not use them unless required
196to. 212to.
197 213
198When cookie processing is enabled, the C<Cookie:> and C<Set-Cookie:> 214When cookie processing is enabled, the C<Cookie:> and C<Set-Cookie:>
199headers will be set and handled by this module, otherwise they will be 215headers will be set and handled by this module, otherwise they will be
364 push @{ $CO_SLOT{$_[0]}[1] }, $_[1]; 380 push @{ $CO_SLOT{$_[0]}[1] }, $_[1];
365 381
366 _slot_schedule $_[0]; 382 _slot_schedule $_[0];
367} 383}
368 384
385#############################################################################
386
387# expire cookies
388sub cookie_jar_expire($;$) {
389 my ($jar, $session_end) = @_;
390
391 %$jar = () if $jar->{version} != 1;
392
393 my $anow = AE::now;
394
395 while (my ($chost, $paths) = each %$jar) {
396 next unless ref $paths;
397
398 while (my ($cpath, $cookies) = each %$paths) {
399 while (my ($cookie, $kv) = each %$cookies) {
400 if (exists $kv->{_expires}) {
401 delete $cookies->{$cookie}
402 if $anow > $kv->{_expires};
403 } elsif ($session_end) {
404 delete $cookies->{$cookie};
405 }
406 }
407
408 delete $paths->{$cpath}
409 unless %$cookies;
410 }
411
412 delete $jar->{$chost}
413 unless %$paths;
414 }
415}
416
369# extract cookies from jar 417# extract cookies from jar
370sub cookie_jar_extract($$$$) { 418sub cookie_jar_extract($$$$) {
371 my ($jar, $uscheme, $uhost, $upath) = @_; 419 my ($jar, $uscheme, $uhost, $upath) = @_;
372 420
373 %$jar = () if $jar->{version} != 1; 421 %$jar = () if $jar->{version} != 1;
389 next unless $cpath eq substr $upath, 0, length $cpath; 437 next unless $cpath eq substr $upath, 0, length $cpath;
390 438
391 while (my ($cookie, $kv) = each %$cookies) { 439 while (my ($cookie, $kv) = each %$cookies) {
392 next if $uscheme ne "https" && exists $kv->{secure}; 440 next if $uscheme ne "https" && exists $kv->{secure};
393 441
394 if (exists $kv->{expires}) { 442 if (exists $kv->{_expires} and AE::now > $kv->{_expires}) {
395 if (AE::now > parse_date ($kv->{expires})) {
396 delete $cookies->{$cookie}; 443 delete $cookies->{$cookie};
397 next; 444 next;
398 }
399 } 445 }
400 446
401 my $value = $kv->{value}; 447 my $value = $kv->{value};
402 448
403 if ($value =~ /[=;,[:space:]]/) { 449 if ($value =~ /[=;,[:space:]]/) {
412 458
413 \@cookies 459 \@cookies
414} 460}
415 461
416# parse set_cookie header into jar 462# parse set_cookie header into jar
417sub cookie_jar_set_cookie($$$) { 463sub cookie_jar_set_cookie($$$$) {
418 my ($jar, $set_cookie, $uhost) = @_; 464 my ($jar, $set_cookie, $uhost, $date) = @_;
465
466 my $anow = int AE::now;
467 my $snow; # server-now
419 468
420 for ($set_cookie) { 469 for ($set_cookie) {
421 # parse NAME=VALUE 470 # parse NAME=VALUE
422 my @kv; 471 my @kv;
423 472
473 # expires is not http-compliant in the original cookie-spec,
474 # we support the official date format and some extensions
424 while ( 475 while (
425 m{ 476 m{
426 \G\s* 477 \G\s*
427 (?: 478 (?:
428 expires \s*=\s* ([A-Z][a-z][a-z],\ [^,;]+) 479 expires \s*=\s* ([A-Z][a-z][a-z]+,\ [^,;]+)
429 | ([^=;,[:space:]]+) \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) ) 480 | ([^=;,[:space:]]+) (?: \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) ) )?
430 ) 481 )
431 }gcxsi 482 }gcxsi
432 ) { 483 ) {
433 my $name = $2; 484 my $name = $2;
434 my $value = $4; 485 my $value = $4;
435 486
436 unless (defined $name) { 487 if (defined $1) {
437 # expires 488 # expires
438 $name = "expires"; 489 $name = "expires";
439 $value = $1; 490 $value = $1;
440 } elsif (!defined $value) { 491 } elsif (defined $3) {
441 # quoted 492 # quoted
442 $value = $3; 493 $value = $3;
443 $value =~ s/\\(.)/$1/gs; 494 $value =~ s/\\(.)/$1/gs;
444 } 495 }
445 496
451 last unless @kv; 502 last unless @kv;
452 503
453 my $name = shift @kv; 504 my $name = shift @kv;
454 my %kv = (value => shift @kv, @kv); 505 my %kv = (value => shift @kv, @kv);
455 506
456 $kv{expires} ||= format_date (AE::now + $kv{"max-age"})
457 if exists $kv{"max-age"}; 507 if (exists $kv{"max-age"}) {
508 $kv{_expires} = $anow + delete $kv{"max-age"};
509 } elsif (exists $kv{expires}) {
510 $snow ||= parse_date ($date) || $anow;
511 $kv{_expires} = $anow + (parse_date (delete $kv{expires}) - $snow);
512 } else {
513 delete $kv{_expires};
514 }
458 515
459 my $cdom; 516 my $cdom;
460 my $cpath = (delete $kv{path}) || "/"; 517 my $cpath = (delete $kv{path}) || "/";
461 518
462 if (exists $kv{domain}) { 519 if (exists $kv{domain}) {
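The new expiry handling in the hunk above stores an absolute local timestamp in C<_expires> and, for C<expires>, measures the cookie lifetime against the server's C<Date> header rather than the local clock. The arithmetic can be sketched on its own as follows; the function name C<local_expiry> and the sample timestamps are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the arithmetic in the diff.
# All arguments are POSIX timestamps.
sub local_expiry {
    my ($local_now, $server_now, $server_expires) = @_;

    # fall back to the local clock when no usable Date header was seen
    $server_now ||= $local_now;

    # lifetime as seen by the server, re-based onto the local clock
    return $local_now + ($server_expires - $server_now);
}

# the server clock runs 300 seconds ahead and grants a 600 second lifetime
my $expiry = local_expiry(1_000, 1_300, 1_900);
print "$expiry\n";
```

Re-basing the lifetime this way keeps a cookie from expiring early or late when the client and server clocks disagree.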
600 _get_slot $uhost, sub { 657 _get_slot $uhost, sub {
601 $state{slot_guard} = shift; 658 $state{slot_guard} = shift;
602 659
603 return unless $state{connect_guard}; 660 return unless $state{connect_guard};
604 661
662 my $ae_error = 595; # connecting
663
664 # handle actual, non-tunneled, request
665 my $handle_actual_request = sub {
666 $ae_error = 596; # request phase
667
668 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
669
670 # send request
671 $state{handle}->push_write (
672 "$method $rpath HTTP/1.1\015\012"
673 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
674 . "\015\012"
675 . (delete $arg{body})
676 );
677
678 # return if error occurred during push_write()
679 return unless %state;
680
681 %hdr = (); # reduce memory usage, save a kitten, also make it possible to re-use
682
683 # status line and headers
684 $state{read_response} = sub {
685 for ("$_[1]") {
686 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
687
688 /^HTTP\/0*([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/gxci
689 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid server response" }));
690
691 # 100 Continue handling
692 # should not happen as we don't send expect: 100-continue,
693 # but we handle it just in case.
694 # since we send the request body regardless, if we get an error
695 # we are out of-sync, which we currently do NOT handle correctly.
696 return $state{handle}->push_read (line => $qr_nlnl, $state{read_response})
697 if $2 eq 100;
698
699 push @pseudo,
700 HTTPVersion => $1,
701 Status => $2,
702 Reason => $3,
703 ;
704
705 my $hdr = parse_hdr
706 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Garbled response headers" }));
707
708 %hdr = (%$hdr, @pseudo);
709 }
710
711 # redirect handling
712 # microsoft and other shitheads don't give a shit for following standards,
713 # try to support some common forms of broken Location headers.
714 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
715 $hdr{location} =~ s/^\.\/+//;
716
717 my $url = "$rscheme://$uhost:$uport";
718
719 unless ($hdr{location} =~ s/^\///) {
720 $url .= $upath;
721 $url =~ s/\/[^\/]*$//;
722 }
723
724 $hdr{location} = "$url/$hdr{location}";
725 }
726
727 my $redirect;
728
729 if ($recurse) {
730 my $status = $hdr{Status};
731
732 # industry standard is to redirect POST as GET for
733 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1.
734 # also, the UA should ask the user for 301 and 307 and POST,
735 # industry standard seems to be to simply follow.
736 # we go with the industry standard.
737 if ($status == 301 or $status == 302 or $status == 303) {
738 # HTTP/1.1 is unclear on how to mutate the method
739 $method = "GET" unless $method eq "HEAD";
740 $redirect = 1;
741 } elsif ($status == 307) {
742 $redirect = 1;
743 }
744 }
745
746 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive])
747 my $may_keep_alive = $_[3];
748
749 $state{handle}->destroy if $state{handle};
750 %state = ();
751
752 if (defined $_[1]) {
753 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1];
754 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2];
755 }
756
757 # set-cookie processing
758 if ($arg{cookie_jar}) {
759 cookie_jar_set_cookie $arg{cookie_jar}, $hdr{"set-cookie"}, $uhost, $hdr{date};
760 }
761
762 if ($redirect && exists $hdr{location}) {
763 # we ignore any errors, as it is very common to receive
764 # Content-Length != 0 but no actual body
765 # we also access %hdr, as $_[1] might be an error
766 http_request (
767 $method => $hdr{location},
768 %arg,
769 recurse => $recurse - 1,
770 Redirect => [$_[0], \%hdr],
771 $cb);
772 } else {
773 $cb->($_[0], \%hdr);
774 }
775 };
776
777 $ae_error = 597; # body phase
778
779 my $len = $hdr{"content-length"};
780
781 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
782 $finish->(undef, 598 => "Request cancelled by on_header");
783 } elsif (
784 $hdr{Status} =~ /^(?:1..|204|205|304)$/
785 or $method eq "HEAD"
786 or (defined $len && !$len)
787 ) {
788 # no body
789 $finish->("", undef, undef, 1);
790 } else {
791 # body handling, many different code paths
792 # - no body expected
793 # - want_body_handle
794 # - te chunked
795 # - 2x length known (with or without on_body)
796 # - 2x length not known (with or without on_body)
797 if (!$redirect && $arg{want_body_handle}) {
798 $_[0]->on_eof (undef);
799 $_[0]->on_error (undef);
800 $_[0]->on_read (undef);
801
802 $finish->(delete $state{handle});
803
804 } elsif ($hdr{"transfer-encoding"} =~ /\bchunked\b/i) {
805 my $cl = 0;
806 my $body = undef;
807 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
808
809 $state{read_chunk} = sub {
810 $_[1] =~ /^([0-9a-fA-F]+)/
811 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
812
813 my $len = hex $1;
814
815 if ($len) {
816 $cl += $len;
817
818 $_[0]->push_read (chunk => $len, sub {
819 $on_body->($_[1], \%hdr)
820 or return $finish->(undef, 598 => "Request cancelled by on_body");
821
822 $_[0]->push_read (line => sub {
823 length $_[1]
824 and return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
825 $_[0]->push_read (line => $state{read_chunk});
826 });
827 });
828 } else {
829 $hdr{"content-length"} ||= $cl;
830
831 $_[0]->push_read (line => $qr_nlnl, sub {
832 if (length $_[1]) {
833 for ("$_[1]") {
834 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
835
836 my $hdr = parse_hdr
837 or return $finish->(undef, $ae_error => "Garbled response trailers");
838
839 %hdr = (%hdr, %$hdr);
840 }
841 }
842
843 $finish->($body, undef, undef, 1);
844 });
845 }
846 };
847
848 $_[0]->push_read (line => $state{read_chunk});
849
850 } elsif ($arg{on_body}) {
851 if ($len) {
852 $_[0]->on_read (sub {
853 $len -= length $_[0]{rbuf};
854
855 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
856 or return $finish->(undef, 598 => "Request cancelled by on_body");
857
858 $len > 0
859 or $finish->("", undef, undef, 1);
860 });
861 } else {
862 $_[0]->on_eof (sub {
863 $finish->("");
864 });
865 $_[0]->on_read (sub {
866 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
867 or $finish->(undef, 598 => "Request cancelled by on_body");
868 });
869 }
870 } else {
871 $_[0]->on_eof (undef);
872
873 if ($len) {
874 $_[0]->on_read (sub {
875 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), undef, undef, 1)
876 if $len <= length $_[0]{rbuf};
877 });
878 } else {
879 $_[0]->on_error (sub {
880 ($! == Errno::EPIPE || !$!)
881 ? $finish->(delete $_[0]{rbuf})
882 : $finish->(undef, $ae_error => $_[2]);
883 });
884 $_[0]->on_read (sub { });
885 }
886 }
887 }
888 };
889
890 $state{handle}->push_read (line => $qr_nlnl, $state{read_response});
891 };
892
605 my $connect_cb = sub { 893 my $connect_cb = sub {
606 $state{fh} = shift 894 $state{fh} = shift
607 or do { 895 or do {
608 my $err = "$!"; 896 my $err = "$!";
609 %state = (); 897 %state = ();
610 return $cb->(undef, { @pseudo, Status => 599, Reason => $err }); 898 return $cb->(undef, { @pseudo, Status => $ae_error, Reason => $err });
611 }; 899 };
612 900
613 return unless delete $state{connect_guard}; 901 return unless delete $state{connect_guard};
614 902
615 # get handle 903 # get handle
619 tls_ctx => $arg{tls_ctx}, 907 tls_ctx => $arg{tls_ctx},
620 # these need to be reconfigured on keepalive handles 908 # these need to be reconfigured on keepalive handles
621 timeout => $timeout, 909 timeout => $timeout,
622 on_error => sub { 910 on_error => sub {
623 %state = (); 911 %state = ();
624 $cb->(undef, { @pseudo, Status => 599, Reason => $_[2] }); 912 $cb->(undef, { @pseudo, Status => $ae_error, Reason => $_[2] });
625 }, 913 },
626 on_eof => sub { 914 on_eof => sub {
627 %state = (); 915 %state = ();
628 $cb->(undef, { @pseudo, Status => 599, Reason => "Unexpected end-of-file" }); 916 $cb->(undef, { @pseudo, Status => $ae_error, Reason => "Unexpected end-of-file" });
629 }, 917 },
630 ; 918 ;
631 919
632 # limit the number of persistent connections 920 # limit the number of persistent connections
633 # keepalive not yet supported 921 # keepalive not yet supported
639# $hdr{connection} = "keep-alive"; 927# $hdr{connection} = "keep-alive";
640# } 928# }
641 929
642 $state{handle}->starttls ("connect") if $rscheme eq "https"; 930 $state{handle}->starttls ("connect") if $rscheme eq "https";
643 931
644 # handle actual, non-tunneled, request
645 my $handle_actual_request = sub {
646 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
647
648 # send request
649 $state{handle}->push_write (
650 "$method $rpath HTTP/1.1\015\012"
651 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
652 . "\015\012"
653 . (delete $arg{body})
654 );
655
656 # return if error occurred during push_write()
657 return unless %state;
658
659 %hdr = (); # reduce memory usage, save a kitten, also make it possible to re-use
660
661 # status line and headers
662 $state{read_response} = sub {
663 for ("$_[1]") {
664 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
665
666 /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/igxc
667 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid server response" }));
668
669 # 100 Continue handling
670 # should not happen as we don't send expect: 100-continue,
671 # but we handle it just in case.
672 # since we send the request body regardless, if we get an error
673 # we are out of-sync, which we currently do NOT handle correctly.
674 return $state{handle}->push_read (line => $qr_nlnl, $state{read_response})
675 if $2 eq 100;
676
677 push @pseudo,
678 HTTPVersion => $1,
679 Status => $2,
680 Reason => $3,
681 ;
682
683 my $hdr = parse_hdr
684 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Garbled response headers" }));
685
686 %hdr = (%$hdr, @pseudo);
687 }
688
689 # redirect handling
690 # microsoft and other shitheads don't give a shit for following standards,
691 # try to support some common forms of broken Location headers.
692 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
693 $hdr{location} =~ s/^\.\/+//;
694
695 my $url = "$rscheme://$uhost:$uport";
696
697 unless ($hdr{location} =~ s/^\///) {
698 $url .= $upath;
699 $url =~ s/\/[^\/]*$//;
700 }
701
702 $hdr{location} = "$url/$hdr{location}";
703 }
704
705 my $redirect;
706
707 if ($recurse) {
708 my $status = $hdr{Status};
709
710 # industry standard is to redirect POST as GET for
711 # 301, 302 and 303, in contrast to http/1.0 and 1.1.
712 # also, the UA should ask the user for 301 and 307 and POST,
713 # industry standard seems to be to simply follow.
714 # we go with the industry standard.
715 if ($status == 301 or $status == 302 or $status == 303) {
716 # HTTP/1.1 is unclear on how to mutate the method
717 $method = "GET" unless $method eq "HEAD";
718 $redirect = 1;
719 } elsif ($status == 307) {
720 $redirect = 1;
721 }
722 }
723
724 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive])
725 my $may_keep_alive = $_[3];
726
727 $state{handle}->destroy if $state{handle};
728 %state = ();
729
730 if (defined $_[1]) {
731 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1];
732 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2];
733 }
734
735 # set-cookie processing
736 if ($arg{cookie_jar}) {
737 cookie_jar_set_cookie $arg{cookie_jar}, $hdr{"set-cookie"}, $uhost;
738 }
739
740 if ($redirect && exists $hdr{location}) {
741 # we ignore any errors, as it is very common to receive
742 # Content-Length != 0 but no actual body
743 # we also access %hdr, as $_[1] might be an error
744 http_request (
745 $method => $hdr{location},
746 %arg,
747 recurse => $recurse - 1,
748 Redirect => [$_[0], \%hdr],
749 $cb);
750 } else {
751 $cb->($_[0], \%hdr);
752 }
753 };
754
755 my $len = $hdr{"content-length"};
756
757 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
758 $finish->(undef, 598 => "Request cancelled by on_header");
759 } elsif (
760 $hdr{Status} =~ /^(?:1..|204|205|304)$/
761 or $method eq "HEAD"
762 or (defined $len && !$len)
763 ) {
764 # no body
765 $finish->("", undef, undef, 1);
766 } else {
767 # body handling, many different code paths
768 # - no body expected
769 # - want_body_handle
770 # - te chunked
771 # - 2x length known (with or without on_body)
772 # - 2x length not known (with or without on_body)
773 if (!$redirect && $arg{want_body_handle}) {
774 $_[0]->on_eof (undef);
775 $_[0]->on_error (undef);
776 $_[0]->on_read (undef);
777
778 $finish->(delete $state{handle});
779
780 } elsif ($hdr{"transfer-encoding"} =~ /\bchunked\b/i) {
781 my $cl = 0;
782 my $body = undef;
783 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
784
785 $_[0]->on_error (sub { $finish->(undef, 599 => $_[2]) });
786
787 my $read_chunk; $read_chunk = sub {
788 $_[1] =~ /^([0-9a-fA-F]+)/
789 or $finish->(undef, 599 => "Garbled chunked transfer encoding");
790
791 my $len = hex $1;
792
793 if ($len) {
794 $cl += $len;
795
796 $_[0]->push_read (chunk => $len, sub {
797 $on_body->($_[1], \%hdr)
798 or return $finish->(undef, 598 => "Request cancelled by on_body");
799
800 $_[0]->push_read (line => sub {
801 length $_[1]
802 and return $finish->(undef, 599 => "Garbled chunked transfer encoding");
803 $_[0]->push_read (line => $read_chunk);
804 });
805 });
806 } else {
807 $hdr{"content-length"} ||= $cl;
808
809 $_[0]->push_read (line => $qr_nlnl, sub {
810 if (length $_[1]) {
811 for ("$_[1]") {
812 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
813
814 my $hdr = parse_hdr
815 or return $finish->(undef, 599 => "Garbled response trailers");
816
817 %hdr = (%hdr, %$hdr);
818 }
819 }
820
821 $finish->($body, undef, undef, 1);
822 });
823 }
824 };
825
826 $_[0]->push_read (line => $read_chunk);
827
828 } elsif ($arg{on_body}) {
829 $_[0]->on_error (sub { $finish->(undef, 599 => $_[2]) });
830
831 if ($len) {
832 $_[0]->on_read (sub {
833 $len -= length $_[0]{rbuf};
834
835 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
836 or return $finish->(undef, 598 => "Request cancelled by on_body");
837
838 $len > 0
839 or $finish->("", undef, undef, 1);
840 });
841 } else {
842 $_[0]->on_eof (sub {
843 $finish->("");
844 });
845 $_[0]->on_read (sub {
846 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
847 or $finish->(undef, 598 => "Request cancelled by on_body");
848 });
849 }
850 } else {
851 $_[0]->on_eof (undef);
852
853 if ($len) {
854 $_[0]->on_error (sub { $finish->(undef, 599 => $_[2]) });
855 $_[0]->on_read (sub {
856 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), undef, undef, 1)
857 if $len <= length $_[0]{rbuf};
858 });
859 } else {
860 $_[0]->on_error (sub {
861 ($! == Errno::EPIPE || !$!)
862 ? $finish->(delete $_[0]{rbuf})
863 : $finish->(undef, 599 => $_[2]);
864 });
865 $_[0]->on_read (sub { });
866 }
867 }
868 }
869 };
870
871 $state{handle}->push_read (line => $qr_nlnl, $state{read_response});
872 };
873
874 # now handle proxy-CONNECT method 932 # now handle proxy-CONNECT method
875 if ($proxy && $uscheme eq "https") { 933 if ($proxy && $uscheme eq "https") {
876 # oh dear, we have to wrap it into a connect request 934 # oh dear, we have to wrap it into a connect request
877 935
878 # maybe re-use $uauthority with patched port? 936 # maybe re-use $uauthority with patched port?
881 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix 939 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
882 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" })); 940 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" }));
883 941
884 if ($2 == 200) { 942 if ($2 == 200) {
885 $rpath = $upath; 943 $rpath = $upath;
886 &$handle_actual_request; 944 $handle_actual_request->();
887 } else { 945 } else {
888 %state = (); 946 %state = ();
889 $cb->(undef, { @pseudo, Status => $2, Reason => $3 }); 947 $cb->(undef, { @pseudo, Status => $2, Reason => $3 });
890 } 948 }
891 }); 949 });
892 } else { 950 } else {
893 &$handle_actual_request; 951 $handle_actual_request->();
894 } 952 }
895 }; 953 };
896 954
897 my $tcp_connect = $arg{tcp_connect} 955 my $tcp_connect = $arg{tcp_connect}
898 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect }; 956 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
899 957
900 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout }); 958 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout });
901
902 }; 959 };
903 960
904 defined wantarray && AnyEvent::Util::guard { %state = () } 961 defined wantarray && AnyEvent::Util::guard { %state = () }
905} 962}
906 963
941string of the form C<http://host:port> (optionally C<https:...>), croaks 998string of the form C<http://host:port> (optionally C<https:...>), croaks
942otherwise. 999otherwise.
943 1000
944To clear an already-set proxy, use C<undef>. 1001To clear an already-set proxy, use C<undef>.
945 1002
1003=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
1004
1005Remove all cookies from the cookie jar that have expired. If
1006C<$session_end> is given and true, then additionally remove all session
1007cookies.
1008
1009You should call this function (with a true C<$session_end>) before you
1010save cookies to disk, and you should call this function after loading them
1011again. If you have a long-running program you can additionally call this
1012function from time to time.
1013
1014A cookie jar is initially an empty hash-reference that is managed by this
1015module. Its format is subject to change, but currently it is like this:
1016
1017The key C<version> has to contain C<1>, otherwise the hash gets
1018emptied. All other keys are hostnames or IP addresses pointing to
1019hash-references. The key for these inner hash references is the
1020server path for which this cookie is meant, and the values are again
1021hash-references. The keys of those hash-references are the cookie names, and
1022each value, you guessed it, is another hash-reference, this time with the
1023key-value pairs from the cookie, except for C<expires> and C<max-age>,
1024which have been replaced by a C<_expires> key that contains the cookie
1025expiry timestamp.
1026
1027Here is an example of a cookie jar with a single cookie, so you have a
1028chance of understanding the above paragraph:
1029
1030 {
1031 version => 1,
1032 "10.0.0.1" => {
1033 "/" => {
1034 "mythweb_id" => {
1035 _expires => 1293917923,
1036 value => "ooRung9dThee3ooyXooM1Ohm",
1037 },
1038 },
1039 },
1040 }
1041
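To make the nesting described above concrete, the following standalone sketch walks a jar in the documented format (host, then path, then cookie name) and drops expired or session-only cookies. It is a simplified illustration modelled on the C<cookie_jar_expire> code in this diff, not a replacement for it: the name C<expire_jar>, the explicit C<$now> argument (used instead of C<AE::now> so the code runs without AnyEvent) and the C<session_id> cookie are assumptions of the example.

```perl
use strict;
use warnings;

# Illustration only: $now is passed in explicitly so the sketch does not
# depend on AnyEvent. The version check of the real function is omitted.
sub expire_jar {
    my ($jar, $now, $session_end) = @_;

    # only host entries are hash references; this skips the "version" key
    for my $host (grep ref $jar->{$_}, keys %$jar) {
        my $paths = $jar->{$host};

        for my $path (keys %$paths) {
            my $cookies = $paths->{$path};

            for my $name (keys %$cookies) {
                my $kv = $cookies->{$name};

                if (exists $kv->{_expires}) {
                    delete $cookies->{$name} if $now > $kv->{_expires};
                } elsif ($session_end) {
                    delete $cookies->{$name};   # session-only cookie
                }
            }

            # prune levels that became empty
            delete $paths->{$path} unless %$cookies;
        }

        delete $jar->{$host} unless %$paths;
    }
}

my $jar = {
    version    => 1,
    "10.0.0.1" => {
        "/" => {
            mythweb_id => { _expires => 1293917923, value => "ooRung9dThee3ooyXooM1Ohm" },
            session_id => { value => "abc" },   # made-up session cookie
        },
    },
};

# called before the persistent cookie expires, with the session ended
expire_jar($jar, 1293917000, 1);
print join(",", sort keys %{ $jar->{"10.0.0.1"}{"/"} }), "\n";
```

After the call only the persistent cookie remains; a later call with a timestamp past C<_expires> would remove the whole host entry and leave just the C<version> key.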
946=item $date = AnyEvent::HTTP::format_date $timestamp 1042=item $date = AnyEvent::HTTP::format_date $timestamp
947 1043
948Takes a POSIX timestamp (seconds since the epoch) and formats it as a HTTP 1044Takes a POSIX timestamp (seconds since the epoch) and formats it as a HTTP
949Date (RFC 2616). 1045Date (RFC 2616).
950 1046
951=item $timestamp = AnyEvent::HTTP::parse_date $date 1047=item $timestamp = AnyEvent::HTTP::parse_date $date
952 1048
953Takes a HTTP Date (RFC 2616) or a Cookie date (netscape cookie spec) and 1049Takes a HTTP Date (RFC 2616) or a Cookie date (netscape cookie spec) or a
954returns the corresponding POSIX timestamp, or C<undef> if the date cannot 1050bunch of minor variations of those, and returns the corresponding POSIX
955be parsed. 1051timestamp, or C<undef> if the date cannot be parsed.
956 1052
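For readers unfamiliar with the date formats involved, here is a minimal sketch of converting the RFC 1123 form to a POSIX timestamp using the core C<Time::Local> module. It is not the implementation of C<parse_date>, which accepts more variants; the name C<parse_rfc1123> is hypothetical.

```perl
use strict;
use warnings;
use Time::Local ();

my @MON   = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %MONTH = map { ($MON[$_] => $_) } 0 .. $#MON;   # Jan => 0 ... Dec => 11

# Sketch only: handles "Sun, 02 Jan 2011 04:50:40 GMT" style dates and
# returns undef (in scalar context) for anything else.
sub parse_rfc1123 {
    my ($date) = @_;

    $date =~ /^[A-Z][a-z][a-z], ([0-9][0-9]) ([A-Z][a-z][a-z]) ([0-9]{4}) ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) GMT$/
        or return;

    my ($d, $m, $y, $H, $M, $S) = ($1, $2, $3, $4, $5, $6);

    exists $MONTH{$m}
        or return;

    # timegm croaks on out-of-range values, so trap that and yield undef
    return scalar eval { Time::Local::timegm($S, $M, $H, $d, $MONTH{$m}, $y) };
}

my $t = parse_rfc1123("Sun, 02 Jan 2011 04:50:40 GMT");
print defined $t ? "$t\n" : "unparsable\n";
```

Returning C<undef> on failure matches the documented contract of C<parse_date>, so callers can test the result with C<defined>.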
957=item $AnyEvent::HTTP::MAX_RECURSE 1053=item $AnyEvent::HTTP::MAX_RECURSE
958 1054
959The default value for the C<recurse> request parameter (default: C<10>). 1055The default value for the C<recurse> request parameter (default: C<10>).
960 1056
999sub parse_date($) { 1095sub parse_date($) {
1000 my ($date) = @_; 1096 my ($date) = @_;
1001 1097
1002 my ($d, $m, $y, $H, $M, $S); 1098 my ($d, $m, $y, $H, $M, $S);
1003 1099
1004 if ($date =~ /^[A-Z][a-z][a-z], ([0-9][0-9])[\- ]([A-Z][a-z][a-z])[\- ]([0-9][0-9][0-9][0-9]) ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) GMT$/) { 1100 if ($date =~ /^[A-Z][a-z][a-z]+, ([0-9][0-9]?)[\- ]([A-Z][a-z][a-z])[\- ]([0-9][0-9][0-9][0-9]) ([0-9][0-9]?):([0-9][0-9]?):([0-9][0-9]?) GMT$/) {
1005 # RFC 822/1123, required by RFC 2616 (with " ") 1101 # RFC 822/1123, required by RFC 2616 (with " ")
1006 # cookie dates (with "-") 1102 # cookie dates (with "-")
1007 1103
1008 ($d, $m, $y, $H, $M, $S) = ($1, $2, $3, $4, $5, $6); 1104 ($d, $m, $y, $H, $M, $S) = ($1, $2, $3, $4, $5, $6);
1009 1105
1010 } elsif ($date =~ /^[A-Z][a-z]+, ([0-9][0-9])-([A-Z][a-z][a-z])-([0-9][0-9]) ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) GMT$/) { 1106 } elsif ($date =~ /^[A-Z][a-z][a-z]+, ([0-9][0-9]?)-([A-Z][a-z][a-z])-([0-9][0-9]) ([0-9][0-9]?):([0-9][0-9]?):([0-9][0-9]?) GMT$/) {
1011 # RFC 850 1107 # RFC 850
1012 ($d, $m, $y, $H, $M, $S) = ($1, $2, $3 < 69 ? $3 + 2000 : $3 + 1900, $4, $5, $6); 1108 ($d, $m, $y, $H, $M, $S) = ($1, $2, $3 < 69 ? $3 + 2000 : $3 + 1900, $4, $5, $6);
1013 1109
1014 } elsif ($date =~ /^[A-Z][a-z][a-z] ([A-Z][a-z][a-z]) ([0-9 ][0-9]) ([0-9][0-9]):([0-9][0-9]):([0-9][0-9]) ([0-9][0-9][0-9][0-9])$/) { 1110 } elsif ($date =~ /^[A-Z][a-z][a-z]+ ([A-Z][a-z][a-z]) ([0-9 ]?[0-9]) ([0-9][0-9]?):([0-9][0-9]?):([0-9][0-9]?) ([0-9][0-9][0-9][0-9])$/) {
1015 # ISO C's asctime 1111 # ISO C's asctime
1016 ($d, $m, $y, $H, $M, $S) = ($2, $1, $6, $3, $4, $5); 1112 ($d, $m, $y, $H, $M, $S) = ($2, $1, $6, $3, $4, $5);
1017 } 1113 }
1018 # other formats fail in the loop below 1114 # other formats fail in the loop below
1019 1115
