/cvs/AnyEvent-HTTP/HTTP.pm

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.79 by root, Sat Jan 1 20:01:07 2011 UTC vs.
Revision 1.87 by root, Sun Jan 2 08:51:53 2011 UTC

36 36
37=cut 37=cut
38 38
39package AnyEvent::HTTP; 39package AnyEvent::HTTP;
40 40
41use strict; 41use common::sense;
42no warnings;
43 42
44use Errno (); 43use Errno ();
45 44
46use AnyEvent 5.0 (); 45use AnyEvent 5.0 ();
47use AnyEvent::Util (); 46use AnyEvent::Util ();
196=item cookie_jar => $hash_ref 195=item cookie_jar => $hash_ref
197 196
198Passing this parameter enables (simplified) cookie-processing, loosely 197Passing this parameter enables (simplified) cookie-processing, loosely
199based on the original netscape specification. 198based on the original netscape specification.
200 199
201The C<$hash_ref> must be an (initially empty) hash reference which will 200The C<$hash_ref> must be an (initially empty) hash reference which
202get updated automatically. It is possible to save the cookie jar to 201will get updated automatically. It is possible to save the cookie jar
203persistent storage with something like JSON or Storable, but this is not 202to persistent storage with something like JSON or Storable - see the
204recommended, as session-only cookies might survive longer than expected. 203C<AnyEvent::HTTP::cookie_jar_expire> function if you wish to remove
204expired or session-only cookies, and also for documentation on the format
205of the cookie jar.
205 206
206Note that this cookie implementation is not meant to be complete. If 207Note that this cookie implementation is not meant to be complete. If
207you want complete cookie management you have to do that on your 208you want complete cookie management you have to do that on your
208own. C<cookie_jar> is meant as a quick fix to get some cookie-using sites 209own. C<cookie_jar> is meant as a quick fix to get most cookie-using sites
209working. Cookies are a privacy disaster, do not use them unless required 210working. Cookies are a privacy disaster, do not use them unless required
210to. 211to.
211 212
212When cookie processing is enabled, the C<Cookie:> and C<Set-Cookie:> 213When cookie processing is enabled, the C<Cookie:> and C<Set-Cookie:>
213headers will be set and handled by this module, otherwise they will be 214headers will be set and handled by this module, otherwise they will be
378 push @{ $CO_SLOT{$_[0]}[1] }, $_[1]; 379 push @{ $CO_SLOT{$_[0]}[1] }, $_[1];
379 380
380 _slot_schedule $_[0]; 381 _slot_schedule $_[0];
381} 382}
382 383
384#############################################################################
385
386# expire cookies
387sub cookie_jar_expire($;$) {
388 my ($jar, $session_end) = @_;
389
390 %$jar = () if $jar->{version} != 1;
391
392 my $anow = AE::now;
393
394 while (my ($chost, $paths) = each %$jar) {
395 next unless ref $paths;
396
397 while (my ($cpath, $cookies) = each %$paths) {
398 while (my ($cookie, $kv) = each %$cookies) {
399 if (exists $kv->{_expires}) {
400 delete $cookies->{$cookie}
401 if $anow > $kv->{_expires};
402 } elsif ($session_end) {
403 delete $cookies->{$cookie};
404 }
405 }
406
407 delete $paths->{$cpath}
408 unless %$cookies;
409 }
410
411 delete $jar->{$chost}
412 unless %$paths;
413 }
414}
415
383# extract cookies from jar 416# extract cookies from jar
384sub cookie_jar_extract($$$$) { 417sub cookie_jar_extract($$$$) {
385 my ($jar, $uscheme, $uhost, $upath) = @_; 418 my ($jar, $uscheme, $uhost, $upath) = @_;
386 419
387 %$jar = () if $jar->{version} != 1; 420 %$jar = () if $jar->{version} != 1;
403 next unless $cpath eq substr $upath, 0, length $cpath; 436 next unless $cpath eq substr $upath, 0, length $cpath;
404 437
405 while (my ($cookie, $kv) = each %$cookies) { 438 while (my ($cookie, $kv) = each %$cookies) {
406 next if $uscheme ne "https" && exists $kv->{secure}; 439 next if $uscheme ne "https" && exists $kv->{secure};
407 440
408 if (exists $kv->{expires}) { 441 if (exists $kv->{_expires} and AE::now > $kv->{_expires}) {
409 if (AE::now > parse_date ($kv->{expires})) {
410 delete $cookies->{$cookie}; 442 delete $cookies->{$cookie};
411 next; 443 next;
412 }
413 } 444 }
414 445
415 my $value = $kv->{value}; 446 my $value = $kv->{value};
416 447
417 if ($value =~ /[=;,[:space:]]/) { 448 if ($value =~ /[=;,[:space:]]/) {
426 457
427 \@cookies 458 \@cookies
428} 459}
429 460
430# parse set_cookie header into jar 461# parse set_cookie header into jar
431sub cookie_jar_set_cookie($$$) { 462sub cookie_jar_set_cookie($$$$) {
432 my ($jar, $set_cookie, $uhost) = @_; 463 my ($jar, $set_cookie, $uhost, $date) = @_;
464
465 my $anow = int AE::now;
466 my $snow; # server-now
433 467
434 for ($set_cookie) { 468 for ($set_cookie) {
435 # parse NAME=VALUE 469 # parse NAME=VALUE
436 my @kv; 470 my @kv;
437 471
440 while ( 474 while (
441 m{ 475 m{
442 \G\s* 476 \G\s*
443 (?: 477 (?:
444 expires \s*=\s* ([A-Z][a-z][a-z]+,\ [^,;]+) 478 expires \s*=\s* ([A-Z][a-z][a-z]+,\ [^,;]+)
445 | ([^=;,[:space:]]+) \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) ) 479 | ([^=;,[:space:]]+) (?: \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) ) )?
446 ) 480 )
447 }gcxsi 481 }gcxsi
448 ) { 482 ) {
449 my $name = $2; 483 my $name = $2;
450 my $value = $4; 484 my $value = $4;
451 485
452 unless (defined $name) { 486 if (defined $1) {
453 # expires 487 # expires
454 $name = "expires"; 488 $name = "expires";
455 $value = $1; 489 $value = $1;
456 } elsif (!defined $value) { 490 } elsif (defined $3) {
457 # quoted 491 # quoted
458 $value = $3; 492 $value = $3;
459 $value =~ s/\\(.)/$1/gs; 493 $value =~ s/\\(.)/$1/gs;
460 } 494 }
461 495
467 last unless @kv; 501 last unless @kv;
468 502
469 my $name = shift @kv; 503 my $name = shift @kv;
470 my %kv = (value => shift @kv, @kv); 504 my %kv = (value => shift @kv, @kv);
471 505
472 $kv{expires} ||= format_date (AE::now + $kv{"max-age"})
473 if exists $kv{"max-age"}; 506 if (exists $kv{"max-age"}) {
507 $kv{_expires} = $anow + delete $kv{"max-age"};
508 } elsif (exists $kv{expires}) {
509 $snow ||= parse_date ($date) || $anow;
510 $kv{_expires} = $anow + (parse_date (delete $kv{expires}) - $snow);
511 } else {
512 delete $kv{_expires};
513 }
474 514
475 my $cdom; 515 my $cdom;
476 my $cpath = (delete $kv{path}) || "/"; 516 my $cpath = (delete $kv{path}) || "/";
477 517
478 if (exists $kv{domain}) { 518 if (exists $kv{domain}) {
489 $cdom = $uhost; 529 $cdom = $uhost;
490 } 530 }
491 531
492 # store it 532 # store it
493 $jar->{version} = 1; 533 $jar->{version} = 1;
494 $jar->{$cdom}{$cpath}{$name} = \%kv; 534 $jar->{lc $cdom}{$cpath}{$name} = \%kv;
495 535
496 redo if /\G\s*,/gc; 536 redo if /\G\s*,/gc;
497 } 537 }
498} 538}
499 539
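The new `_expires` computation in `cookie_jar_set_cookie` appears to compensate for clock skew between server and client: the server's `expires` date is turned into a lifetime relative to the server's own `Date` header, and that lifetime is then added to the local time. A minimal sketch of the arithmetic, using plain epoch seconds instead of `parse_date`; the helper name `local_expiry` and all the numbers are hypothetical and not part of the module:

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the arithmetic of the new revision:
# translate a server-side expiry time into the local clock's time base.
sub local_expiry {
    my ($local_now, $server_now, $server_expires) = @_;

    # lifetime as seen by the server, re-anchored at the local clock
    return $local_now + ($server_expires - $server_now);
}

# Assumed scenario: the server clock runs 300 seconds ahead of ours
# and the cookie is meant to live for one hour.
my $local_now      = 1_293_914_000;
my $server_now     = $local_now + 300;
my $server_expires = $server_now + 3600;

# Prints the local time plus 3600 seconds, regardless of the skew.
print local_expiry ($local_now, $server_now, $server_expires), "\n";
```

When the response carries no usable `Date` header, the diff falls back to the local time for the server time, in which case the `expires` value is taken at face value.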
566 : return $cb->(undef, { @pseudo, Status => 599, Reason => "Only http and https URL schemes supported" }); 606 : return $cb->(undef, { @pseudo, Status => 599, Reason => "Only http and https URL schemes supported" });
567 607
568 $uauthority =~ /^(?: .*\@ )? ([^\@:]+) (?: : (\d+) )?$/x 608 $uauthority =~ /^(?: .*\@ )? ([^\@:]+) (?: : (\d+) )?$/x
569 or return $cb->(undef, { @pseudo, Status => 599, Reason => "Unparsable URL" }); 609 or return $cb->(undef, { @pseudo, Status => 599, Reason => "Unparsable URL" });
570 610
571 my $uhost = $1; 611 my $uhost = lc $1;
572 $uport = $2 if defined $2; 612 $uport = $2 if defined $2;
573 613
574 $hdr{host} = defined $2 ? "$uhost:$2" : "$uhost" 614 $hdr{host} = defined $2 ? "$uhost:$2" : "$uhost"
575 unless exists $hdr{host}; 615 unless exists $hdr{host};
576 616
595 $rscheme = "http" unless defined $rscheme; 635 $rscheme = "http" unless defined $rscheme;
596 636
597 # don't support https requests over https-proxy transport, 637 # don't support https requests over https-proxy transport,
598 # can't be done with tls as spec'ed, unless you double-encrypt. 638 # can't be done with tls as spec'ed, unless you double-encrypt.
599 $rscheme = "http" if $uscheme eq "https" && $rscheme eq "https"; 639 $rscheme = "http" if $uscheme eq "https" && $rscheme eq "https";
640
641 $rhost = lc $rhost;
642 $rscheme = lc $rscheme;
600 } else { 643 } else {
601 ($rhost, $rport, $rscheme, $rpath) = ($uhost, $uport, $uscheme, $upath); 644 ($rhost, $rport, $rscheme, $rpath) = ($uhost, $uport, $uscheme, $upath);
602 } 645 }
603 646
604 # leave out fragment and query string, just a heuristic 647 # leave out fragment and query string, just a heuristic
606 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"}; 649 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"};
607 650
608 $hdr{"content-length"} = length $arg{body} 651 $hdr{"content-length"} = length $arg{body}
609 if length $arg{body} || $method ne "GET"; 652 if length $arg{body} || $method ne "GET";
610 653
611 $hdr{connection} = "close TE"; #1.1 654 $hdr{connection} = "close Te"; #1.1
612 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1 655 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1
613 656
614 my %state = (connect_guard => 1); 657 my %state = (connect_guard => 1);
615 658
616 _get_slot $uhost, sub { 659 _get_slot $uhost, sub {
617 $state{slot_guard} = shift; 660 $state{slot_guard} = shift;
618 661
619 return unless $state{connect_guard}; 662 return unless $state{connect_guard};
620 663
621 my $ae_error = 595; # connecting 664 my $ae_error = 595; # connecting
665
666 # handle actual, non-tunneled, request
667 my $handle_actual_request = sub {
668 $ae_error = 596; # request phase
669
670 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
671
672 # send request
673 $state{handle}->push_write (
674 "$method $rpath HTTP/1.1\015\012"
675 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
676 . "\015\012"
677 . (delete $arg{body})
678 );
679
680 # return if error occurred during push_write()
681 return unless %state;
682
683 %hdr = (); # reduce memory usage, save a kitten, also make it possible to re-use
684
685 # status line and headers
686 $state{read_response} = sub {
687 for ("$_[1]") {
688 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
689
690 /^HTTP\/0*([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/gxci
691 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid server response" }));
692
693 # 100 Continue handling
694 # should not happen as we don't send expect: 100-continue,
695 # but we handle it just in case.
696 # since we send the request body regardless, if we get an error
697 # we are out of-sync, which we currently do NOT handle correctly.
698 return $state{handle}->push_read (line => $qr_nlnl, $state{read_response})
699 if $2 eq 100;
700
701 push @pseudo,
702 HTTPVersion => $1,
703 Status => $2,
704 Reason => $3,
705 ;
706
707 my $hdr = parse_hdr
708 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Garbled response headers" }));
709
710 %hdr = (%$hdr, @pseudo);
711 }
712
713 # redirect handling
714 # microsoft and other shitheads don't give a shit for following standards,
715 # try to support some common forms of broken Location headers.
716 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
717 $hdr{location} =~ s/^\.\/+//;
718
719 my $url = "$rscheme://$uhost:$uport";
720
721 unless ($hdr{location} =~ s/^\///) {
722 $url .= $upath;
723 $url =~ s/\/[^\/]*$//;
724 }
725
726 $hdr{location} = "$url/$hdr{location}";
727 }
728
729 my $redirect;
730
731 if ($recurse) {
732 my $status = $hdr{Status};
733
734 # industry standard is to redirect POST as GET for
735 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1.
736 # also, the UA should ask the user for 301 and 307 and POST,
737 # industry standard seems to be to simply follow.
738 # we go with the industry standard.
739 if ($status == 301 or $status == 302 or $status == 303) {
740 # HTTP/1.1 is unclear on how to mutate the method
741 $method = "GET" unless $method eq "HEAD";
742 $redirect = 1;
743 } elsif ($status == 307) {
744 $redirect = 1;
745 }
746 }
747
748 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive])
749 my $may_keep_alive = $_[3];
750
751 $state{handle}->destroy if $state{handle};
752 %state = ();
753
754 if (defined $_[1]) {
755 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1];
756 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2];
757 }
758
759 # set-cookie processing
760 if ($arg{cookie_jar}) {
761 cookie_jar_set_cookie $arg{cookie_jar}, $hdr{"set-cookie"}, $uhost, $hdr{date};
762 }
763
764 if ($redirect && exists $hdr{location}) {
765 # we ignore any errors, as it is very common to receive
766 # Content-Length != 0 but no actual body
767 # we also access %hdr, as $_[1] might be an error
768 http_request (
769 $method => $hdr{location},
770 %arg,
771 recurse => $recurse - 1,
772 Redirect => [$_[0], \%hdr],
773 $cb);
774 } else {
775 $cb->($_[0], \%hdr);
776 }
777 };
778
779 $ae_error = 597; # body phase
780
781 my $len = $hdr{"content-length"};
782
783 # body handling, many different code paths
784 # - no body expected
785 # - want_body_handle
786 # - te chunked
787 # - 2x length known (with or without on_body)
788 # - 2x length not known (with or without on_body)
789 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
790 $finish->(undef, 598 => "Request cancelled by on_header");
791 } elsif (
792 $hdr{Status} =~ /^(?:1..|204|205|304)$/
793 or $method eq "HEAD"
794 or (defined $len && $len == 0) # == 0, not !, because "0 " is true
795 ) {
796 # no body
797 $finish->("", undef, undef, 1);
798
799 } elsif (!$redirect && $arg{want_body_handle}) {
800 $_[0]->on_eof (undef);
801 $_[0]->on_error (undef);
802 $_[0]->on_read (undef);
803
804 $finish->(delete $state{handle});
805
806 } elsif ($hdr{"transfer-encoding"} =~ /\bchunked\b/i) {
807 my $cl = 0;
808 my $body = undef;
809 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
810
811 $state{read_chunk} = sub {
812 $_[1] =~ /^([0-9a-fA-F]+)/
813 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
814
815 my $len = hex $1;
816
817 if ($len) {
818 $cl += $len;
819
820 $_[0]->push_read (chunk => $len, sub {
821 $on_body->($_[1], \%hdr)
822 or return $finish->(undef, 598 => "Request cancelled by on_body");
823
824 $_[0]->push_read (line => sub {
825 length $_[1]
826 and return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
827 $_[0]->push_read (line => $state{read_chunk});
828 });
829 });
830 } else {
831 $hdr{"content-length"} ||= $cl;
832
833 $_[0]->push_read (line => $qr_nlnl, sub {
834 if (length $_[1]) {
835 for ("$_[1]") {
836 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
837
838 my $hdr = parse_hdr
839 or return $finish->(undef, $ae_error => "Garbled response trailers");
840
841 %hdr = (%hdr, %$hdr);
842 }
843 }
844
845 $finish->($body, undef, undef, 1);
846 });
847 }
848 };
849
850 $_[0]->push_read (line => $state{read_chunk});
851
852 } elsif ($arg{on_body}) {
853 if (defined $len) {
854 $_[0]->on_read (sub {
855 $len -= length $_[0]{rbuf};
856
857 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
858 or return $finish->(undef, 598 => "Request cancelled by on_body");
859
860 $len > 0
861 or $finish->("", undef, undef, 1);
862 });
863 } else {
864 $_[0]->on_eof (sub {
865 $finish->("");
866 });
867 $_[0]->on_read (sub {
868 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
869 or $finish->(undef, 598 => "Request cancelled by on_body");
870 });
871 }
872 } else {
873 $_[0]->on_eof (undef);
874
875 if (defined $len) {
876 $_[0]->on_read (sub {
877 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), undef, undef, 1)
878 if $len <= length $_[0]{rbuf};
879 });
880 } else {
881 $_[0]->on_error (sub {
882 ($! == Errno::EPIPE || !$!)
883 ? $finish->(delete $_[0]{rbuf})
884 : $finish->(undef, $ae_error => $_[2]);
885 });
886 $_[0]->on_read (sub { });
887 }
888 }
889 };
890
891 $state{handle}->push_read (line => $qr_nlnl, $state{read_response});
892 };
622 893
623 my $connect_cb = sub { 894 my $connect_cb = sub {
624 $state{fh} = shift 895 $state{fh} = shift
625 or do { 896 or do {
626 my $err = "$!"; 897 my $err = "$!";
657# $hdr{connection} = "keep-alive"; 928# $hdr{connection} = "keep-alive";
658# } 929# }
659 930
660 $state{handle}->starttls ("connect") if $rscheme eq "https"; 931 $state{handle}->starttls ("connect") if $rscheme eq "https";
661 932
662 # handle actual, non-tunneled, request
663 my $handle_actual_request = sub {
664 $ae_error = 596; # request phase
665
666 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
667
668 # send request
669 $state{handle}->push_write (
670 "$method $rpath HTTP/1.1\015\012"
671 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
672 . "\015\012"
673 . (delete $arg{body})
674 );
675
676 # return if error occurred during push_write()
677 return unless %state;
678
679 %hdr = (); # reduce memory usage, save a kitten, also make it possible to re-use
680
681 # status line and headers
682 $state{read_response} = sub {
683 for ("$_[1]") {
684 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
685
686 /^HTTP\/0*([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/gxci
687 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid server response" }));
688
689 # 100 Continue handling
690 # should not happen as we don't send expect: 100-continue,
691 # but we handle it just in case.
692 # since we send the request body regardless, if we get an error
693 # we are out of-sync, which we currently do NOT handle correctly.
694 return $state{handle}->push_read (line => $qr_nlnl, $state{read_response})
695 if $2 eq 100;
696
697 push @pseudo,
698 HTTPVersion => $1,
699 Status => $2,
700 Reason => $3,
701 ;
702
703 my $hdr = parse_hdr
704 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Garbled response headers" }));
705
706 %hdr = (%$hdr, @pseudo);
707 }
708
709 # redirect handling
710 # microsoft and other shitheads don't give a shit for following standards,
711 # try to support some common forms of broken Location headers.
712 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
713 $hdr{location} =~ s/^\.\/+//;
714
715 my $url = "$rscheme://$uhost:$uport";
716
717 unless ($hdr{location} =~ s/^\///) {
718 $url .= $upath;
719 $url =~ s/\/[^\/]*$//;
720 }
721
722 $hdr{location} = "$url/$hdr{location}";
723 }
724
725 my $redirect;
726
727 if ($recurse) {
728 my $status = $hdr{Status};
729
730 # industry standard is to redirect POST as GET for
731 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1.
732 # also, the UA should ask the user for 301 and 307 and POST,
733 # industry standard seems to be to simply follow.
734 # we go with the industry standard.
735 if ($status == 301 or $status == 302 or $status == 303) {
736 # HTTP/1.1 is unclear on how to mutate the method
737 $method = "GET" unless $method eq "HEAD";
738 $redirect = 1;
739 } elsif ($status == 307) {
740 $redirect = 1;
741 }
742 }
743
744 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive])
745 my $may_keep_alive = $_[3];
746
747 $state{handle}->destroy if $state{handle};
748 %state = ();
749
750 if (defined $_[1]) {
751 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1];
752 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2];
753 }
754
755 # set-cookie processing
756 if ($arg{cookie_jar}) {
757 cookie_jar_set_cookie $arg{cookie_jar}, $hdr{"set-cookie"}, $uhost;
758 }
759
760 if ($redirect && exists $hdr{location}) {
761 # we ignore any errors, as it is very common to receive
762 # Content-Length != 0 but no actual body
763 # we also access %hdr, as $_[1] might be an error
764 http_request (
765 $method => $hdr{location},
766 %arg,
767 recurse => $recurse - 1,
768 Redirect => [$_[0], \%hdr],
769 $cb);
770 } else {
771 $cb->($_[0], \%hdr);
772 }
773 };
774
775 $ae_error = 597; # body phase
776
777 my $len = $hdr{"content-length"};
778
779 if (!$redirect && $arg{on_header} && !$arg{on_header}(\%hdr)) {
780 $finish->(undef, 598 => "Request cancelled by on_header");
781 } elsif (
782 $hdr{Status} =~ /^(?:1..|204|205|304)$/
783 or $method eq "HEAD"
784 or (defined $len && !$len)
785 ) {
786 # no body
787 $finish->("", undef, undef, 1);
788 } else {
789 # body handling, many different code paths
790 # - no body expected
791 # - want_body_handle
792 # - te chunked
793 # - 2x length known (with or without on_body)
794 # - 2x length not known (with or without on_body)
795 if (!$redirect && $arg{want_body_handle}) {
796 $_[0]->on_eof (undef);
797 $_[0]->on_error (undef);
798 $_[0]->on_read (undef);
799
800 $finish->(delete $state{handle});
801
802 } elsif ($hdr{"transfer-encoding"} =~ /\bchunked\b/i) {
803 my $cl = 0;
804 my $body = undef;
805 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
806
807 my $read_chunk; $read_chunk = sub {
808 $_[1] =~ /^([0-9a-fA-F]+)/
809 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
810
811 my $len = hex $1;
812
813 if ($len) {
814 $cl += $len;
815
816 $_[0]->push_read (chunk => $len, sub {
817 $on_body->($_[1], \%hdr)
818 or return $finish->(undef, 598 => "Request cancelled by on_body");
819
820 $_[0]->push_read (line => sub {
821 length $_[1]
822 and return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
823 $_[0]->push_read (line => $read_chunk);
824 });
825 });
826 } else {
827 $hdr{"content-length"} ||= $cl;
828
829 $_[0]->push_read (line => $qr_nlnl, sub {
830 if (length $_[1]) {
831 for ("$_[1]") {
832 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
833
834 my $hdr = parse_hdr
835 or return $finish->(undef, $ae_error => "Garbled response trailers");
836
837 %hdr = (%hdr, %$hdr);
838 }
839 }
840
841 $finish->($body, undef, undef, 1);
842 });
843 }
844 };
845
846 $_[0]->push_read (line => $read_chunk);
847
848 } elsif ($arg{on_body}) {
849 if ($len) {
850 $_[0]->on_read (sub {
851 $len -= length $_[0]{rbuf};
852
853 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
854 or return $finish->(undef, 598 => "Request cancelled by on_body");
855
856 $len > 0
857 or $finish->("", undef, undef, 1);
858 });
859 } else {
860 $_[0]->on_eof (sub {
861 $finish->("");
862 });
863 $_[0]->on_read (sub {
864 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
865 or $finish->(undef, 598 => "Request cancelled by on_body");
866 });
867 }
868 } else {
869 $_[0]->on_eof (undef);
870
871 if ($len) {
872 $_[0]->on_read (sub {
873 $finish->((substr delete $_[0]{rbuf}, 0, $len, ""), undef, undef, 1)
874 if $len <= length $_[0]{rbuf};
875 });
876 } else {
877 $_[0]->on_error (sub {
878 ($! == Errno::EPIPE || !$!)
879 ? $finish->(delete $_[0]{rbuf})
880 : $finish->(undef, $ae_error => $_[2]);
881 });
882 $_[0]->on_read (sub { });
883 }
884 }
885 }
886 };
887
888 $state{handle}->push_read (line => $qr_nlnl, $state{read_response});
889 };
890
891 # now handle proxy-CONNECT method 933 # now handle proxy-CONNECT method
892 if ($proxy && $uscheme eq "https") { 934 if ($proxy && $uscheme eq "https") {
893 # oh dear, we have to wrap it into a connect request 935 # oh dear, we have to wrap it into a connect request
894 936
895 # maybe re-use $uauthority with patched port? 937 # maybe re-use $uauthority with patched port?
896 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012Host: $uhost\015\012\015\012"); 938 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012\015\012");
897 $state{handle}->push_read (line => $qr_nlnl, sub { 939 $state{handle}->push_read (line => $qr_nlnl, sub {
898 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix 940 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
899 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" })); 941 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" }));
900 942
901 if ($2 == 200) { 943 if ($2 == 200) {
902 $rpath = $upath; 944 $rpath = $upath;
903 &$handle_actual_request; 945 $handle_actual_request->();
904 } else { 946 } else {
905 %state = (); 947 %state = ();
906 $cb->(undef, { @pseudo, Status => $2, Reason => $3 }); 948 $cb->(undef, { @pseudo, Status => $2, Reason => $3 });
907 } 949 }
908 }); 950 });
909 } else { 951 } else {
910 &$handle_actual_request; 952 $handle_actual_request->();
911 } 953 }
912 }; 954 };
913 955
914 my $tcp_connect = $arg{tcp_connect} 956 my $tcp_connect = $arg{tcp_connect}
915 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect }; 957 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
916 958
917 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout }); 959 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout });
918
919 }; 960 };
920 961
921 defined wantarray && AnyEvent::Util::guard { %state = () } 962 defined wantarray && AnyEvent::Util::guard { %state = () }
922} 963}
923 964
957Sets the default proxy server to use. The proxy-url must begin with a 998Sets the default proxy server to use. The proxy-url must begin with a
958string of the form C<http://host:port> (optionally C<https:...>), croaks 999string of the form C<http://host:port> (optionally C<https:...>), croaks
959otherwise. 1000otherwise.
960 1001
961To clear an already-set proxy, use C<undef>. 1002To clear an already-set proxy, use C<undef>.
1003
1004=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
1005
1006Remove all cookies from the cookie jar that have expired. If
1007C<$session_end> is given and true, then additionally remove all session
1008cookies.
1009
1010You should call this function (with a true C<$session_end>) before you
1011save cookies to disk, and you should call this function after loading them
1012again. If you have a long-running program you can additionally call this
1013function from time to time.
1014
1015A cookie jar is initially an empty hash-reference that is managed by this
1016module. Its format is subject to change, but currently it is like this:
1017
1018The key C<version> has to contain C<1>, otherwise the hash gets
1019emptied. All other keys are hostnames or IP addresses pointing to
1020hash-references. The key for these inner hash references is the
1021server path for which this cookie is meant, and the values are again
1022hash-references. The keys of those hash-references are the cookie names, and
1023the values, you guessed it, are further hash-references, this time with the
1024key-value pairs from the cookie, except for C<expires> and C<max-age>,
1025which have been replaced by a C<_expires> key that contains the cookie
1026expiry timestamp.
1027
1028Here is an example of a cookie jar with a single cookie, so you have a
1029chance of understanding the above paragraph:
1030
1031 {
1032 version => 1,
1033 "10.0.0.1" => {
1034 "/" => {
1035 "mythweb_id" => {
1036 _expires => 1293917923,
1037 value => "ooRung9dThee3ooyXooM1Ohm",
1038 },
1039 },
1040 },
1041 }
962 1042
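The nesting described above (host, then path, then cookie name, then key-value pairs) can be walked with three loops. The following is a minimal sketch that does not depend on AnyEvent::HTTP; the helper name `list_cookies` is hypothetical, and the code assumes the version-1 jar layout documented above, which the documentation says is subject to change:

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a version-1 cookie jar into a sorted
# list of "host path name=value" strings.
sub list_cookies {
    my ($jar) = @_;
    my @out;

    # skip non-reference entries such as the "version" key
    for my $host (sort grep { ref $jar->{$_} } keys %$jar) {
        for my $path (sort keys %{ $jar->{$host} }) {
            for my $name (sort keys %{ $jar->{$host}{$path} }) {
                my $kv = $jar->{$host}{$path}{$name};
                push @out, "$host $path $name=$kv->{value}";
            }
        }
    }

    return @out;
}

# the example jar from the documentation above
my $jar = {
    version    => 1,
    "10.0.0.1" => {
        "/" => {
            mythweb_id => {
                _expires => 1293917923,
                value    => "ooRung9dThee3ooyXooM1Ohm",
            },
        },
    },
};

print "$_\n" for list_cookies ($jar);
```

Skipping non-reference values mirrors the `next unless ref $paths` check in `cookie_jar_expire`, so the `version` entry is not mistaken for a hostname.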
963=item $date = AnyEvent::HTTP::format_date $timestamp 1043=item $date = AnyEvent::HTTP::format_date $timestamp
964 1044
965Takes a POSIX timestamp (seconds since the epoch) and formats it as a HTTP 1045Takes a POSIX timestamp (seconds since the epoch) and formats it as a HTTP
966Date (RFC 2616). 1046Date (RFC 2616).
