
Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.109 by root, Wed Jul 27 16:11:55 2011 UTC vs.
Revision 1.134 by root, Fri Sep 7 22:11:31 2018 UTC

46use AnyEvent::Util (); 46use AnyEvent::Util ();
47use AnyEvent::Handle (); 47use AnyEvent::Handle ();
48 48
49use base Exporter::; 49use base Exporter::;
50 50
51our $VERSION = '2.13'; 51our $VERSION = 2.24;
52 52
53our @EXPORT = qw(http_get http_post http_head http_request); 53our @EXPORT = qw(http_get http_post http_head http_request);
54 54
55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)"; 55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)";
56our $MAX_RECURSE = 10; 56our $MAX_RECURSE = 10;
89C<http_request> returns a "cancellation guard" - you have to keep the 89C<http_request> returns a "cancellation guard" - you have to keep the
90object alive at least until the callback gets called. If the object gets 90object alive at least until the callback gets called. If the object gets
91destroyed before the callback is called, the request will be cancelled. 91destroyed before the callback is called, the request will be cancelled.
92 92
93The callback will be called with the response body data as first argument 93The callback will be called with the response body data as first argument
94(or C<undef> if an error occured), and a hash-ref with response headers 94(or C<undef> if an error occurred), and a hash-ref with response headers
95(and trailers) as second argument. 95(and trailers) as second argument.
96 96
97All the headers in that hash are lowercased. In addition to the response 97All the headers in that hash are lowercased. In addition to the response
98headers, the "pseudo-headers" (uppercase to avoid clashing with possible 98headers, the "pseudo-headers" (uppercase to avoid clashing with possible
99response headers) C<HTTPVersion>, C<Status> and C<Reason> contain the 99response headers) C<HTTPVersion>, C<Status> and C<Reason> contain the
123C<590>-C<599> and the C<Reason> pseudo-header will contain an error 123C<590>-C<599> and the C<Reason> pseudo-header will contain an error
124message. Currently the following status codes are used: 124message. Currently the following status codes are used:
125 125
126=over 4 126=over 4
127 127
128=item 595 - errors during connection etsbalishment, proxy handshake. 128=item 595 - errors during connection establishment, proxy handshake.
129 129
130=item 596 - errors during TLS negotiation, request sending and header processing. 130=item 596 - errors during TLS negotiation, request sending and header processing.
131 131
132=item 597 - errors during body receiving or processing. 132=item 597 - errors during body receiving or processing.
133 133
154 154
155=over 4 155=over 4
156 156
157=item recurse => $count (default: $MAX_RECURSE) 157=item recurse => $count (default: $MAX_RECURSE)
158 158
159Whether to recurse requests or not, e.g. on redirects, authentication 159Whether to recurse requests or not, e.g. on redirects, authentication and
160retries and so on, and how often to do so. 160other retries and so on, and how often to do so.
161
162Only redirects to http and https URLs are supported. While most common
163redirection forms are handled entirely within this module, some require
164the use of the optional L<URI> module. If it is required but missing, then
165the request will fail with an error.
161 166
162=item headers => hashref 167=item headers => hashref
163 168
164The request headers to use. Currently, C<http_request> may provide its own 169The request headers to use. Currently, C<http_request> may provide its own
165C<Host:>, C<Content-Length:>, C<Connection:> and C<Cookie:> headers and 170C<Host:>, C<Content-Length:>, C<Connection:> and C<Cookie:> headers and
189 194
190C<$scheme> must be either missing or must be C<http> for HTTP. 195C<$scheme> must be either missing or must be C<http> for HTTP.
191 196
192If not specified, then the default proxy is used (see 197If not specified, then the default proxy is used (see
193C<AnyEvent::HTTP::set_proxy>). 198C<AnyEvent::HTTP::set_proxy>).
199
200Currently, if your proxy requires authorization, you have to specify an
201appropriate "Proxy-Authorization" header in every request.
194 202
195=item body => $string 203=item body => $string
196 204
197The request body, usually empty. Will be sent as-is (future versions of 205The request body, usually empty. Will be sent as-is (future versions of
198this module might offer more options). 206this module might offer more options).
242context) - only connections using the same unique ID will be reused. 250context) - only connections using the same unique ID will be reused.
243 251
244=item on_prepare => $callback->($fh) 252=item on_prepare => $callback->($fh)
245 253
246In rare cases you need to "tune" the socket before it is used to 254In rare cases you need to "tune" the socket before it is used to
247connect (for exmaple, to bind it on a given IP address). This parameter 255connect (for example, to bind it on a given IP address). This parameter
248overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect> 256overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect>
249and behaves exactly the same way (e.g. it has to provide a 257and behaves exactly the same way (e.g. it has to provide a
250timeout). See the description for the C<$prepare_cb> argument of 258timeout). See the description for the C<$prepare_cb> argument of
251C<AnyEvent::Socket::tcp_connect> for details. 259C<AnyEvent::Socket::tcp_connect> for details.
252 260
255In even rarer cases you want total control over how AnyEvent::HTTP 263In even rarer cases you want total control over how AnyEvent::HTTP
256establishes connections. Normally it uses L<AnyEvent::Socket::tcp_connect> 264establishes connections. Normally it uses L<AnyEvent::Socket::tcp_connect>
257to do this, but you can provide your own C<tcp_connect> function - 265to do this, but you can provide your own C<tcp_connect> function -
258obviously, it has to follow the same calling conventions, except that it 266obviously, it has to follow the same calling conventions, except that it
259may always return a connection guard object. 267may always return a connection guard object.
268
269The connections made by this hook will be treated as equivalent to
270connections made the built-in way, specifically, they will be put into
271and taken from the persistent connection cache. If your C<$tcp_connect>
272function is incompatible with this kind of re-use, consider switching off
273C<persistent> connections and/or providing a C<session> identifier.
260 274
261There are probably lots of weird uses for this function, starting from 275There are probably lots of weird uses for this function, starting from
262tracing the hosts C<http_request> actually tries to connect, to (inexact 276tracing the hosts C<http_request> actually tries to connect, to (inexact
263but fast) host => IP address caching or even socks protocol support. 277but fast) host => IP address caching or even socks protocol support.
264 278
334=item persistent => $boolean 348=item persistent => $boolean
335 349
336Try to create/reuse a persistent connection. When this flag is set 350Try to create/reuse a persistent connection. When this flag is set
337(default: true for idempotent requests, false for all others), then 351(default: true for idempotent requests, false for all others), then
338C<http_request> tries to re-use an existing (previously-created) 352C<http_request> tries to re-use an existing (previously-created)
339persistent connection to the host and, failing that, tries to create a new 353persistent connection to the same host (i.e. identical URL scheme, hostname,
340one. 354port and session) and, failing that, tries to create a new one.
341 355
342Requests failing in certain ways will be automatically retried once, which 356Requests failing in certain ways will be automatically retried once, which
343is dangerous for non-idempotent requests, which is why it defaults to off 357is dangerous for non-idempotent requests, which is why it defaults to off
344for them. The reason for this is because the bozos who designed HTTP/1.1 358for them. The reason for this is because the bozos who designed HTTP/1.1
345made it impossible to distinguish between a fatal error and a normal 359made it impossible to distinguish between a fatal error and a normal
446 460
447# expire cookies 461# expire cookies
448sub cookie_jar_expire($;$) { 462sub cookie_jar_expire($;$) {
449 my ($jar, $session_end) = @_; 463 my ($jar, $session_end) = @_;
450 464
451 %$jar = () if $jar->{version} != 1; 465 %$jar = () if $jar->{version} != 2;
452 466
453 my $anow = AE::now; 467 my $anow = AE::now;
454 468
455 while (my ($chost, $paths) = each %$jar) { 469 while (my ($chost, $paths) = each %$jar) {
456 next unless ref $paths; 470 next unless ref $paths;
476 490
477# extract cookies from jar 491# extract cookies from jar
478sub cookie_jar_extract($$$$) { 492sub cookie_jar_extract($$$$) {
479 my ($jar, $scheme, $host, $path) = @_; 493 my ($jar, $scheme, $host, $path) = @_;
480 494
481 %$jar = () if $jar->{version} != 1; 495 %$jar = () if $jar->{version} != 2;
496
497 $host = AnyEvent::Util::idn_to_ascii $host
498 if $host =~ /[^\x00-\x7f]/;
482 499
483 my @cookies; 500 my @cookies;
484 501
485 while (my ($chost, $paths) = each %$jar) { 502 while (my ($chost, $paths) = each %$jar) {
486 next unless ref $paths; 503 next unless ref $paths;
487 504
488 if ($chost =~ /^\./) { 505 # exact match or suffix including . match
489 next unless $chost eq substr $host, -length $chost; 506 $chost eq $host or ".$chost" eq substr $host, -1 - length $chost
490 } elsif ($chost =~ /\./) {
491 next unless $chost eq $host;
492 } else {
493 next; 507 or next;
494 }
495 508
496 while (my ($cpath, $cookies) = each %$paths) { 509 while (my ($cpath, $cookies) = each %$paths) {
497 next unless $cpath eq substr $path, 0, length $cpath; 510 next unless $cpath eq substr $path, 0, length $cpath;
498 511
499 while (my ($cookie, $kv) = each %$cookies) { 512 while (my ($cookie, $kv) = each %$cookies) {
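
The host test that revision 1.134 introduces in cookie_jar_extract (exact match, or suffix match on a "." boundary) can be sketched as a standalone function. The name cookie_host_matches is hypothetical and not part of AnyEvent::HTTP, and the explicit length guard is an addition of this sketch rather than something the module does:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::HTTP): returns true when a
# cookie stored under $chost should be sent to $host.
sub cookie_host_matches {
    my ($chost, $host) = @_;

    # exact match
    return 1 if $chost eq $host;

    # suffix match on a label boundary: $host must end in ".$chost"
    my $suffix = ".$chost";
    my $n      = length $suffix;

    # guard added for this sketch: a shorter host can never match
    return 0 if length($host) < $n;

    return $suffix eq substr($host, -$n) ? 1 : 0;
}

print cookie_host_matches("example.com", "www.example.com")
    ? "match\n"
    : "no match\n";
```

Requiring the leading "." in the suffix is what keeps a host such as badexample.com from receiving cookies stored for example.com.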
520} 533}
521 534
522# parse set_cookie header into jar 535# parse set_cookie header into jar
523sub cookie_jar_set_cookie($$$$) { 536sub cookie_jar_set_cookie($$$$) {
524 my ($jar, $set_cookie, $host, $date) = @_; 537 my ($jar, $set_cookie, $host, $date) = @_;
538
539 %$jar = () if $jar->{version} != 2;
525 540
526 my $anow = int AE::now; 541 my $anow = int AE::now;
527 my $snow; # server-now 542 my $snow; # server-now
528 543
529 for ($set_cookie) { 544 for ($set_cookie) {
575 590
576 my $cdom; 591 my $cdom;
577 my $cpath = (delete $kv{path}) || "/"; 592 my $cpath = (delete $kv{path}) || "/";
578 593
579 if (exists $kv{domain}) { 594 if (exists $kv{domain}) {
580 $cdom = delete $kv{domain}; 595 $cdom = $kv{domain};
581 596
582 $cdom =~ s/^\.?/./; # make sure it starts with a "." 597 $cdom =~ s/^\.?/./; # make sure it starts with a "."
583 598
584 next if $cdom =~ /\.$/; 599 next if $cdom =~ /\.$/;
585 600
586 # this is not rfc-like and not netscape-like. go figure. 601 # this is not rfc-like and not netscape-like. go figure.
587 my $ndots = $cdom =~ y/.//; 602 my $ndots = $cdom =~ y/.//;
588 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2); 603 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2);
604
605 $cdom = substr $cdom, 1; # remove initial .
589 } else { 606 } else {
590 $cdom = $host; 607 $cdom = $host;
591 } 608 }
592 609
593 # store it 610 # store it
594 $jar->{version} = 1; 611 $jar->{version} = 2;
595 $jar->{lc $cdom}{$cpath}{$name} = \%kv; 612 $jar->{lc $cdom}{$cpath}{$name} = \%kv;
596 613
597 redo if /\G\s*,/gc; 614 redo if /\G\s*,/gc;
598 } 615 }
599} 616}
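
The domain checks in cookie_jar_set_cookie above can be read as a small normalisation step. The following is a minimal sketch under the assumption that only the Domain attribute is of interest; normalize_cookie_domain is a hypothetical name, and the lowercasing is folded in here although the module applies it when storing:

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the checks shown above: returns the
# key a version 2 jar would use for a Domain attribute, or undef if
# the attribute is rejected.
sub normalize_cookie_domain {
    my ($cdom) = @_;

    $cdom =~ s/^\.?/./;              # make sure it starts with a "."
    return undef if $cdom =~ /\.$/;  # reject a trailing dot

    # two-letter.two-letter suffixes such as ".co.uk" need one more dot
    my $ndots = $cdom =~ y/.//;
    return undef
        if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2);

    # version 2 jars store the name without the initial "."
    return lc(substr($cdom, 1));
}

for my $d (".Example.COM", "co.uk", "example.co.uk", "com") {
    my $key = normalize_cookie_domain($d);
    printf "%-15s => %s\n", $d, defined $key ? $key : "(rejected)";
}
```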
689 706
690 $cb->(undef, $hdr); 707 $cb->(undef, $hdr);
691 () 708 ()
692} 709}
693 710
711our %IDEMPOTENT = (
712 DELETE => 1,
713 GET => 1,
714 HEAD => 1,
715 OPTIONS => 1,
716 PUT => 1,
717 TRACE => 1,
718
719 ACL => 1,
720 "BASELINE-CONTROL" => 1,
721 BIND => 1,
722 CHECKIN => 1,
723 CHECKOUT => 1,
724 COPY => 1,
725 LABEL => 1,
726 LINK => 1,
727 MERGE => 1,
728 MKACTIVITY => 1,
729 MKCALENDAR => 1,
730 MKCOL => 1,
731 MKREDIRECTREF => 1,
732 MKWORKSPACE => 1,
733 MOVE => 1,
734 ORDERPATCH => 1,
735 PROPFIND => 1,
736 PROPPATCH => 1,
737 REBIND => 1,
738 REPORT => 1,
739 SEARCH => 1,
740 UNBIND => 1,
741 UNCHECKOUT => 1,
742 UNLINK => 1,
743 UNLOCK => 1,
744 UPDATE => 1,
745 UPDATEREDIRECTREF => 1,
746 "VERSION-CONTROL" => 1,
747);
748
694sub http_request($$@) { 749sub http_request($$@) {
695 my $cb = pop; 750 my $cb = pop;
696 my ($method, $url, %arg) = @_; 751 my ($method, $url, %arg) = @_;
697 752
698 my %hdr; 753 my %hdr;
727 782
728 my $uport = $uscheme eq "http" ? 80 783 my $uport = $uscheme eq "http" ? 80
729 : $uscheme eq "https" ? 443 784 : $uscheme eq "https" ? 443
730 : return $cb->(undef, { @pseudo, Status => 599, Reason => "Only http and https URL schemes supported" }); 785 : return $cb->(undef, { @pseudo, Status => 599, Reason => "Only http and https URL schemes supported" });
731 786
732 $uauthority =~ /^(?: .*\@ )? ([^\@:]+) (?: : (\d+) )?$/x 787 $uauthority =~ /^(?: .*\@ )? ([^\@]+?) (?: : (\d+) )?$/x
733 or return $cb->(undef, { @pseudo, Status => 599, Reason => "Unparsable URL" }); 788 or return $cb->(undef, { @pseudo, Status => 599, Reason => "Unparsable URL" });
734 789
735 my $uhost = lc $1; 790 my $uhost = lc $1;
736 $uport = $2 if defined $2; 791 $uport = $2 if defined $2;
737 792
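
The revised authority pattern no longer excludes ":" from the host part and instead matches the host lazily. A sketch of its effect, using a hypothetical parse_authority wrapper; that the change was made for hosts containing colons (such as bracketed IPv6 literals) is an assumption, not something the diff states:

```perl
use strict;
use warnings;

# Hypothetical helper: split a URL authority into (host, port) using the
# pattern from revision 1.134. Returns an empty list if it does not parse.
sub parse_authority {
    my ($authority, $default_port) = @_;

    $authority =~ /^(?: .*\@ )? ([^\@]+?) (?: : (\d+) )?$/x
        or return;

    my ($host, $port) = ($1, $2);

    return (lc($host), defined $port ? $port : $default_port);
}

for my $a ('user:pw@Example.COM:8080', 'example.com', '[::1]:8080') {
    my ($host, $port) = parse_authority($a, 80);
    print "$a => host=$host port=$port\n";
}
```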
773 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"}; 828 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"};
774 829
775 $hdr{"content-length"} = length $arg{body} 830 $hdr{"content-length"} = length $arg{body}
776 if length $arg{body} || $method ne "GET"; 831 if length $arg{body} || $method ne "GET";
777 832
778 my $idempotent = $method =~ /^(?:GET|HEAD|PUT|DELETE|OPTIONS|TRACE)$/; 833 my $idempotent = $IDEMPOTENT{$method};
779 834
780 # default value for keepalive is true iff the request is for an idempotent method 835 # default value for keepalive is true iff the request is for an idempotent method
781 my $persistent = exists $arg{persistent} ? !!$arg{persistent} : $idempotent; 836 my $persistent = exists $arg{persistent} ? !!$arg{persistent} : $idempotent;
782 my $keepalive = exists $arg{keepalive} ? !!$arg{keepalive} : !$proxy; 837 my $keepalive = exists $arg{keepalive} ? !!$arg{keepalive} : !$proxy;
783 my $was_persistent; # true if this is actually a recycled connection 838 my $was_persistent; # true if this is actually a recycled connection
784 839
785 # the key to use in the keepalive cache 840 # the key to use in the keepalive cache
786 my $ka_key = "$uscheme\x00$uhost\x00$uport\x00$arg{sessionid}"; 841 my $ka_key = "$uscheme\x00$uhost\x00$uport\x00$arg{sessionid}";
787 842
788 $hdr{connection} = ($persistent ? $keepalive ? "keep-alive " : "" : "close ") . "Te"; #1.1 843 $hdr{connection} = ($persistent ? $keepalive ? "keep-alive, " : "" : "close, ") . "Te"; #1.1
789 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1 844 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1
790 845
791 my %state = (connect_guard => 1); 846 my %state = (connect_guard => 1);
792 847
793 my $ae_error = 595; # connecting 848 my $ae_error = 595; # connecting
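
The Connection header change above replaces space-separated tokens with a comma-separated list. As a quick illustration, the same expression can be isolated in a hypothetical connection_header function:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the expression from revision 1.134.
sub connection_header {
    my ($persistent, $keepalive) = @_;

    return ($persistent ? $keepalive ? "keep-alive, " : "" : "close, ") . "Te";
}

print connection_header(1, 1), "\n";   # persistent, keepalive requested
print connection_header(1, 0), "\n";   # persistent, no keepalive token
print connection_header(0, 1), "\n";   # not persistent
```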
803 # send request 858 # send request
804 $hdl->push_write ( 859 $hdl->push_write (
805 "$method $rpath HTTP/1.1\015\012" 860 "$method $rpath HTTP/1.1\015\012"
806 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr) 861 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
807 . "\015\012" 862 . "\015\012"
808 . (delete $arg{body}) 863 . $arg{body}
809 ); 864 );
810 865
811 # return if error occured during push_write() 866 # return if error occurred during push_write()
812 return unless %state; 867 return unless %state;
813 868
814 # reduce memory usage, save a kitten, also re-use it for the response headers. 869 # reduce memory usage, save a kitten, also re-use it for the response headers.
815 %hdr = (); 870 %hdr = ();
816 871
843 898
844 %hdr = (%$hdr, @pseudo); 899 %hdr = (%$hdr, @pseudo);
845 } 900 }
846 901
847 # redirect handling 902 # redirect handling
848 # microsoft and other shitheads don't give a shit for following standards, 903 # relative uri handling forced by microsoft and other shitheads.
849 # try to support some common forms of broken Location headers. 904 # we give our best and fall back to URI if available.
850 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) { 905 if (exists $hdr{location}) {
906 my $loc = $hdr{location};
907
908 if ($loc =~ m%^//%) { # //
909 $loc = "$uscheme:$loc";
910
911 } elsif ($loc eq "") {
912 $loc = $url;
913
914 } elsif ($loc !~ /^(?: $ | [^:\/?\#]+ : )/x) { # anything "simple"
851 $hdr{location} =~ s/^\.\/+//; 915 $loc =~ s/^\.\/+//;
852 916
853 my $url = "$rscheme://$uhost:$uport"; 917 if ($loc !~ m%^[.?#]%) {
918 my $prefix = "$uscheme://$uauthority";
854 919
855 unless ($hdr{location} =~ s/^\///) { 920 unless ($loc =~ s/^\///) {
856 $url .= $upath; 921 $prefix .= $upath;
857 $url =~ s/\/[^\/]*$//; 922 $prefix =~ s/\/[^\/]*$//;
923 }
924
925 $loc = "$prefix/$loc";
926
927 } elsif (eval { require URI }) { # uri
928 $loc = URI->new_abs ($loc, $url)->as_string;
929
930 } else {
931 return _error %state, $cb, { @pseudo, Status => 599, Reason => "Cannot parse Location (URI module missing)" };
932 #$hdr{Status} = 599;
933 #$hdr{Reason} = "Unparsable Redirect (URI module missing)";
934 #$recurse = 0;
935 }
858 } 936 }
859 937
860 $hdr{location} = "$url/$hdr{location}"; 938 $hdr{location} = $loc;
861 } 939 }
862 940
863 my $redirect; 941 my $redirect;
864 942
865 if ($recurse) { 943 if ($recurse) {
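
The Location handling above distinguishes several forms of redirect target. A minimal sketch of those branches follows; resolve_location is a hypothetical name, and where the module falls back to the optional URI module this sketch simply returns undef:

```perl
use strict;
use warnings;

# Hypothetical helper following the branches shown above. Returns the
# absolute URL, or undef where the module would need the URI module.
sub resolve_location {
    my ($uscheme, $uauthority, $upath, $url, $loc) = @_;

    return "$uscheme:$loc" if $loc =~ m%^//%;          # scheme-relative
    return $url            if $loc eq "";              # empty: same URL
    return $loc            if $loc =~ /^[^:\/?\#]+:/;  # already absolute

    $loc =~ s/^\.\/+//;                                # drop a leading "./"
    return undef if $loc =~ m%^[.?#]%;                 # "../x", "?q", "#f"

    my $prefix = "$uscheme://$uauthority";

    unless ($loc =~ s/^\///) {                         # path-relative
        $prefix .= $upath;
        $prefix =~ s/\/[^\/]*$//;                      # strip last segment
    }

    return "$prefix/$loc";
}

my @base = ("http", "example.com", "/a/b.html", "http://example.com/a/b.html");

for my $loc ("c.html", "/x", "//cdn.example.net/y", "../up") {
    my $abs = resolve_location(@base, $loc);
    printf "%-22s => %s\n", $loc, defined $abs ? $abs : "(needs URI)";
}
```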
867 945
868 # industry standard is to redirect POST as GET for 946 # industry standard is to redirect POST as GET for
869 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1. 947 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1.
870 # also, the UA should ask the user for 301 and 307 and POST, 948 # also, the UA should ask the user for 301 and 307 and POST,
871 # industry standard seems to be to simply follow. 949 # industry standard seems to be to simply follow.
872 # we go with the industry standard. 950 # we go with the industry standard. 308 is defined
951 # by rfc7538
873 if ($status == 301 or $status == 302 or $status == 303) { 952 if ($status == 301 or $status == 302 or $status == 303) {
953 $redirect = 1;
874 # HTTP/1.1 is unclear on how to mutate the method 954 # HTTP/1.1 is unclear on how to mutate the method
875 $method = "GET" unless $method eq "HEAD"; 955 unless ($method eq "HEAD") {
876 $redirect = 1; 956 $method = "GET";
957 delete $arg{body};
958 }
877 } elsif ($status == 307) { 959 } elsif ($status == 307 or $status == 308) {
878 $redirect = 1; 960 $redirect = 1;
879 } 961 }
880 } 962 }
881 963
882 my $finish = sub { # ($data, $err_status, $err_reason[, $persistent]) 964 my $finish = sub { # ($data, $err_status, $err_reason[, $persistent])
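
The status handling above can be summarised as a small decision function. This is a sketch only; redirect_action and its three-element return value are hypothetical and exist purely to make the rule visible:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (follow redirect?, method to use, drop body?)
# following the rules in revision 1.134.
sub redirect_action {
    my ($status, $method) = @_;

    if ($status == 301 or $status == 302 or $status == 303) {
        # anything but HEAD is reissued as a GET without a request body
        return $method eq "HEAD" ? (1, "HEAD", 0) : (1, "GET", 1);
    } elsif ($status == 307 or $status == 308) {
        # method and body are preserved
        return (1, $method, 0);
    }

    return (0, $method, 0);
}

for my $case ([302, "POST"], [308, "POST"], [303, "HEAD"], [200, "GET"]) {
    print join(" ", @$case), " => ", join(",", redirect_action(@$case)), "\n";
}
```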
958 $finish->(delete $state{handle}); 1040 $finish->(delete $state{handle});
959 1041
960 } elsif ($chunked) { 1042 } elsif ($chunked) {
961 my $cl = 0; 1043 my $cl = 0;
962 my $body = ""; 1044 my $body = "";
963 my $on_body = $arg{on_body} || sub { $body .= shift; 1 }; 1045 my $on_body = (!$redirect && $arg{on_body}) || sub { $body .= shift; 1 };
964 1046
965 $state{read_chunk} = sub { 1047 $state{read_chunk} = sub {
966 $_[1] =~ /^([0-9a-fA-F]+)/ 1048 $_[1] =~ /^([0-9a-fA-F]+)/
967 or return $finish->(undef, $ae_error => "Garbled chunked transfer encoding"); 1049 or return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
968 1050
1001 } 1083 }
1002 }; 1084 };
1003 1085
1004 $_[0]->push_read (line => $state{read_chunk}); 1086 $_[0]->push_read (line => $state{read_chunk});
1005 1087
1006 } elsif ($arg{on_body}) { 1088 } elsif (!$redirect && $arg{on_body}) {
1007 if (defined $len) { 1089 if (defined $len) {
1008 $_[0]->on_read (sub { 1090 $_[0]->on_read (sub {
1009 $len -= length $_[0]{rbuf}; 1091 $len -= length $_[0]{rbuf};
1010 1092
1011 $arg{on_body}(delete $_[0]{rbuf}, \%hdr) 1093 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
1050 _destroy_state %state; 1132 _destroy_state %state;
1051 1133
1052 %state = (); 1134 %state = ();
1053 $state{recurse} = 1135 $state{recurse} =
1054 http_request ( 1136 http_request (
1055 $method => $url, 1137 $method => $url,
1056 %arg, 1138 %arg,
1139 recurse => $recurse - 1,
1057 keepalive => 0, 1140 persistent => 0,
1058 sub { 1141 sub {
1059 %state = (); 1142 %state = ();
1060 &$cb 1143 &$cb
1061 } 1144 }
1062 ); 1145 );
1108 1191
1109 # now handle proxy-CONNECT method 1192 # now handle proxy-CONNECT method
1110 if ($proxy && $uscheme eq "https") { 1193 if ($proxy && $uscheme eq "https") {
1111 # oh dear, we have to wrap it into a connect request 1194 # oh dear, we have to wrap it into a connect request
1112 1195
1196 my $auth = exists $hdr{"proxy-authorization"}
1197 ? "proxy-authorization: " . (delete $hdr{"proxy-authorization"}) . "\015\012"
1198 : "";
1199
1113 # maybe re-use $uauthority with patched port? 1200 # maybe re-use $uauthority with patched port?
1114 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012\015\012"); 1201 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012$auth\015\012");
1115 $state{handle}->push_read (line => $qr_nlnl, sub { 1202 $state{handle}->push_read (line => $qr_nlnl, sub {
1116 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix 1203 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
1117 or return _error %state, $cb, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" }; 1204 or return _error %state, $cb, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" };
1118 1205
1119 if ($2 == 200) { 1206 if ($2 == 200) {
1122 } else { 1209 } else {
1123 _error %state, $cb, { @pseudo, Status => $2, Reason => $3 }; 1210 _error %state, $cb, { @pseudo, Status => $2, Reason => $3 };
1124 } 1211 }
1125 }); 1212 });
1126 } else { 1213 } else {
1214 delete $hdr{"proxy-authorization"} unless $proxy;
1215
1127 $handle_actual_request->(); 1216 $handle_actual_request->();
1128 } 1217 }
1129 }; 1218 };
1130 1219
1131 _get_slot $uhost, sub { 1220 _get_slot $uhost, sub {
1137 # on a keepalive request (in theory, this should be a separate config option). 1226 # on a keepalive request (in theory, this should be a separate config option).
1138 if ($persistent && $KA_CACHE{$ka_key}) { 1227 if ($persistent && $KA_CACHE{$ka_key}) {
1139 $was_persistent = 1; 1228 $was_persistent = 1;
1140 1229
1141 $state{handle} = ka_fetch $ka_key; 1230 $state{handle} = ka_fetch $ka_key;
1142 $state{handle}->destroyed 1231# $state{handle}->destroyed
1143 and die "AnyEvent::HTTP: unexpectedly got a destructed handle (1), please report.";#d# 1232# and die "AnyEvent::HTTP: unexpectedly got a destructed handle (1), please report.";#d#
1144 $prepare_handle->(); 1233 $prepare_handle->();
1145 $state{handle}->destroyed 1234# $state{handle}->destroyed
1146 and die "AnyEvent::HTTP: unexpectedly got a destructed handle (2), please report.";#d# 1235# and die "AnyEvent::HTTP: unexpectedly got a destructed handle (2), please report.";#d#
1147 $handle_actual_request->(); 1236 $handle_actual_request->();
1148 1237
1149 } else { 1238 } else {
1150 my $tcp_connect = $arg{tcp_connect} 1239 my $tcp_connect = $arg{tcp_connect}
1151 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect }; 1240 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
1193Sets the default proxy server to use. The proxy-url must begin with a 1282Sets the default proxy server to use. The proxy-url must begin with a
1194string of the form C<http://host:port>, croaks otherwise. 1283string of the form C<http://host:port>, croaks otherwise.
1195 1284
1196To clear an already-set proxy, use C<undef>. 1285To clear an already-set proxy, use C<undef>.
1197 1286
1198When AnyEvent::HTTP is laoded for the first time it will query the 1287When AnyEvent::HTTP is loaded for the first time it will query the
1199default proxy from the operating system, currently by looking at 1288default proxy from the operating system, currently by looking at
1200C<$ENV{http_proxy}>. 1289C<$ENV{http_proxy}>.
1201 1290
1202=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end] 1291=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
1203 1292
1205C<$session_end> is given and true, then additionally remove all session 1294C<$session_end> is given and true, then additionally remove all session
1206cookies. 1295cookies.
1207 1296
1208You should call this function (with a true C<$session_end>) before you 1297You should call this function (with a true C<$session_end>) before you
1209save cookies to disk, and you should call this function after loading them 1298save cookies to disk, and you should call this function after loading them
1210again. If you have a long-running program you can additonally call this 1299again. If you have a long-running program you can additionally call this
1211function from time to time. 1300function from time to time.
1212 1301
1213A cookie jar is initially an empty hash-reference that is managed by this 1302A cookie jar is initially an empty hash-reference that is managed by this
1214module. It's format is subject to change, but currently it is like this: 1303module. Its format is subject to change, but currently it is as follows:
1215 1304
1216The key C<version> has to contain C<1>, otherwise the hash gets 1305The key C<version> has to contain C<2>, otherwise the hash gets
1217emptied. All other keys are hostnames or IP addresses pointing to 1306cleared. All other keys are hostnames or IP addresses pointing to
1218hash-references. The key for these inner hash references is the 1307hash-references. The key for these inner hash references is the
1219server path for which this cookie is meant, and the values are again 1308server path for which this cookie is meant, and the values are again
1220hash-references. The keys of those hash-references is the cookie name, and 1309hash-references. Each key of those hash-references is a cookie name, and
1221the value, you guessed it, is another hash-reference, this time with the 1310the value, you guessed it, is another hash-reference, this time with the
1222key-value pairs from the cookie, except for C<expires> and C<max-age>, 1311key-value pairs from the cookie, except for C<expires> and C<max-age>,
1223which have been replaced by a C<_expires> key that contains the cookie 1312which have been replaced by a C<_expires> key that contains the cookie
1224expiry timestamp. 1313expiry timestamp. Session cookies are indicated by not having an
1314C<_expires> key.
1225 1315
1226Here is an example of a cookie jar with a single cookie, so you have a 1316Here is an example of a cookie jar with a single cookie, so you have a
1227chance of understanding the above paragraph: 1317chance of understanding the above paragraph:
1228 1318
1229 { 1319 {
1230 version => 1, 1320 version => 2,
1231 "10.0.0.1" => { 1321 "10.0.0.1" => {
1232 "/" => { 1322 "/" => {
1233 "mythweb_id" => { 1323 "mythweb_id" => {
1234 _expires => 1293917923, 1324 _expires => 1293917923,
1235 value => "ooRung9dThee3ooyXooM1Ohm", 1325 value => "ooRung9dThee3ooyXooM1Ohm",
1253 1343
1254The default value for the C<recurse> request parameter (default: C<10>). 1344The default value for the C<recurse> request parameter (default: C<10>).
1255 1345
1256=item $AnyEvent::HTTP::TIMEOUT 1346=item $AnyEvent::HTTP::TIMEOUT
1257 1347
1258The default timeout for conenction operations (default: C<300>). 1348The default timeout for connection operations (default: C<300>).
1259 1349
1260=item $AnyEvent::HTTP::USERAGENT 1350=item $AnyEvent::HTTP::USERAGENT
1261 1351
1262The default value for the C<User-Agent> header (the default is 1352The default value for the C<User-Agent> header (the default is
1263C<Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>). 1353C<Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>).
1264 1354
1265=item $AnyEvent::HTTP::MAX_PER_HOST 1355=item $AnyEvent::HTTP::MAX_PER_HOST
1266 1356
1267The maximum number of concurrent connections to the same host (identified 1357The maximum number of concurrent connections to the same host (identified
1268by the hostname). If the limit is exceeded, then the additional requests 1358by the hostname). If the limit is exceeded, then additional requests
1269are queued until previous connections are closed. Both persistent and 1359are queued until previous connections are closed. Both persistent and
1270non-persistent connections are counted in this limit. 1360non-persistent connections are counted in this limit.
1271 1361
1272The default value for this is C<4>, and it is highly advisable to not 1362The default value for this is C<4>, and it is highly advisable to not
1273increase it much. 1363increase it much.
1274 1364
1275For comparison: the RFCs recommend 4 non-persistent or 2 persistent 1365For comparison: the RFCs recommend 4 non-persistent or 2 persistent
1276connections, older browsers used 2, newers (such as firefox 3) typically 1366connections, older browsers used 2, newer ones (such as firefox 3)
1277use 6, and Opera uses 8 because like, they have the fastest browser and 1367typically use 6, and Opera uses 8 because like, they have the fastest
1278give a shit for everybody else on the planet. 1368browser and give a shit for everybody else on the planet.
1279 1369
1280=item $AnyEvent::HTTP::PERSISTENT_TIMEOUT 1370=item $AnyEvent::HTTP::PERSISTENT_TIMEOUT
1281 1371
1282The time after which idle persistent conenctions get closed by 1372The time after which idle persistent connections get closed by
1283AnyEvent::HTTP (default: C<3>). 1373AnyEvent::HTTP (default: C<3>).
1284 1374
1285=item $AnyEvent::HTTP::ACTIVE 1375=item $AnyEvent::HTTP::ACTIVE
1286 1376
1287The number of active connections. This is not the number of currently 1377The number of active connections. This is not the number of currently
1328 # other formats fail in the loop below 1418 # other formats fail in the loop below
1329 1419
1330 for (0..11) { 1420 for (0..11) {
1331 if ($m eq $month[$_]) { 1421 if ($m eq $month[$_]) {
1332 require Time::Local; 1422 require Time::Local;
1333 return Time::Local::timegm ($S, $M, $H, $d, $_, $y); 1423 return eval { Time::Local::timegm ($S, $M, $H, $d, $_, $y) };
1334 } 1424 }
1335 } 1425 }
1336 1426
1337 undef 1427 undef
1338} 1428}
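
Wrapping timegm in eval, as revision 1.134 does, turns an exception for an impossible date into an undef return value. A small sketch of that behaviour, with safe_timegm as a hypothetical wrapper name:

```perl
use strict;
use warnings;
use Time::Local ();

# Hypothetical wrapper: returns epoch seconds, or undef when timegm
# rejects the date (for example February 31st).
sub safe_timegm {
    my ($S, $M, $H, $d, $mon, $y) = @_;

    return scalar eval { Time::Local::timegm ($S, $M, $H, $d, $mon, $y) };
}

for my $date ([0, 0, 0, 1, 0, 1970], [0, 0, 0, 31, 1, 2011]) {
    my $t = safe_timegm(@$date);
    print defined $t ? "epoch $t\n" : "invalid date\n";
}
```

Without the eval, a malformed date in a server header would propagate as a die out of parse_date instead of yielding undef.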
1352 set_proxy $ENV{http_proxy}; 1442 set_proxy $ENV{http_proxy};
1353}; 1443};
1354 1444
1355=head2 SHOWCASE 1445=head2 SHOWCASE
1356 1446
1357This section contaisn some more elaborate "real-world" examples or code 1447This section contains some more elaborate "real-world" examples or code
1358snippets. 1448snippets.
1359 1449
1360=head2 HTTP/1.1 FILE DOWNLOAD 1450=head2 HTTP/1.1 FILE DOWNLOAD
1361 1451
1362Downloading files with HTTP can be quite tricky, especially when something 1452Downloading files with HTTP can be quite tricky, especially when something
1366last modified time to check for file content changes, and works with many 1456last modified time to check for file content changes, and works with many
1367HTTP/1.0 servers as well, and usually falls back to a complete re-download 1457HTTP/1.0 servers as well, and usually falls back to a complete re-download
1368on older servers. 1458on older servers.
1369 1459
1370It calls the completion callback with either C<undef>, which means a 1460It calls the completion callback with either C<undef>, which means a
1371nonretryable error occured, C<0> when the download was partial and should 1461nonretryable error occurred, C<0> when the download was partial and should
1372be retried, and C<1> if it was successful. 1462be retried, and C<1> if it was successful.
1373 1463
1374 use AnyEvent::HTTP; 1464 use AnyEvent::HTTP;
1375 1465
1376 sub download($$$) { 1466 sub download($$$) {
1380 or die "$file: $!"; 1470 or die "$file: $!";
1381 1471
1382 my %hdr; 1472 my %hdr;
1383 my $ofs = 0; 1473 my $ofs = 0;
1384 1474
1385 warn stat $fh;
1386 warn -s _;
1387 if (stat $fh and -s _) { 1475 if (stat $fh and -s _) {
1388 $ofs = -s _; 1476 $ofs = -s _;
1389 warn "-s is ", $ofs; 1477 warn "-s is ", $ofs;
1390 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9]; 1478 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9];
1391 $hdr{"range"} = "bytes=$ofs-"; 1479 $hdr{"range"} = "bytes=$ofs-";
1419 my (undef, $hdr) = @_; 1507 my (undef, $hdr) = @_;
1420 1508
1421 my $status = $hdr->{Status}; 1509 my $status = $hdr->{Status};
1422 1510
1423 if (my $time = AnyEvent::HTTP::parse_date $hdr->{"last-modified"}) { 1511 if (my $time = AnyEvent::HTTP::parse_date $hdr->{"last-modified"}) {
1424 utime $fh, $time, $time; 1512 utime $time, $time, $fh;
1425 } 1513 }
1426 1514
1427 if ($status == 200 || $status == 206 || $status == 416) { 1515 if ($status == 200 || $status == 206 || $status == 416) {
1428 # download ok || resume ok || file already fully downloaded 1516 # download ok || resume ok || file already fully downloaded
1429 $cb->(1, $hdr); 1517 $cb->(1, $hdr);
