ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/AnyEvent-HTTP/HTTP.pm
(Generate patch)

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.125 by root, Thu May 14 02:04:35 2015 UTC vs.
Revision 1.140 by root, Wed Mar 6 19:29:18 2024 UTC

4 4
5=head1 SYNOPSIS 5=head1 SYNOPSIS
6 6
7 use AnyEvent::HTTP; 7 use AnyEvent::HTTP;
8 8
9 http_get "http://www.nethype.de/", sub { print $_[1] }; 9 http_get "http://www.nethype.de/", sub {
10 my ($body, $hdr) = @_;
11 print "$hdr->{URL} Status: $hdr->{Status}\n";
12 print $body;
13 };
10 14
11 # ... do something else here 15 # ... do something else here
12 16
13=head1 DESCRIPTION 17=head1 DESCRIPTION
14 18
46use AnyEvent::Util (); 50use AnyEvent::Util ();
47use AnyEvent::Handle (); 51use AnyEvent::Handle ();
48 52
49use base Exporter::; 53use base Exporter::;
50 54
51our $VERSION = 2.22; 55our $VERSION = 2.25;
52 56
53our @EXPORT = qw(http_get http_post http_head http_request); 57our @EXPORT = qw(http_get http_post http_head http_request);
54 58
55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)"; 59our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)";
56our $MAX_RECURSE = 10; 60our $MAX_RECURSE = 10;
198C<AnyEvent::HTTP::set_proxy>). 202C<AnyEvent::HTTP::set_proxy>).
199 203
200Currently, if your proxy requires authorization, you have to specify an 204Currently, if your proxy requires authorization, you have to specify an
201appropriate "Proxy-Authorization" header in every request. 205appropriate "Proxy-Authorization" header in every request.
202 206
207Note that this module will prefer an existing persistent connection,
208even if that connection was made using another proxy. If you need to
209ensure that a new connection is made in this case, you can either force
210C<persistent> to false or e.g. use the proxy address in your C<sessionid>.
211
203=item body => $string 212=item body => $string
204 213
205The request body, usually empty. Will be sent as-is (future versions of 214The request body, usually empty. Will be sent as-is (future versions of
206this module might offer more options). 215this module might offer more options).
207 216
239The default for this option is C<low>, which could be interpreted as "give 248The default for this option is C<low>, which could be interpreted as "give
240me the page, no matter what". 249me the page, no matter what".
241 250
242See also the C<sessionid> parameter. 251See also the C<sessionid> parameter.
243 252
244=item session => $string 253=item sessionid => $string
245 254
246The module might reuse connections to the same host internally. Sometimes 255The module might reuse connections to the same host internally (regardless
247(e.g. when using TLS), you do not want to reuse connections from other 256of other settings, such as C<tcp_connect> or C<proxy>). Sometimes (e.g.
257when using TLS or a specfic proxy), you do not want to reuse connections
248sessions. This can be achieved by setting this parameter to some unique 258from other sessions. This can be achieved by setting this parameter to
249ID (such as the address of an object storing your state data, or the TLS 259some unique ID (such as the address of an object storing your state data
250context) - only connections using the same unique ID will be reused. 260or the TLS context, or the proxy IP) - only connections using the same
261unique ID will be reused.
251 262
252=item on_prepare => $callback->($fh) 263=item on_prepare => $callback->($fh)
253 264
254In rare cases you need to "tune" the socket before it is used to 265In rare cases you need to "tune" the socket before it is used to
255connect (for example, to bind it on a given IP address). This parameter 266connect (for example, to bind it on a given IP address). This parameter
263In even rarer cases you want total control over how AnyEvent::HTTP 274In even rarer cases you want total control over how AnyEvent::HTTP
264establishes connections. Normally it uses L<AnyEvent::Socket::tcp_connect> 275establishes connections. Normally it uses L<AnyEvent::Socket::tcp_connect>
265to do this, but you can provide your own C<tcp_connect> function - 276to do this, but you can provide your own C<tcp_connect> function -
266obviously, it has to follow the same calling conventions, except that it 277obviously, it has to follow the same calling conventions, except that it
267may always return a connection guard object. 278may always return a connection guard object.
279
280The connections made by this hook will be treated as equivalent to
281connections made the built-in way, specifically, they will be put into
282and taken from the persistent connection cache. If your C<$tcp_connect>
283function is incompatible with this kind of re-use, consider switching off
284C<persistent> connections and/or providing a C<sessionid> identifier.
268 285
269There are probably lots of weird uses for this function, starting from 286There are probably lots of weird uses for this function, starting from
270tracing the hosts C<http_request> actually tries to connect, to (inexact 287tracing the hosts C<http_request> actually tries to connect, to (inexact
271but fast) host => IP address caching or even socks protocol support. 288but fast) host => IP address caching or even socks protocol support.
272 289
342=item persistent => $boolean 359=item persistent => $boolean
343 360
344Try to create/reuse a persistent connection. When this flag is set 361Try to create/reuse a persistent connection. When this flag is set
345(default: true for idempotent requests, false for all others), then 362(default: true for idempotent requests, false for all others), then
346C<http_request> tries to re-use an existing (previously-created) 363C<http_request> tries to re-use an existing (previously-created)
347persistent connection to the host and, failing that, tries to create a new 364persistent connection to same host (i.e. identical URL scheme, hostname,
348one. 365port and sessionid) and, failing that, tries to create a new one.
349 366
350Requests failing in certain ways will be automatically retried once, which 367Requests failing in certain ways will be automatically retried once, which
351is dangerous for non-idempotent requests, which is why it defaults to off 368is dangerous for non-idempotent requests, which is why it defaults to off
352for them. The reason for this is because the bozos who designed HTTP/1.1 369for them. The reason for this is because the bozos who designed HTTP/1.1
353made it impossible to distinguish between a fatal error and a normal 370made it impossible to distinguish between a fatal error and a normal
354connection timeout, so you never know whether there was a problem with 371connection timeout, so you never know whether there was a problem with
355your request or not. 372your request or not.
356 373
357When reusing an existent connection, many parameters (such as TLS context) 374When reusing an existent connection, many parameters (such as TLS context)
358will be ignored. See the C<session> parameter for a workaround. 375will be ignored. See the C<sessionid> parameter for a workaround.
359 376
360=item keepalive => $boolean 377=item keepalive => $boolean
361 378
362Only used when C<persistent> is also true. This parameter decides whether 379Only used when C<persistent> is also true. This parameter decides whether
363C<http_request> tries to handshake a HTTP/1.0-style keep-alive connection 380C<http_request> tries to handshake a HTTP/1.0-style keep-alive connection
454 471
455# expire cookies 472# expire cookies
456sub cookie_jar_expire($;$) { 473sub cookie_jar_expire($;$) {
457 my ($jar, $session_end) = @_; 474 my ($jar, $session_end) = @_;
458 475
459 %$jar = () if $jar->{version} != 1; 476 %$jar = () if $jar->{version} != 2;
460 477
461 my $anow = AE::now; 478 my $anow = AE::now;
462 479
463 while (my ($chost, $paths) = each %$jar) { 480 while (my ($chost, $paths) = each %$jar) {
464 next unless ref $paths; 481 next unless ref $paths;
484 501
485# extract cookies from jar 502# extract cookies from jar
486sub cookie_jar_extract($$$$) { 503sub cookie_jar_extract($$$$) {
487 my ($jar, $scheme, $host, $path) = @_; 504 my ($jar, $scheme, $host, $path) = @_;
488 505
489 %$jar = () if $jar->{version} != 1; 506 %$jar = () if $jar->{version} != 2;
507
508 $host = AnyEvent::Util::idn_to_ascii $host
509 if $host =~ /[^\x00-\x7f]/;
490 510
491 my @cookies; 511 my @cookies;
492 512
493 while (my ($chost, $paths) = each %$jar) { 513 while (my ($chost, $paths) = each %$jar) {
494 next unless ref $paths; 514 next unless ref $paths;
495 515
496 if ($chost =~ /^\./) { 516 # exact match or suffix including . match
497 next unless $chost eq substr $host, -length $chost; 517 $chost eq $host or ".$chost" eq substr $host, -1 - length $chost
498 } elsif ($chost =~ /\./) {
499 next unless $chost eq $host;
500 } else {
501 next; 518 or next;
502 }
503 519
504 while (my ($cpath, $cookies) = each %$paths) { 520 while (my ($cpath, $cookies) = each %$paths) {
505 next unless $cpath eq substr $path, 0, length $cpath; 521 next unless $cpath eq substr $path, 0, length $cpath;
506 522
507 while (my ($cookie, $kv) = each %$cookies) { 523 while (my ($cookie, $kv) = each %$cookies) {
528} 544}
529 545
530# parse set_cookie header into jar 546# parse set_cookie header into jar
531sub cookie_jar_set_cookie($$$$) { 547sub cookie_jar_set_cookie($$$$) {
532 my ($jar, $set_cookie, $host, $date) = @_; 548 my ($jar, $set_cookie, $host, $date) = @_;
549
550 %$jar = () if $jar->{version} != 2;
533 551
534 my $anow = int AE::now; 552 my $anow = int AE::now;
535 my $snow; # server-now 553 my $snow; # server-now
536 554
537 for ($set_cookie) { 555 for ($set_cookie) {
583 601
584 my $cdom; 602 my $cdom;
585 my $cpath = (delete $kv{path}) || "/"; 603 my $cpath = (delete $kv{path}) || "/";
586 604
587 if (exists $kv{domain}) { 605 if (exists $kv{domain}) {
588 $cdom = delete $kv{domain}; 606 $cdom = $kv{domain};
589 607
590 $cdom =~ s/^\.?/./; # make sure it starts with a "." 608 $cdom =~ s/^\.?/./; # make sure it starts with a "."
591 609
592 next if $cdom =~ /\.$/; 610 next if $cdom =~ /\.$/;
593 611
594 # this is not rfc-like and not netscape-like. go figure. 612 # this is not rfc-like and not netscape-like. go figure.
595 my $ndots = $cdom =~ y/.//; 613 my $ndots = $cdom =~ y/.//;
596 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2); 614 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2);
615
616 $cdom = substr $cdom, 1; # remove initial .
597 } else { 617 } else {
598 $cdom = $host; 618 $cdom = $host;
599 } 619 }
600 620
601 # store it 621 # store it
602 $jar->{version} = 1; 622 $jar->{version} = 2;
603 $jar->{lc $cdom}{$cpath}{$name} = \%kv; 623 $jar->{lc $cdom}{$cpath}{$name} = \%kv;
604 624
605 redo if /\G\s*,/gc; 625 redo if /\G\s*,/gc;
606 } 626 }
607} 627}
700} 720}
701 721
702our %IDEMPOTENT = ( 722our %IDEMPOTENT = (
703 DELETE => 1, 723 DELETE => 1,
704 GET => 1, 724 GET => 1,
725 QUERY => 1,
705 HEAD => 1, 726 HEAD => 1,
706 OPTIONS => 1, 727 OPTIONS => 1,
707 PUT => 1, 728 PUT => 1,
708 TRACE => 1, 729 TRACE => 1,
709 730
721 MKCOL => 1, 742 MKCOL => 1,
722 MKREDIRECTREF => 1, 743 MKREDIRECTREF => 1,
723 MKWORKSPACE => 1, 744 MKWORKSPACE => 1,
724 MOVE => 1, 745 MOVE => 1,
725 ORDERPATCH => 1, 746 ORDERPATCH => 1,
747 PRI => 1,
726 PROPFIND => 1, 748 PROPFIND => 1,
727 PROPPATCH => 1, 749 PROPPATCH => 1,
728 REBIND => 1, 750 REBIND => 1,
729 REPORT => 1, 751 REPORT => 1,
730 SEARCH => 1, 752 SEARCH => 1,
895 # we give our best and fall back to URI if available. 917 # we give our best and fall back to URI if available.
896 if (exists $hdr{location}) { 918 if (exists $hdr{location}) {
897 my $loc = $hdr{location}; 919 my $loc = $hdr{location};
898 920
899 if ($loc =~ m%^//%) { # // 921 if ($loc =~ m%^//%) { # //
900 $loc = "$rscheme:$loc"; 922 $loc = "$uscheme:$loc";
901 923
902 } elsif ($loc eq "") { 924 } elsif ($loc eq "") {
903 $loc = $url; 925 $loc = $url;
904 926
905 } elsif ($loc !~ /^(?: $ | [^:\/?\#]+ : )/x) { # anything "simple" 927 } elsif ($loc !~ /^(?: $ | [^:\/?\#]+ : )/x) { # anything "simple"
906 $loc =~ s/^\.\/+//; 928 $loc =~ s/^\.\/+//;
907 929
908 if ($loc !~ m%^[.?#]%) { 930 if ($loc !~ m%^[.?#]%) {
909 my $prefix = "$rscheme://$uhost:$uport"; 931 my $prefix = "$uscheme://$uauthority";
910 932
911 unless ($loc =~ s/^\///) { 933 unless ($loc =~ s/^\///) {
912 $prefix .= $upath; 934 $prefix .= $upath;
913 $prefix =~ s/\/[^\/]*$//; 935 $prefix =~ s/\/[^\/]*$//;
914 } 936 }
1031 $finish->(delete $state{handle}); 1053 $finish->(delete $state{handle});
1032 1054
1033 } elsif ($chunked) { 1055 } elsif ($chunked) {
1034 my $cl = 0; 1056 my $cl = 0;
1035 my $body = ""; 1057 my $body = "";
1036 my $on_body = $arg{on_body} || sub { $body .= shift; 1 }; 1058 my $on_body = (!$redirect && $arg{on_body}) || sub { $body .= shift; 1 };
1037 1059
1038 $state{read_chunk} = sub { 1060 $state{read_chunk} = sub {
1039 $_[1] =~ /^([0-9a-fA-F]+)/ 1061 $_[1] =~ /^([0-9a-fA-F]+)/
1040 or return $finish->(undef, $ae_error => "Garbled chunked transfer encoding"); 1062 or return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
1041 1063
1074 } 1096 }
1075 }; 1097 };
1076 1098
1077 $_[0]->push_read (line => $state{read_chunk}); 1099 $_[0]->push_read (line => $state{read_chunk});
1078 1100
1079 } elsif ($arg{on_body}) { 1101 } elsif (!$redirect && $arg{on_body}) {
1080 if (defined $len) { 1102 if (defined $len) {
1081 $_[0]->on_read (sub { 1103 $_[0]->on_read (sub {
1082 $len -= length $_[0]{rbuf}; 1104 $len -= length $_[0]{rbuf};
1083 1105
1084 $arg{on_body}(delete $_[0]{rbuf}, \%hdr) 1106 $arg{on_body}(delete $_[0]{rbuf}, \%hdr)
1217 # on a keepalive request (in theory, this should be a separate config option). 1239 # on a keepalive request (in theory, this should be a separate config option).
1218 if ($persistent && $KA_CACHE{$ka_key}) { 1240 if ($persistent && $KA_CACHE{$ka_key}) {
1219 $was_persistent = 1; 1241 $was_persistent = 1;
1220 1242
1221 $state{handle} = ka_fetch $ka_key; 1243 $state{handle} = ka_fetch $ka_key;
1222 $state{handle}->destroyed 1244# $state{handle}->destroyed
1223 and die "AnyEvent::HTTP: unexpectedly got a destructed handle (1), please report.";#d# 1245# and die "AnyEvent::HTTP: unexpectedly got a destructed handle (1), please report.";#d#
1224 $prepare_handle->(); 1246 $prepare_handle->();
1225 $state{handle}->destroyed 1247# $state{handle}->destroyed
1226 and die "AnyEvent::HTTP: unexpectedly got a destructed handle (2), please report.";#d# 1248# and die "AnyEvent::HTTP: unexpectedly got a destructed handle (2), please report.";#d#
1249 $rpath = $upath;
1227 $handle_actual_request->(); 1250 $handle_actual_request->();
1228 1251
1229 } else { 1252 } else {
1230 my $tcp_connect = $arg{tcp_connect} 1253 my $tcp_connect = $arg{tcp_connect}
1231 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect }; 1254 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
1291function from time to time. 1314function from time to time.
1292 1315
1293A cookie jar is initially an empty hash-reference that is managed by this 1316A cookie jar is initially an empty hash-reference that is managed by this
1294module. Its format is subject to change, but currently it is as follows: 1317module. Its format is subject to change, but currently it is as follows:
1295 1318
1296The key C<version> has to contain C<1>, otherwise the hash gets 1319The key C<version> has to contain C<2>, otherwise the hash gets
1297emptied. All other keys are hostnames or IP addresses pointing to 1320cleared. All other keys are hostnames or IP addresses pointing to
1298hash-references. The key for these inner hash references is the 1321hash-references. The key for these inner hash references is the
1299server path for which this cookie is meant, and the values are again 1322server path for which this cookie is meant, and the values are again
1300hash-references. Each key of those hash-references is a cookie name, and 1323hash-references. Each key of those hash-references is a cookie name, and
1301the value, you guessed it, is another hash-reference, this time with the 1324the value, you guessed it, is another hash-reference, this time with the
1302key-value pairs from the cookie, except for C<expires> and C<max-age>, 1325key-value pairs from the cookie, except for C<expires> and C<max-age>,
1306 1329
1307Here is an example of a cookie jar with a single cookie, so you have a 1330Here is an example of a cookie jar with a single cookie, so you have a
1308chance of understanding the above paragraph: 1331chance of understanding the above paragraph:
1309 1332
1310 { 1333 {
1311 version => 1, 1334 version => 2,
1312 "10.0.0.1" => { 1335 "10.0.0.1" => {
1313 "/" => { 1336 "/" => {
1314 "mythweb_id" => { 1337 "mythweb_id" => {
1315 _expires => 1293917923, 1338 _expires => 1293917923,
1316 value => "ooRung9dThee3ooyXooM1Ohm", 1339 value => "ooRung9dThee3ooyXooM1Ohm",
1461 or die "$file: $!"; 1484 or die "$file: $!";
1462 1485
1463 my %hdr; 1486 my %hdr;
1464 my $ofs = 0; 1487 my $ofs = 0;
1465 1488
1466 warn stat $fh;
1467 warn -s _;
1468 if (stat $fh and -s _) { 1489 if (stat $fh and -s _) {
1469 $ofs = -s _; 1490 $ofs = -s _;
1470 warn "-s is ", $ofs; 1491 warn "-s is ", $ofs;
1471 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9]; 1492 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9];
1472 $hdr{"range"} = "bytes=$ofs-"; 1493 $hdr{"range"} = "bytes=$ofs-";
1500 my (undef, $hdr) = @_; 1521 my (undef, $hdr) = @_;
1501 1522
1502 my $status = $hdr->{Status}; 1523 my $status = $hdr->{Status};
1503 1524
1504 if (my $time = AnyEvent::HTTP::parse_date $hdr->{"last-modified"}) { 1525 if (my $time = AnyEvent::HTTP::parse_date $hdr->{"last-modified"}) {
1505 utime $fh, $time, $time; 1526 utime $time, $time, $fh;
1506 } 1527 }
1507 1528
1508 if ($status == 200 || $status == 206 || $status == 416) { 1529 if ($status == 200 || $status == 206 || $status == 416) {
1509 # download ok || resume ok || file already fully downloaded 1530 # download ok || resume ok || file already fully downloaded
1510 $cb->(1, $hdr); 1531 $cb->(1, $hdr);

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines