/cvs/AnyEvent-HTTP/HTTP.pm

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.116 by root, Fri May 17 07:19:23 2013 UTC vs.
Revision 1.119 by root, Sun Jun 8 23:33:28 2014 UTC

46use AnyEvent::Util (); 46use AnyEvent::Util ();
47use AnyEvent::Handle (); 47use AnyEvent::Handle ();
48 48
49use base Exporter::; 49use base Exporter::;
50 50
51our $VERSION = '2.15'; 51our $VERSION = 2.2;
52 52
53our @EXPORT = qw(http_get http_post http_head http_request); 53our @EXPORT = qw(http_get http_post http_head http_request);
54 54
55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)"; 55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)";
56our $MAX_RECURSE = 10; 56our $MAX_RECURSE = 10;
89C<http_request> returns a "cancellation guard" - you have to keep the 89C<http_request> returns a "cancellation guard" - you have to keep the
90object at least alive until the callback get called. If the object gets 90object at least alive until the callback get called. If the object gets
91destroyed before the callback is called, the request will be cancelled. 91destroyed before the callback is called, the request will be cancelled.
92 92
93The callback will be called with the response body data as first argument 93The callback will be called with the response body data as first argument
94(or C<undef> if an error occured), and a hash-ref with response headers 94(or C<undef> if an error occurred), and a hash-ref with response headers
95(and trailers) as second argument. 95(and trailers) as second argument.
96 96
97All the headers in that hash are lowercased. In addition to the response 97All the headers in that hash are lowercased. In addition to the response
98headers, the "pseudo-headers" (uppercase to avoid clashing with possible 98headers, the "pseudo-headers" (uppercase to avoid clashing with possible
99response headers) C<HTTPVersion>, C<Status> and C<Reason> contain the 99response headers) C<HTTPVersion>, C<Status> and C<Reason> contain the
157=item recurse => $count (default: $MAX_RECURSE) 157=item recurse => $count (default: $MAX_RECURSE)
158 158
159Whether to recurse requests or not, e.g. on redirects, authentication and 159Whether to recurse requests or not, e.g. on redirects, authentication and
160other retries and so on, and how often to do so. 160other retries and so on, and how often to do so.
161 161
162Only redirects to http and https URLs are supported. While most common
163redirection forms are handled entirely within this module, some require
164the use of the optional L<URI> module. If it is required but missing, then
165the request will fail with an error.
166
162=item headers => hashref 167=item headers => hashref
163 168
164The request headers to use. Currently, C<http_request> may provide its own 169The request headers to use. Currently, C<http_request> may provide its own
165C<Host:>, C<Content-Length:>, C<Connection:> and C<Cookie:> headers and 170C<Host:>, C<Content-Length:>, C<Connection:> and C<Cookie:> headers and
166will provide defaults at least for C<TE:>, C<Referer:> and C<User-Agent:> 171will provide defaults at least for C<TE:>, C<Referer:> and C<User-Agent:>
242context) - only connections using the same unique ID will be reused. 247context) - only connections using the same unique ID will be reused.
243 248
244=item on_prepare => $callback->($fh) 249=item on_prepare => $callback->($fh)
245 250
246In rare cases you need to "tune" the socket before it is used to 251In rare cases you need to "tune" the socket before it is used to
247connect (for exmaple, to bind it on a given IP address). This parameter 252connect (for example, to bind it on a given IP address). This parameter
248overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect> 253overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect>
249and behaves exactly the same way (e.g. it has to provide a 254and behaves exactly the same way (e.g. it has to provide a
250timeout). See the description for the C<$prepare_cb> argument of 255timeout). See the description for the C<$prepare_cb> argument of
251C<AnyEvent::Socket::tcp_connect> for details. 256C<AnyEvent::Socket::tcp_connect> for details.
252 257
821 my $was_persistent; # true if this is actually a recycled connection 826 my $was_persistent; # true if this is actually a recycled connection
822 827
823 # the key to use in the keepalive cache 828 # the key to use in the keepalive cache
824 my $ka_key = "$uscheme\x00$uhost\x00$uport\x00$arg{sessionid}"; 829 my $ka_key = "$uscheme\x00$uhost\x00$uport\x00$arg{sessionid}";
825 830
826 $hdr{connection} = ($persistent ? $keepalive ? "keep-alive " : "" : "close ") . "Te"; #1.1 831 $hdr{connection} = ($persistent ? $keepalive ? "keep-alive, " : "" : "close, ") . "Te"; #1.1
827 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1 832 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1
828 833
829 my %state = (connect_guard => 1); 834 my %state = (connect_guard => 1);
830 835
831 my $ae_error = 595; # connecting 836 my $ae_error = 595; # connecting
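
The change to the C<Connection:> header in this hunk adds the comma that separates the connection options. A minimal sketch of the revised expression, pulled out into a standalone function; the name connection_header is hypothetical and is not part of AnyEvent::HTTP:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: builds the Connection header
# value the way the revision 1.119 expression does. The HTTP Connection
# header is a comma-separated list, so "keep-alive, Te" is well-formed where
# the old "keep-alive Te" was not.
sub connection_header {
   my ($persistent, $keepalive) = @_;

   ($persistent ? $keepalive ? "keep-alive, " : "" : "close, ") . "Te"
}

print connection_header (1, 1), "\n";
print connection_header (1, 0), "\n";
print connection_header (0, 0), "\n";
```
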
844 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr) 849 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
845 . "\015\012" 850 . "\015\012"
846 . (delete $arg{body}) 851 . (delete $arg{body})
847 ); 852 );
848 853
849 # return if error occured during push_write() 854 # return if error occurred during push_write()
850 return unless %state; 855 return unless %state;
851 856
852 # reduce memory usage, save a kitten, also re-use it for the response headers. 857 # reduce memory usage, save a kitten, also re-use it for the response headers.
853 %hdr = (); 858 %hdr = ();
854 859
881 886
882 %hdr = (%$hdr, @pseudo); 887 %hdr = (%$hdr, @pseudo);
883 } 888 }
884 889
885 # redirect handling 890 # redirect handling
886 # microsoft and other shitheads don't give a shit for following standards, 891 # relative uri handling forced by microsoft and other shitheads.
887 # try to support some common forms of broken Location headers. 892 # we give our best and fall back to URI if available.
888 if ($hdr{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) { 893 if (exists $hdr{location}) {
894 my $loc = $hdr{location};
895
896 if ($loc =~ m%^//%) { # //
897 $loc = "$rscheme:$loc";
898
899 } elsif ($loc eq "") {
900 $loc = $url;
901
902 } elsif ($loc !~ /^(?: $ | [^:\/?\#]+ : )/x) { # anything "simple"
889 $hdr{location} =~ s/^\.\/+//; 903 $loc =~ s/^\.\/+//;
890 904
905 if ($loc !~ m%^[.?#]%) {
891 my $url = "$rscheme://$uhost:$uport"; 906 my $prefix = "$rscheme://$uhost:$uport";
892 907
893 unless ($hdr{location} =~ s/^\///) { 908 unless ($loc =~ s/^\///) {
894 $url .= $upath; 909 $prefix .= $upath;
895 $url =~ s/\/[^\/]*$//; 910 $prefix =~ s/\/[^\/]*$//;
911 }
912
913 $loc = "$prefix/$loc";
914
915 } elsif (eval { require URI }) { # uri
916 $loc = URI->new_abs ($loc, $url)->as_string;
917
918 } else {
919 return _error %state, $cb, { @pseudo, Status => 599, Reason => "Cannot parse Location (URI module missing)" };
920 #$hdr{Status} = 599;
921 #$hdr{Reason} = "Unparsable Redirect (URI module missing)";
922 #$recurse = 0;
923 }
896 } 924 }
897 925
898 $hdr{location} = "$url/$hdr{location}"; 926 $hdr{location} = $loc;
899 } 927 }
900 928
901 my $redirect; 929 my $redirect;
902 930
903 if ($recurse) { 931 if ($recurse) {
905 933
906 # industry standard is to redirect POST as GET for 934 # industry standard is to redirect POST as GET for
907 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1. 935 # 301, 302 and 303, in contrast to HTTP/1.0 and 1.1.
908 # also, the UA should ask the user for 301 and 307 and POST, 936 # also, the UA should ask the user for 301 and 307 and POST,
909 # industry standard seems to be to simply follow. 937 # industry standard seems to be to simply follow.
910 # we go with the industry standard. 938 # we go with the industry standard. 308 is defined
939 # by rfc7238
911 if ($status == 301 or $status == 302 or $status == 303) { 940 if ($status == 301 or $status == 302 or $status == 303) {
912 # HTTP/1.1 is unclear on how to mutate the method 941 # HTTP/1.1 is unclear on how to mutate the method
913 $method = "GET" unless $method eq "HEAD"; 942 $method = "GET" unless $method eq "HEAD";
914 $redirect = 1; 943 $redirect = 1;
915 } elsif ($status == 307) { 944 } elsif ($status == 307 or $status == 308) {
916 $redirect = 1; 945 $redirect = 1;
917 } 946 }
918 } 947 }
919 948
920 my $finish = sub { # ($data, $err_status, $err_reason[, $persistent]) 949 my $finish = sub { # ($data, $err_status, $err_reason[, $persistent])
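
The redirect changes in the two hunks above can be sketched as standalone functions. This is an illustration under assumptions, not the module's own code: resolve_location and redirect_method are hypothetical names, and their parameters stand in for the internal variables $url, $rscheme, $uhost, $uport, $upath, $status and $method used in the diff.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve a Location header value against the request
# URL, following the branches added in revision 1.119.
# Returns the absolute URL, or nothing if the URI module is needed but missing.
sub resolve_location {
   my ($loc, $url, $rscheme, $uhost, $uport, $upath) = @_;

   if ($loc =~ m%^//%) {                       # scheme-relative: //host/path
      return "$rscheme:$loc";

   } elsif ($loc eq "") {                      # empty: same document
      return $url;

   } elsif ($loc !~ /^(?: $ | [^:\/?\#]+ : )/x) { # no scheme: relative reference
      $loc =~ s/^\.\/+//;                      # strip leading "./"

      if ($loc !~ m%^[.?#]%) {                 # the "simple" forms
         my $prefix = "$rscheme://$uhost:$uport";

         unless ($loc =~ s/^\///) {            # not host-absolute:
            $prefix .= $upath;                 # resolve against the
            $prefix =~ s/\/[^\/]*$//;          # directory of the request path
         }

         return "$prefix/$loc";

      } elsif (eval { require URI; 1 }) {      # "../x", "?q", "#f": use URI
         return URI->new_abs ($loc, $url)->as_string;

      } else {
         return;                               # the module reports a 599 error here
      }
   }

   $loc                                        # already absolute
}

# Hypothetical helper: the method to use when following a redirect,
# or nothing if the status is not followed.
sub redirect_method {
   my ($status, $method) = @_;

   if ($status == 301 or $status == 302 or $status == 303) {
      return $method eq "HEAD" ? "HEAD" : "GET";
   } elsif ($status == 307 or $status == 308) {
      return $method;                          # method is preserved
   }

   return;
}
```

For example, with a request URL of http://example.com:80/dir/page1, a Location of "page2" resolves against the directory of the request path, while "/top" resolves against the host root. Only references starting with ".", "?" or "#" (after the leading "./" is stripped) reach the URI fallback.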
1244C<$session_end> is given and true, then additionally remove all session 1273C<$session_end> is given and true, then additionally remove all session
1245cookies. 1274cookies.
1246 1275
1247You should call this function (with a true C<$session_end>) before you 1276You should call this function (with a true C<$session_end>) before you
1248save cookies to disk, and you should call this function after loading them 1277save cookies to disk, and you should call this function after loading them
1249again. If you have a long-running program you can additonally call this 1278again. If you have a long-running program you can additionally call this
1250function from time to time. 1279function from time to time.
1251 1280
1252A cookie jar is initially an empty hash-reference that is managed by this 1281A cookie jar is initially an empty hash-reference that is managed by this
1253module. It's format is subject to change, but currently it is like this: 1282module. It's format is subject to change, but currently it is like this:
1254 1283
1311 1340
1312The default value for this is C<4>, and it is highly advisable to not 1341The default value for this is C<4>, and it is highly advisable to not
1313increase it much. 1342increase it much.
1314 1343
1315For comparison: the RFC's recommend 4 non-persistent or 2 persistent 1344For comparison: the RFC's recommend 4 non-persistent or 2 persistent
1316connections, older browsers used 2, newers (such as firefox 3) typically 1345connections, older browsers used 2, newer ones (such as firefox 3)
1317use 6, and Opera uses 8 because like, they have the fastest browser and 1346typically use 6, and Opera uses 8 because like, they have the fastest
1318give a shit for everybody else on the planet. 1347browser and give a shit for everybody else on the planet.
1319 1348
1320=item $AnyEvent::HTTP::PERSISTENT_TIMEOUT 1349=item $AnyEvent::HTTP::PERSISTENT_TIMEOUT
1321 1350
1322The time after which idle persistent connections get closed by 1351The time after which idle persistent connections get closed by
1323AnyEvent::HTTP (default: C<3>). 1352AnyEvent::HTTP (default: C<3>).
1392 set_proxy $ENV{http_proxy}; 1421 set_proxy $ENV{http_proxy};
1393}; 1422};
1394 1423
1395=head2 SHOWCASE 1424=head2 SHOWCASE
1396 1425
1397This section contaisn some more elaborate "real-world" examples or code 1426This section contains some more elaborate "real-world" examples or code
1398snippets. 1427snippets.
1399 1428
1400=head2 HTTP/1.1 FILE DOWNLOAD 1429=head2 HTTP/1.1 FILE DOWNLOAD
1401 1430
1402Downloading files with HTTP can be quite tricky, especially when something 1431Downloading files with HTTP can be quite tricky, especially when something
1406last modified time to check for file content changes, and works with many 1435last modified time to check for file content changes, and works with many
1407HTTP/1.0 servers as well, and usually falls back to a complete re-download 1436HTTP/1.0 servers as well, and usually falls back to a complete re-download
1408on older servers. 1437on older servers.
1409 1438
1410It calls the completion callback with either C<undef>, which means a 1439It calls the completion callback with either C<undef>, which means a
1411nonretryable error occured, C<0> when the download was partial and should 1440nonretryable error occurred, C<0> when the download was partial and should
1412be retried, and C<1> if it was successful. 1441be retried, and C<1> if it was successful.
1413 1442
1414 use AnyEvent::HTTP; 1443 use AnyEvent::HTTP;
1415 1444
1416 sub download($$$) { 1445 sub download($$$) {
