Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.104 by root, Thu Feb 24 15:09:03 2011 UTC vs.
Revision 1.118 by root, Mon Nov 18 01:01:02 2013 UTC

46use AnyEvent::Util (); 46use AnyEvent::Util ();
47use AnyEvent::Handle (); 47use AnyEvent::Handle ();
48 48
49use base Exporter::; 49use base Exporter::;
50 50
51our $VERSION = '2.1'; 51our $VERSION = '2.15';
52 52
53our @EXPORT = qw(http_get http_post http_head http_request); 53our @EXPORT = qw(http_get http_post http_head http_request);
54 54
55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)"; 55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)";
56our $MAX_RECURSE = 10; 56our $MAX_RECURSE = 10;
89C<http_request> returns a "cancellation guard" - you have to keep the 89C<http_request> returns a "cancellation guard" - you have to keep the
90object at least alive until the callback get called. If the object gets 90object at least alive until the callback get called. If the object gets
91destroyed before the callback is called, the request will be cancelled. 91destroyed before the callback is called, the request will be cancelled.
92 92
93The callback will be called with the response body data as first argument 93The callback will be called with the response body data as first argument
94(or C<undef> if an error occured), and a hash-ref with response headers 94(or C<undef> if an error occurred), and a hash-ref with response headers
95(and trailers) as second argument. 95(and trailers) as second argument.
96 96
97All the headers in that hash are lowercased. In addition to the response 97All the headers in that hash are lowercased. In addition to the response
98headers, the "pseudo-headers" (uppercase to avoid clashing with possible 98headers, the "pseudo-headers" (uppercase to avoid clashing with possible
99response headers) C<HTTPVersion>, C<Status> and C<Reason> contain the 99response headers) C<HTTPVersion>, C<Status> and C<Reason> contain the
123C<590>-C<599> and the C<Reason> pseudo-header will contain an error 123C<590>-C<599> and the C<Reason> pseudo-header will contain an error
124message. Currently the following status codes are used: 124message. Currently the following status codes are used:
125 125
126=over 4 126=over 4
127 127
128=item 595 - errors during connection etsbalishment, proxy handshake. 128=item 595 - errors during connection establishment, proxy handshake.
129 129
130=item 596 - errors during TLS negotiation, request sending and header processing. 130=item 596 - errors during TLS negotiation, request sending and header processing.
131 131
132=item 597 - errors during body receiving or processing. 132=item 597 - errors during body receiving or processing.
133 133
154 154
155=over 4 155=over 4
156 156
157=item recurse => $count (default: $MAX_RECURSE) 157=item recurse => $count (default: $MAX_RECURSE)
158 158
159Whether to recurse requests or not, e.g. on redirects, authentication 159Whether to recurse requests or not, e.g. on redirects, authentication and
160retries and so on, and how often to do so. 160other retries and so on, and how often to do so.
161 161
162=item headers => hashref 162=item headers => hashref
163 163
164The request headers to use. Currently, C<http_request> may provide its own 164The request headers to use. Currently, C<http_request> may provide its own
165C<Host:>, C<Content-Length:>, C<Connection:> and C<Cookie:> headers and 165C<Host:>, C<Content-Length:>, C<Connection:> and C<Cookie:> headers and
242context) - only connections using the same unique ID will be reused. 242context) - only connections using the same unique ID will be reused.
243 243
244=item on_prepare => $callback->($fh) 244=item on_prepare => $callback->($fh)
245 245
246In rare cases you need to "tune" the socket before it is used to 246In rare cases you need to "tune" the socket before it is used to
247connect (for exmaple, to bind it on a given IP address). This parameter 247connect (for example, to bind it on a given IP address). This parameter
248overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect> 248overrides the prepare callback passed to C<AnyEvent::Socket::tcp_connect>
249and behaves exactly the same way (e.g. it has to provide a 249and behaves exactly the same way (e.g. it has to provide a
250timeout). See the description for the C<$prepare_cb> argument of 250timeout). See the description for the C<$prepare_cb> argument of
251C<AnyEvent::Socket::tcp_connect> for details. 251C<AnyEvent::Socket::tcp_connect> for details.
252 252
384 384
385Example: do a HTTP HEAD request on https://www.google.com/, use a 385Example: do a HTTP HEAD request on https://www.google.com/, use a
386timeout of 30 seconds. 386timeout of 30 seconds.
387 387
388 http_request 388 http_request
389 GET => "https://www.google.com", 389 HEAD => "https://www.google.com",
390 headers => { "user-agent" => "MySearchClient 1.0" }, 390 headers => { "user-agent" => "MySearchClient 1.0" },
391 timeout => 30, 391 timeout => 30,
392 sub { 392 sub {
393 my ($body, $hdr) = @_; 393 my ($body, $hdr) = @_;
394 use Data::Dumper; 394 use Data::Dumper;
689 689
690 $cb->(undef, $hdr); 690 $cb->(undef, $hdr);
691 () 691 ()
692} 692}
693 693
694our %IDEMPOTENT = (
695 DELETE => 1,
696 GET => 1,
697 HEAD => 1,
698 OPTIONS => 1,
699 PUT => 1,
700 TRACE => 1,
701
702 ACL => 1,
703 "BASELINE-CONTROL" => 1,
704 BIND => 1,
705 CHECKIN => 1,
706 CHECKOUT => 1,
707 COPY => 1,
708 LABEL => 1,
709 LINK => 1,
710 MERGE => 1,
711 MKACTIVITY => 1,
712 MKCALENDAR => 1,
713 MKCOL => 1,
714 MKREDIRECTREF => 1,
715 MKWORKSPACE => 1,
716 MOVE => 1,
717 ORDERPATCH => 1,
718 PROPFIND => 1,
719 PROPPATCH => 1,
720 REBIND => 1,
721 REPORT => 1,
722 SEARCH => 1,
723 UNBIND => 1,
724 UNCHECKOUT => 1,
725 UNLINK => 1,
726 UNLOCK => 1,
727 UPDATE => 1,
728 UPDATEREDIRECTREF => 1,
729 "VERSION-CONTROL" => 1,
730);
731
694sub http_request($$@) { 732sub http_request($$@) {
695 my $cb = pop; 733 my $cb = pop;
696 my ($method, $url, %arg) = @_; 734 my ($method, $url, %arg) = @_;
697 735
698 my %hdr; 736 my %hdr;
773 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"}; 811 $hdr{"user-agent"} = $USERAGENT unless exists $hdr{"user-agent"};
774 812
775 $hdr{"content-length"} = length $arg{body} 813 $hdr{"content-length"} = length $arg{body}
776 if length $arg{body} || $method ne "GET"; 814 if length $arg{body} || $method ne "GET";
777 815
778 my $idempotent = $method =~ /^(?:GET|HEAD|PUT|DELETE|OPTIONS|TRACE)$/; 816 my $idempotent = $IDEMPOTENT{$method};
779 817
780 # default value for keepalive is true iff the request is for an idempotent method 818 # default value for keepalive is true iff the request is for an idempotent method
781 my $persistent = exists $arg{persistent} ? !!$arg{persistent} : $idempotent; 819 my $persistent = exists $arg{persistent} ? !!$arg{persistent} : $idempotent;
782 my $keepalive = exists $arg{keepalive} ? !!$arg{keepalive} : !$proxy; 820 my $keepalive = exists $arg{keepalive} ? !!$arg{keepalive} : !$proxy;
783 my $was_persistent; # true if this is actually a recycled connection 821 my $was_persistent; # true if this is actually a recycled connection
784 822
785 # the key to use in the keepalive cache 823 # the key to use in the keepalive cache
786 my $ka_key = "$uhost\x00$arg{sessionid}"; 824 my $ka_key = "$uscheme\x00$uhost\x00$uport\x00$arg{sessionid}";
787 825
788 $hdr{connection} = ($persistent ? $keepalive ? "keep-alive " : "" : "close ") . "Te"; #1.1 826 $hdr{connection} = ($persistent ? $keepalive ? "keep-alive, " : "" : "close, ") . "Te"; #1.1
789 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1 827 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1
790 828
791 my %state = (connect_guard => 1); 829 my %state = (connect_guard => 1);
792 830
793 my $ae_error = 595; # connecting 831 my $ae_error = 595; # connecting
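
Note on the hunk above: three things change here. The idempotency check becomes a hash lookup, the keepalive cache key gains scheme and port, and the Connection header tokens are separated by ", ". The following is a minimal sketch of those three ideas in isolation, not the module's code. The helper names ka_key and connection_header, the shortened method table, and the example host are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the %IDEMPOTENT table added in 1.118.
my %IDEMPOTENT = map { $_ => 1 } qw(DELETE GET HEAD OPTIONS PUT TRACE PROPFIND);

# Hypothetical helper: joins scheme, host, port and session id with NUL bytes,
# following the shape of the 1.118 cache key.
sub ka_key {
    my ($scheme, $host, $port, $sessionid) = @_;
    $sessionid = "" unless defined $sessionid;
    join "\x00", $scheme, $host, $port, $sessionid;
}

# Hypothetical helper: same nested ternary as the diff, with ", " separators
# so the value is a well-formed comma-separated token list.
sub connection_header {
    my ($persistent, $keepalive) = @_;
    ($persistent ? $keepalive ? "keep-alive, " : "" : "close, ") . "Te";
}

# A key built from the host alone cannot tell http and https apart;
# the extended key can.
my $http_key  = ka_key("http",  "example.com", 80);
my $https_key = ka_key("https", "example.com", 443);
print "keys differ\n" if $http_key ne $https_key;

print connection_header(1, 1), "\n";
print $IDEMPOTENT{POST} ? "POST is idempotent\n" : "POST is not idempotent\n";
```

With a host-only key, a cached plain-text connection and a TLS connection to the same host would share one cache slot. That appears to be what the extended key is meant to prevent.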
806 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr) 844 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
807 . "\015\012" 845 . "\015\012"
808 . (delete $arg{body}) 846 . (delete $arg{body})
809 ); 847 );
810 848
811 # return if error occured during push_write() 849 # return if error occurred during push_write()
812 return unless %state; 850 return unless %state;
813 851
814 # reduce memory usage, save a kitten, also re-use it for the response headers. 852 # reduce memory usage, save a kitten, also re-use it for the response headers.
815 %hdr = (); 853 %hdr = ();
816 854
962 my $body = ""; 1000 my $body = "";
963 my $on_body = $arg{on_body} || sub { $body .= shift; 1 }; 1001 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
964 1002
965 $state{read_chunk} = sub { 1003 $state{read_chunk} = sub {
966 $_[1] =~ /^([0-9a-fA-F]+)/ 1004 $_[1] =~ /^([0-9a-fA-F]+)/
967 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding"); 1005 or return $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
968 1006
969 my $len = hex $1; 1007 my $len = hex $1;
970 1008
971 if ($len) { 1009 if ($len) {
972 $cl += $len; 1010 $cl += $len;
1050 _destroy_state %state; 1088 _destroy_state %state;
1051 1089
1052 %state = (); 1090 %state = ();
1053 $state{recurse} = 1091 $state{recurse} =
1054 http_request ( 1092 http_request (
1055 $method => $url, 1093 $method => $url,
1056 %arg, 1094 %arg,
1095 recurse => $recurse - 1,
1057 keepalive => 0, 1096 keepalive => 0,
1058 sub { 1097 sub {
1059 %state = (); 1098 %state = ();
1060 &$cb 1099 &$cb
1061 } 1100 }
1138 if ($persistent && $KA_CACHE{$ka_key}) { 1177 if ($persistent && $KA_CACHE{$ka_key}) {
1139 $was_persistent = 1; 1178 $was_persistent = 1;
1140 1179
1141 $state{handle} = ka_fetch $ka_key; 1180 $state{handle} = ka_fetch $ka_key;
1142 $state{handle}->destroyed 1181 $state{handle}->destroyed
1143 and die "got a destructed habndle. pah\n";#d# 1182 and die "AnyEvent::HTTP: unexpectedly got a destructed handle (1), please report.";#d#
1144 $prepare_handle->(); 1183 $prepare_handle->();
1145 $state{handle}->destroyed 1184 $state{handle}->destroyed
1146 and die "got a destructed habndle. pa2\n";#d# 1185 and die "AnyEvent::HTTP: unexpectedly got a destructed handle (2), please report.";#d#
1147 $handle_actual_request->(); 1186 $handle_actual_request->();
1148 $state{handle}->destroyed
1149 and die "got a destructed habndle. pa3\n";#d#
1150 1187
1151 } else { 1188 } else {
1152 my $tcp_connect = $arg{tcp_connect} 1189 my $tcp_connect = $arg{tcp_connect}
1153 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect }; 1190 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
1154 1191
1195Sets the default proxy server to use. The proxy-url must begin with a 1232Sets the default proxy server to use. The proxy-url must begin with a
1196string of the form C<http://host:port>, croaks otherwise. 1233string of the form C<http://host:port>, croaks otherwise.
1197 1234
1198To clear an already-set proxy, use C<undef>. 1235To clear an already-set proxy, use C<undef>.
1199 1236
1200When AnyEvent::HTTP is laoded for the first time it will query the 1237When AnyEvent::HTTP is loaded for the first time it will query the
1201default proxy from the operating system, currently by looking at 1238default proxy from the operating system, currently by looking at
1202C<$ENV{http_proxy>}. 1239C<$ENV{http_proxy>}.
1203 1240
1204=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end] 1241=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
1205 1242
1207C<$session_end> is given and true, then additionally remove all session 1244C<$session_end> is given and true, then additionally remove all session
1208cookies. 1245cookies.
1209 1246
1210You should call this function (with a true C<$session_end>) before you 1247You should call this function (with a true C<$session_end>) before you
1211save cookies to disk, and you should call this function after loading them 1248save cookies to disk, and you should call this function after loading them
1212again. If you have a long-running program you can additonally call this 1249again. If you have a long-running program you can additionally call this
1213function from time to time. 1250function from time to time.
1214 1251
1215A cookie jar is initially an empty hash-reference that is managed by this 1252A cookie jar is initially an empty hash-reference that is managed by this
1216module. It's format is subject to change, but currently it is like this: 1253module. It's format is subject to change, but currently it is like this:
1217 1254
1218The key C<version> has to contain C<1>, otherwise the hash gets 1255The key C<version> has to contain C<1>, otherwise the hash gets
1219emptied. All other keys are hostnames or IP addresses pointing to 1256emptied. All other keys are hostnames or IP addresses pointing to
1220hash-references. The key for these inner hash references is the 1257hash-references. The key for these inner hash references is the
1221server path for which this cookie is meant, and the values are again 1258server path for which this cookie is meant, and the values are again
1222hash-references. The keys of those hash-references is the cookie name, and 1259hash-references. Each key of those hash-references is a cookie name, and
1223the value, you guessed it, is another hash-reference, this time with the 1260the value, you guessed it, is another hash-reference, this time with the
1224key-value pairs from the cookie, except for C<expires> and C<max-age>, 1261key-value pairs from the cookie, except for C<expires> and C<max-age>,
1225which have been replaced by a C<_expires> key that contains the cookie 1262which have been replaced by a C<_expires> key that contains the cookie
1226expiry timestamp. 1263expiry timestamp. Session cookies are indicated by not having an
1264C<_expires> key.
1227 1265
1228Here is an example of a cookie jar with a single cookie, so you have a 1266Here is an example of a cookie jar with a single cookie, so you have a
1229chance of understanding the above paragraph: 1267chance of understanding the above paragraph:
1230 1268
1231 { 1269 {
1255 1293
1256The default value for the C<recurse> request parameter (default: C<10>). 1294The default value for the C<recurse> request parameter (default: C<10>).
1257 1295
1258=item $AnyEvent::HTTP::TIMEOUT 1296=item $AnyEvent::HTTP::TIMEOUT
1259 1297
1260The default timeout for conenction operations (default: C<300>). 1298The default timeout for connection operations (default: C<300>).
1261 1299
1262=item $AnyEvent::HTTP::USERAGENT 1300=item $AnyEvent::HTTP::USERAGENT
1263 1301
1264The default value for the C<User-Agent> header (the default is 1302The default value for the C<User-Agent> header (the default is
1265C<Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>). 1303C<Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>).
1273 1311
1274The default value for this is C<4>, and it is highly advisable to not 1312The default value for this is C<4>, and it is highly advisable to not
1275increase it much. 1313increase it much.
1276 1314
1277For comparison: the RFC's recommend 4 non-persistent or 2 persistent 1315For comparison: the RFC's recommend 4 non-persistent or 2 persistent
1278connections, older browsers used 2, newers (such as firefox 3) typically 1316connections, older browsers used 2, newer ones (such as firefox 3)
1279use 6, and Opera uses 8 because like, they have the fastest browser and 1317typically use 6, and Opera uses 8 because like, they have the fastest
1280give a shit for everybody else on the planet. 1318browser and give a shit for everybody else on the planet.
1281 1319
1282=item $AnyEvent::HTTP::PERSISTENT_TIMEOUT 1320=item $AnyEvent::HTTP::PERSISTENT_TIMEOUT
1283 1321
1284The time after which idle persistent conenctions get closed by 1322The time after which idle persistent connections get closed by
1285AnyEvent::HTTP (default: C<3>). 1323AnyEvent::HTTP (default: C<3>).
1286 1324
1287=item $AnyEvent::HTTP::ACTIVE 1325=item $AnyEvent::HTTP::ACTIVE
1288 1326
1289The number of active connections. This is not the number of currently 1327The number of active connections. This is not the number of currently
1330 # other formats fail in the loop below 1368 # other formats fail in the loop below
1331 1369
1332 for (0..11) { 1370 for (0..11) {
1333 if ($m eq $month[$_]) { 1371 if ($m eq $month[$_]) {
1334 require Time::Local; 1372 require Time::Local;
1335 return Time::Local::timegm ($S, $M, $H, $d, $_, $y); 1373 return eval { Time::Local::timegm ($S, $M, $H, $d, $_, $y) };
1336 } 1374 }
1337 } 1375 }
1338 1376
1339 undef 1377 undef
1340} 1378}
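
Note on the hunk above: Time::Local::timegm croaks when a field is out of range, for example day 31 in February. Wrapping the call in eval { } turns that exception into an undef return, which matches the function's existing "undef on failure" convention. A minimal sketch, assuming a hypothetical wrapper named safe_timegm:

```perl
use strict;
use warnings;
use Time::Local ();

# Hypothetical wrapper mirroring the changed line: returns the epoch
# timestamp, or undef (in scalar context) when timegm rejects the fields.
sub safe_timegm {
    my ($S, $M, $H, $d, $m, $y) = @_;
    eval { Time::Local::timegm($S, $M, $H, $d, $m, $y) };
}

my $ok  = safe_timegm(0, 0, 0, 1, 0, 2000);   # 1 Jan 2000 00:00:00 UTC
my $bad = safe_timegm(0, 0, 0, 31, 1, 2013);  # 31 Feb 2013 does not exist

print "valid date: $ok\n";
print "invalid date: ", (defined $bad ? $bad : "undef"), "\n";
```

Note that eval overwrites $@, so a caller that depends on an earlier value of $@ would need to save it first.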
1354 set_proxy $ENV{http_proxy}; 1392 set_proxy $ENV{http_proxy};
1355}; 1393};
1356 1394
1357=head2 SHOWCASE 1395=head2 SHOWCASE
1358 1396
1359This section contaisn some more elaborate "real-world" examples or code 1397This section contains some more elaborate "real-world" examples or code
1360snippets. 1398snippets.
1361 1399
1362=head2 HTTP/1.1 FILE DOWNLOAD 1400=head2 HTTP/1.1 FILE DOWNLOAD
1363 1401
1364Downloading files with HTTP can be quite tricky, especially when something 1402Downloading files with HTTP can be quite tricky, especially when something
1368last modified time to check for file content changes, and works with many 1406last modified time to check for file content changes, and works with many
1369HTTP/1.0 servers as well, and usually falls back to a complete re-download 1407HTTP/1.0 servers as well, and usually falls back to a complete re-download
1370on older servers. 1408on older servers.
1371 1409
1372It calls the completion callback with either C<undef>, which means a 1410It calls the completion callback with either C<undef>, which means a
1373nonretryable error occured, C<0> when the download was partial and should 1411nonretryable error occurred, C<0> when the download was partial and should
1374be retried, and C<1> if it was successful. 1412be retried, and C<1> if it was successful.
1375 1413
1376 use AnyEvent::HTTP; 1414 use AnyEvent::HTTP;
1377 1415
1378 sub download($$$) { 1416 sub download($$$) {
1386 1424
1387 warn stat $fh; 1425 warn stat $fh;
1388 warn -s _; 1426 warn -s _;
1389 if (stat $fh and -s _) { 1427 if (stat $fh and -s _) {
1390 $ofs = -s _; 1428 $ofs = -s _;
1391 warn "-s is ", $ofs;#d# 1429 warn "-s is ", $ofs;
1392 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9]; 1430 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9];
1393 $hdr{"range"} = "bytes=$ofs-"; 1431 $hdr{"range"} = "bytes=$ofs-";
1394 } 1432 }
1395 1433
1396 http_get $url, 1434 http_get $url,
