… | |
… | |
50 | object at least alive until the callback gets called. If the object |
50 | object at least alive until the callback gets called. If the object |
51 | gets destroyed before the callback is called, the request will be |
51 | gets destroyed before the callback is called, the request will be |
52 | cancelled. |
52 | cancelled. |
53 | |
53 | |
54 | The callback will be called with the response body data as first |
54 | The callback will be called with the response body data as first |
55 | argument (or "undef" if an error occured), and a hash-ref with |
55 | argument (or "undef" if an error occurred), and a hash-ref with |
56 | response headers (and trailers) as second argument. |
56 | response headers (and trailers) as second argument. |
57 | |
57 | |
58 | All the headers in that hash are lowercased. In addition to the |
58 | All the headers in that hash are lowercased. In addition to the |
59 | response headers, the "pseudo-headers" (uppercase to avoid clashing |
59 | response headers, the "pseudo-headers" (uppercase to avoid clashing |
60 | with possible response headers) "HTTPVersion", "Status" and "Reason" |
60 | with possible response headers) "HTTPVersion", "Status" and "Reason" |
… | |
… | |
82 | If an internal error occurs, such as not being able to resolve a |
82 | If an internal error occurs, such as not being able to resolve a |
83 | hostname, then $data will be "undef", "$headers->{Status}" will be |
83 | hostname, then $data will be "undef", "$headers->{Status}" will be |
84 | 590-599 and the "Reason" pseudo-header will contain an error |
84 | 590-599 and the "Reason" pseudo-header will contain an error |
85 | message. Currently the following status codes are used: |
85 | message. Currently the following status codes are used: |
86 | |
86 | |
87 | 595 - errors during connection etsbalishment, proxy handshake. |
87 | 595 - errors during connection establishment, proxy handshake. |
88 | 596 - errors during TLS negotiation, request sending and header |
88 | 596 - errors during TLS negotiation, request sending and header |
89 | processing. |
89 | processing. |
90 | 597 - errors during body receiving or processing. |
90 | 597 - errors during body receiving or processing. |
91 | 598 - user aborted request via "on_header" or "on_body". |
91 | 598 - user aborted request via "on_header" or "on_body". |
92 | 599 - other, usually nonretryable, errors (garbled URL etc.). |
92 | 599 - other, usually nonretryable, errors (garbled URL etc.). |
… | |
… | |
106 | Additional parameters are key-value pairs, and are fully optional. |
106 | Additional parameters are key-value pairs, and are fully optional. |
107 | They include: |
107 | They include: |
108 | |
108 | |
109 | recurse => $count (default: $MAX_RECURSE) |
109 | recurse => $count (default: $MAX_RECURSE) |
110 | Whether to recurse requests or not, e.g. on redirects, |
110 | Whether to recurse requests or not, e.g. on redirects, |
111 | authentication retries and so on, and how often to do so. |
111 | authentication and other retries and so on, and how often to do |
112 | so. |
113 | |
114 | Only redirects to http and https URLs are supported. While most |
115 | common redirection forms are handled entirely within this |
116 | module, some require the use of the optional URI module. If it |
117 | is required but missing, then the request will fail with an |
118 | error. |
112 | |
119 | |
113 | headers => hashref |
120 | headers => hashref |
114 | The request headers to use. Currently, "http_request" may |
121 | The request headers to use. Currently, "http_request" may |
115 | provide its own "Host:", "Content-Length:", "Connection:" and |
122 | provide its own "Host:", "Content-Length:", "Connection:" and |
116 | "Cookie:" headers and will provide defaults at least for "TE:", |
123 | "Cookie:" headers and will provide defaults at least for "TE:", |
… | |
… | |
121 | You really should provide your own "User-Agent:" header value |
128 | You really should provide your own "User-Agent:" header value |
122 | that is appropriate for your program - I wouldn't be surprised |
129 | that is appropriate for your program - I wouldn't be surprised |
123 | if the default AnyEvent string gets blocked by webservers sooner |
130 | if the default AnyEvent string gets blocked by webservers sooner |
124 | or later. |
131 | or later. |
125 | |
132 | |
133 | Also, make sure that your header names and values do not |
134 | contain any embedded newlines. |
135 | |
126 | timeout => $seconds |
136 | timeout => $seconds |
127 | The time-out to use for various stages - each connect attempt |
137 | The time-out to use for various stages - each connect attempt |
128 | will reset the timeout, as will read or write activity, i.e. |
138 | will reset the timeout, as will read or write activity, i.e. |
129 | this is not an overall timeout. |
139 | this is not an overall timeout. |
130 | |
140 | |
131 | Default timeout is 5 minutes. |
141 | Default timeout is 5 minutes. |
132 | |
142 | |
133 | proxy => [$host, $port[, $scheme]] or undef |
143 | proxy => [$host, $port[, $scheme]] or undef |
134 | Use the given http proxy for all requests. If not specified, |
144 | Use the given http proxy for all requests, or no proxy if |
135 | then the default proxy (as specified by $ENV{http_proxy}) is |
136 | used. |
145 | "undef" is used. |
137 | |
146 | |
138 | $scheme must be either missing or must be "http" for HTTP. |
147 | $scheme must be either missing or must be "http" for HTTP. |
148 | |
149 | If not specified, then the default proxy is used (see |
150 | "AnyEvent::HTTP::set_proxy"). |
151 | |
152 | Currently, if your proxy requires authorization, you have to |
153 | specify an appropriate "Proxy-Authorization" header in every |
154 | request. |
139 | |
155 | |
140 | body => $string |
156 | body => $string |
141 | The request body, usually empty. Will be sent as-is (future |
157 | The request body, usually empty. Will be sent as-is (future |
142 | versions of this module might offer more options). |
158 | versions of this module might offer more options). |
143 | |
159 | |
… | |
… | |
183 | object storing your state data, or the TLS context) - only |
199 | object storing your state data, or the TLS context) - only |
184 | connections using the same unique ID will be reused. |
200 | connections using the same unique ID will be reused. |
185 | |
201 | |
186 | on_prepare => $callback->($fh) |
202 | on_prepare => $callback->($fh) |
187 | In rare cases you need to "tune" the socket before it is used to |
203 | In rare cases you need to "tune" the socket before it is used to |
188 | connect (for exmaple, to bind it on a given IP address). This |
204 | connect (for example, to bind it on a given IP address). This |
189 | parameter overrides the prepare callback passed to |
205 | parameter overrides the prepare callback passed to |
190 | "AnyEvent::Socket::tcp_connect" and behaves exactly the same way |
206 | "AnyEvent::Socket::tcp_connect" and behaves exactly the same way |
191 | (e.g. it has to provide a timeout). See the description for the |
207 | (e.g. it has to provide a timeout). See the description for the |
192 | $prepare_cb argument of "AnyEvent::Socket::tcp_connect" for |
208 | $prepare_cb argument of "AnyEvent::Socket::tcp_connect" for |
193 | details. |
209 | details. |
… | |
… | |
331 | |
347 | |
332 | Example: do an HTTP HEAD request on https://www.google.com/, use a |
348 | Example: do an HTTP HEAD request on https://www.google.com/, use a |
333 | timeout of 30 seconds. |
349 | timeout of 30 seconds. |
334 | |
350 | |
335 | http_request |
351 | http_request |
336 | GET => "https://www.google.com", |
352 | HEAD => "https://www.google.com", |
337 | headers => { "user-agent" => "MySearchClient 1.0" }, |
353 | headers => { "user-agent" => "MySearchClient 1.0" }, |
338 | timeout => 30, |
354 | timeout => 30, |
339 | sub { |
355 | sub { |
340 | my ($body, $hdr) = @_; |
356 | my ($body, $hdr) = @_; |
341 | use Data::Dumper; |
357 | use Data::Dumper; |
… | |
… | |
366 | Sets the default proxy server to use. The proxy-url must begin with |
382 | Sets the default proxy server to use. The proxy-url must begin with |
367 | a string of the form "http://host:port", croaks otherwise. |
383 | a string of the form "http://host:port", croaks otherwise. |
368 | |
384 | |
369 | To clear an already-set proxy, use "undef". |
385 | To clear an already-set proxy, use "undef". |
370 | |
386 | |
387 | When AnyEvent::HTTP is loaded for the first time it will query the |
388 | default proxy from the operating system, currently by looking at |
389 | "$ENV{http_proxy}". |
390 | |
371 | AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end] |
391 | AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end] |
372 | Remove all cookies from the cookie jar that have expired. If |
392 | Remove all cookies from the cookie jar that have expired. If |
373 | $session_end is given and true, then additionally remove all session |
393 | $session_end is given and true, then additionally remove all session |
374 | cookies. |
394 | cookies. |
375 | |
395 | |
376 | You should call this function (with a true $session_end) before you |
396 | You should call this function (with a true $session_end) before you |
377 | save cookies to disk, and you should call this function after |
397 | save cookies to disk, and you should call this function after |
378 | loading them again. If you have a long-running program you can |
398 | loading them again. If you have a long-running program you can |
379 | additonally call this function from time to time. |
399 | additionally call this function from time to time. |
380 | |
400 | |
381 | A cookie jar is initially an empty hash-reference that is managed by |
401 | A cookie jar is initially an empty hash-reference that is managed by |
382 | this module. It's format is subject to change, but currently it is |
402 | this module. Its format is subject to change, but currently it is as |
383 | like this: |
403 | follows: |
384 | |
404 | |
385 | The key "version" has to contain 1, otherwise the hash gets emptied. |
405 | The key "version" has to contain 1, otherwise the hash gets emptied. |
386 | All other keys are hostnames or IP addresses pointing to |
406 | All other keys are hostnames or IP addresses pointing to |
387 | hash-references. The key for these inner hash references is the |
407 | hash-references. The key for these inner hash references is the |
388 | server path for which this cookie is meant, and the values are again |
408 | server path for which this cookie is meant, and the values are again |
389 | hash-references. The keys of those hash-references is the cookie |
409 | hash-references. Each key of those hash-references is a cookie name, |
390 | name, and the value, you guessed it, is another hash-reference, this |
410 | and the value, you guessed it, is another hash-reference, this time |
391 | time with the key-value pairs from the cookie, except for "expires" |
411 | with the key-value pairs from the cookie, except for "expires" and |
392 | and "max-age", which have been replaced by a "_expires" key that |
412 | "max-age", which have been replaced by a "_expires" key that |
393 | contains the cookie expiry timestamp. |
413 | contains the cookie expiry timestamp. Session cookies are indicated |
414 | by not having an "_expires" key. |
394 | |
415 | |
395 | Here is an example of a cookie jar with a single cookie, so you have |
416 | Here is an example of a cookie jar with a single cookie, so you have |
396 | a chance of understanding the above paragraph: |
417 | a chance of understanding the above paragraph: |
397 | |
418 | |
398 | { |
419 | { |
… | |
… | |
419 | |
440 | |
420 | $AnyEvent::HTTP::MAX_RECURSE |
441 | $AnyEvent::HTTP::MAX_RECURSE |
421 | The default value for the "recurse" request parameter (default: 10). |
442 | The default value for the "recurse" request parameter (default: 10). |
422 | |
443 | |
423 | $AnyEvent::HTTP::TIMEOUT |
444 | $AnyEvent::HTTP::TIMEOUT |
424 | The default timeout for conenction operations (default: 300). |
445 | The default timeout for connection operations (default: 300). |
425 | |
446 | |
426 | $AnyEvent::HTTP::USERAGENT |
447 | $AnyEvent::HTTP::USERAGENT |
427 | The default value for the "User-Agent" header (the default is |
448 | The default value for the "User-Agent" header (the default is |
428 | "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; |
449 | "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; |
429 | +http://software.schmorp.de/pkg/AnyEvent)"). |
450 | +http://software.schmorp.de/pkg/AnyEvent)"). |
430 | |
451 | |
431 | $AnyEvent::HTTP::MAX_PER_HOST |
452 | $AnyEvent::HTTP::MAX_PER_HOST |
432 | The maximum number of concurrent connections to the same host |
453 | The maximum number of concurrent connections to the same host |
433 | (identified by the hostname). If the limit is exceeded, then the |
454 | (identified by the hostname). If the limit is exceeded, then |
434 | additional requests are queued until previous connections are |
455 | additional requests are queued until previous connections are |
435 | closed. Both persistent and non-persistent connections are counted |
456 | closed. Both persistent and non-persistent connections are counted |
436 | in this limit. |
457 | in this limit. |
437 | |
458 | |
438 | The default value for this is 4, and it is highly advisable to not |
459 | The default value for this is 4, and it is highly advisable to not |
439 | increase it much. |
460 | increase it much. |
440 | |
461 | |
441 | For comparison: the RFCs recommend 4 non-persistent or 2 persistent |
462 | For comparison: the RFCs recommend 4 non-persistent or 2 persistent |
442 | connections, older browsers used 2, newers (such as firefox 3) |
463 | connections, older browsers used 2, newer ones (such as firefox 3) |
443 | typically use 6, and Opera uses 8 because like, they have the |
464 | typically use 6, and Opera uses 8 because like, they have the |
444 | fastest browser and give a shit for everybody else on the planet. |
465 | fastest browser and give a shit for everybody else on the planet. |
445 | |
466 | |
446 | $AnyEvent::HTTP::PERSISTENT_TIMEOUT |
467 | $AnyEvent::HTTP::PERSISTENT_TIMEOUT |
447 | The time after which idle persistent conenctions get closed by |
468 | The time after which idle persistent connections get closed by |
448 | AnyEvent::HTTP (default: 3). |
469 | AnyEvent::HTTP (default: 3). |
449 | |
470 | |
450 | $AnyEvent::HTTP::ACTIVE |
471 | $AnyEvent::HTTP::ACTIVE |
451 | The number of active connections. This is not the number of |
472 | The number of active connections. This is not the number of |
452 | currently running requests, but the number of currently open and |
473 | currently running requests, but the number of currently open and |
453 | non-idle TCP connections. This number can be useful for |
474 | non-idle TCP connections. This number can be useful for |
454 | load-leveling. |
475 | load-leveling. |
455 | |
476 | |
456 | SHOWCASE |
477 | SHOWCASE |
457 | This section contaisn some more elaborate "real-world" examples or code |
478 | This section contains some more elaborate "real-world" examples or code |
458 | snippets. |
479 | snippets. |
459 | |
480 | |
460 | HTTP/1.1 FILE DOWNLOAD |
481 | HTTP/1.1 FILE DOWNLOAD |
461 | Downloading files with HTTP cna be quite tricky, especially when |
482 | Downloading files with HTTP can be quite tricky, especially when |
462 | something goes wrong and you want tor esume. |
483 | something goes wrong and you want to resume. |
463 | |
484 | |
464 | Here is a function that initiates and resumes a download. It uses the |
485 | Here is a function that initiates and resumes a download. It uses the |
465 | last modified time to check for file content changes, and works with |
486 | last modified time to check for file content changes, and works with |
466 | many HTTP/1.0 servers as well, and usually falls back to a complete |
487 | many HTTP/1.0 servers as well, and usually falls back to a complete |
467 | re-download on older servers. |
488 | re-download on older servers. |
468 | |
489 | |
469 | It calls the completion callback with either "undef", which means a |
490 | It calls the completion callback with either "undef", which means a |
470 | nonretryable error occured, 0 when the download was partial and should |
491 | nonretryable error occurred, 0 when the download was partial and should |
471 | be retried, and 1 if it was successful. |
492 | be retried, and 1 if it was successful. |
472 | |
493 | |
473 | use AnyEvent::HTTP; |
494 | use AnyEvent::HTTP; |
474 | |
495 | |
475 | sub download($$$) { |
496 | sub download($$$) { |
… | |
… | |
483 | |
504 | |
484 | warn stat $fh; |
505 | warn stat $fh; |
485 | warn -s _; |
506 | warn -s _; |
486 | if (stat $fh and -s _) { |
507 | if (stat $fh and -s _) { |
487 | $ofs = -s _; |
508 | $ofs = -s _; |
488 | warn "-s is ", $ofs;#d# |
509 | warn "-s is ", $ofs; |
489 | $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9]; |
510 | $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9]; |
490 | $hdr{"range"} = "bytes=$ofs-"; |
511 | $hdr{"range"} = "bytes=$ofs-"; |
491 | } |
512 | } |
492 | |
513 | |
493 | http_get $url, |
514 | http_get $url, |
… | |
… | |
613 | |
634 | |
614 | AUTHOR |
635 | AUTHOR |
615 | Marc Lehmann <schmorp@schmorp.de> |
636 | Marc Lehmann <schmorp@schmorp.de> |
616 | http://home.schmorp.de/ |
637 | http://home.schmorp.de/ |
617 | |
638 | |
618 | With many thanks to Дмитрий Шалашов, who provided |
639 | With many thanks to Дмитрий Шалашов, who provided countless testcases |
619 | countless testcases and bugreports. |
640 | and bugreports. |
620 | |
641 | |