Comparing AnyEvent-HTTP/README (file contents):
Revision 1.16 by root, Tue Jan 11 06:38:47 2011 UTC vs.
Revision 1.25 by root, Sun Jun 8 23:33:28 2014 UTC

12 This module is an AnyEvent user, you need to make sure that you use and 12 This module is an AnyEvent user, you need to make sure that you use and
13 run a supported event loop. 13 run a supported event loop.
14 14
15 This module implements a simple, stateless and non-blocking HTTP client. 15 This module implements a simple, stateless and non-blocking HTTP client.
16 It supports GET, POST and other request methods, cookies and more, all 16 It supports GET, POST and other request methods, cookies and more, all
17 on a very low level. It can follow redirects supports proxies and 17 on a very low level. It can follow redirects, supports proxies, and
18 automatically limits the number of connections to the values specified 18 automatically limits the number of connections to the values specified
19 in the RFC. 19 in the RFC.
20 20
21 It should generally be a "good client" that is enough for most HTTP 21 It should generally be a "good client" that is enough for most HTTP
22 tasks. Simple tasks should be simple, but complex tasks should still be 22 tasks. Simple tasks should be simple, but complex tasks should still be
50 object at least alive until the callback gets called. If the object 50 object at least alive until the callback gets called. If the object
51 gets destroyed before the callback is called, the request will be 51 gets destroyed before the callback is called, the request will be
52 cancelled. 52 cancelled.
53 53
54 The callback will be called with the response body data as first 54 The callback will be called with the response body data as first
55 argument (or "undef" if an error occured), and a hash-ref with 55 argument (or "undef" if an error occurred), and a hash-ref with
56 response headers (and trailers) as second argument. 56 response headers (and trailers) as second argument.
57 57
58 All the headers in that hash are lowercased. In addition to the 58 All the headers in that hash are lowercased. In addition to the
59 response headers, the "pseudo-headers" (uppercase to avoid clashing 59 response headers, the "pseudo-headers" (uppercase to avoid clashing
60 with possible response headers) "HTTPVersion", "Status" and "Reason" 60 with possible response headers) "HTTPVersion", "Status" and "Reason"
82 If an internal error occurs, such as not being able to resolve a 82 If an internal error occurs, such as not being able to resolve a
83 hostname, then $data will be "undef", "$headers->{Status}" will be 83 hostname, then $data will be "undef", "$headers->{Status}" will be
84 590-599 and the "Reason" pseudo-header will contain an error 84 590-599 and the "Reason" pseudo-header will contain an error
85 message. Currently the following status codes are used: 85 message. Currently the following status codes are used:
86 86
87 595 - errors during connection etsbalishment, proxy handshake. 87 595 - errors during connection establishment, proxy handshake.
88 596 - errors during TLS negotiation, request sending and header 88 596 - errors during TLS negotiation, request sending and header
89 processing. 89 processing.
90 597 - errors during body receiving or processing. 90 597 - errors during body receiving or processing.
91 598 - user aborted request via "on_header" or "on_body". 91 598 - user aborted request via "on_header" or "on_body".
92 599 - other, usually nonretryable, errors (garbled URL etc.). 92 599 - other, usually nonretryable, errors (garbled URL etc.).
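The status ranges listed above lend themselves to a small dispatch helper. What follows is only a minimal sketch and not part of AnyEvent::HTTP: the function name classify_status and the labels it returns are hypothetical, chosen purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::HTTP): map the "Status"
# pseudo-header to a coarse label, following the list above.
sub classify_status {
   my ($status) = @_;

   return "connect"  if $status == 595; # connection establishment, proxy handshake
   return "request"  if $status == 596; # TLS negotiation, request sending, header processing
   return "body"     if $status == 597; # body receiving or processing
   return "aborted"  if $status == 598; # user abort via on_header or on_body
   return "other"    if $status == 599; # usually nonretryable (garbled URL etc.)
   return "internal" if $status >= 590 && $status <= 599; # rest of the 590-599 range
   return "http";                       # anything else came from the server
}

print classify_status (595), "\n";
print classify_status (404), "\n";
```

Inside a request callback this would be called as classify_status ($hdr->{Status}). Whether a given class is worth retrying is left to the caller, since the text only marks 599 as "usually nonretryable".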
106 Additional parameters are key-value pairs, and are fully optional. 106 Additional parameters are key-value pairs, and are fully optional.
107 They include: 107 They include:
108 108
109 recurse => $count (default: $MAX_RECURSE) 109 recurse => $count (default: $MAX_RECURSE)
110 Whether to recurse requests or not, e.g. on redirects, 110 Whether to recurse requests or not, e.g. on redirects,
111 authentication retries and so on, and how often to do so. 111 authentication and other retries and so on, and how often to do
112 so.
113
114 Only redirects to http and https URLs are supported. While most
115 common redirection forms are handled entirely within this
116 module, some require the use of the optional URI module. If it
117 is required but missing, then the request will fail with an
118 error.
112 119
113 headers => hashref 120 headers => hashref
114 The request headers to use. Currently, "http_request" may 121 The request headers to use. Currently, "http_request" may
115 provide its own "Host:", "Content-Length:", "Connection:" and 122 provide its own "Host:", "Content-Length:", "Connection:" and
116 "Cookie:" headers and will provide defaults at least for "TE:", 123 "Cookie:" headers and will provide defaults at least for "TE:",
121 You really should provide your own "User-Agent:" header value 128 You really should provide your own "User-Agent:" header value
122 that is appropriate for your program - I wouldn't be surprised 129 that is appropriate for your program - I wouldn't be surprised
123 if the default AnyEvent string gets blocked by webservers sooner 130 if the default AnyEvent string gets blocked by webservers sooner
124 or later. 131 or later.
125 132
133 Also, make sure that your header names and values do not
134 contain any embedded newlines.
135
126 timeout => $seconds 136 timeout => $seconds
127 The time-out to use for various stages - each connect attempt 137 The time-out to use for various stages - each connect attempt
128 will reset the timeout, as will read or write activity, i.e. 138 will reset the timeout, as will read or write activity, i.e.
129 this is not an overall timeout. 139 this is not an overall timeout.
130 140
131 Default timeout is 5 minutes. 141 Default timeout is 5 minutes.
132 142
133 proxy => [$host, $port[, $scheme]] or undef 143 proxy => [$host, $port[, $scheme]] or undef
134 Use the given http proxy for all requests. If not specified, 144 Use the given http proxy for all requests, or no proxy if
135 then the default proxy (as specified by $ENV{http_proxy}) is
136 used. 145 "undef" is used.
137 146
138 $scheme must be either missing or must be "http" for HTTP. 147 $scheme must be either missing or must be "http" for HTTP.
148
149 If not specified, then the default proxy is used (see
150 "AnyEvent::HTTP::set_proxy").
139 151
140 body => $string 152 body => $string
141 The request body, usually empty. Will be sent as-is (future 153 The request body, usually empty. Will be sent as-is (future
142 versions of this module might offer more options). 154 versions of this module might offer more options).
143 155
183 object storing your state data, or the TLS context) - only 195 object storing your state data, or the TLS context) - only
184 connections using the same unique ID will be reused. 196 connections using the same unique ID will be reused.
185 197
186 on_prepare => $callback->($fh) 198 on_prepare => $callback->($fh)
187 In rare cases you need to "tune" the socket before it is used to 199 In rare cases you need to "tune" the socket before it is used to
188 connect (for exmaple, to bind it on a given IP address). This 200 connect (for example, to bind it on a given IP address). This
189 parameter overrides the prepare callback passed to 201 parameter overrides the prepare callback passed to
190 "AnyEvent::Socket::tcp_connect" and behaves exactly the same way 202 "AnyEvent::Socket::tcp_connect" and behaves exactly the same way
191 (e.g. it has to provide a timeout). See the description for the 203 (e.g. it has to provide a timeout). See the description for the
192 $prepare_cb argument of "AnyEvent::Socket::tcp_connect" for 204 $prepare_cb argument of "AnyEvent::Socket::tcp_connect" for
193 details. 205 details.
331 343
332 Example: do a HTTP HEAD request on https://www.google.com/, use a 344 Example: do a HTTP HEAD request on https://www.google.com/, use a
333 timeout of 30 seconds. 345 timeout of 30 seconds.
334 346
335 http_request 347 http_request
336 GET => "https://www.google.com", 348 HEAD => "https://www.google.com",
337 headers => { "user-agent" => "MySearchClient 1.0" }, 349 headers => { "user-agent" => "MySearchClient 1.0" },
338 timeout => 30, 350 timeout => 30,
339 sub { 351 sub {
340 my ($body, $hdr) = @_; 352 my ($body, $hdr) = @_;
341 use Data::Dumper; 353 use Data::Dumper;
366 Sets the default proxy server to use. The proxy-url must begin with 378 Sets the default proxy server to use. The proxy-url must begin with
367 a string of the form "http://host:port", croaks otherwise. 379 a string of the form "http://host:port", croaks otherwise.
368 380
369 To clear an already-set proxy, use "undef". 381 To clear an already-set proxy, use "undef".
370 382
383 When AnyEvent::HTTP is loaded for the first time it will query the
384 default proxy from the operating system, currently by looking at
385 "$ENV{http_proxy}".
386
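To relate the "http://host:port" form required here to the [$host, $port, $scheme] array accepted by the "proxy" request parameter, here is a minimal sketch. The function parse_proxy_url is hypothetical and is not the module's own parser. It assumes a plain hostname or IPv4 address and does not handle IPv6 literals.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical helper: turn "http://host:port" into the
# [$host, $port, $scheme] form used by the "proxy" parameter.
# Illustration only, not how AnyEvent::HTTP parses the URL internally.
sub parse_proxy_url {
   my ($url) = @_;

   return undef unless defined $url; # undef means "no proxy"

   # only the leading scheme://host:port part is inspected
   $url =~ m{^(http)://([^:/]+):(\d+)}i
      or Carp::croak "$url: not a valid proxy url, expected http://host:port";

   return [$2, $3 + 0, lc $1];
}

my $proxy = parse_proxy_url "http://proxy.example.net:3128/";
print join (" ", @$proxy), "\n";
```

The host name proxy.example.net is a placeholder. The sketch croaks on anything that does not start with an http URL carrying an explicit port, mirroring the "croaks otherwise" behaviour described above.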
371 AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end] 387 AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
372 Remove all cookies from the cookie jar that have been expired. If 388 Remove all cookies from the cookie jar that have been expired. If
373 $session_end is given and true, then additionally remove all session 389 $session_end is given and true, then additionally remove all session
374 cookies. 390 cookies.
375 391
376 You should call this function (with a true $session_end) before you 392 You should call this function (with a true $session_end) before you
377 save cookies to disk, and you should call this function after 393 save cookies to disk, and you should call this function after
378 loading them again. If you have a long-running program you can 394 loading them again. If you have a long-running program you can
379 additonally call this function from time to time. 395 additionally call this function from time to time.
380 396
381 A cookie jar is initially an empty hash-reference that is managed by 397 A cookie jar is initially an empty hash-reference that is managed by
382 this module. Its format is subject to change, but currently it is 398 this module. Its format is subject to change, but currently it is
383 like this: 399 like this:
384 400
385 The key "version" has to contain 1, otherwise the hash gets emptied. 401 The key "version" has to contain 1, otherwise the hash gets emptied.
386 All other keys are hostnames or IP addresses pointing to 402 All other keys are hostnames or IP addresses pointing to
387 hash-references. The key for these inner hash references is the 403 hash-references. The key for these inner hash references is the
388 server path for which this cookie is meant, and the values are again 404 server path for which this cookie is meant, and the values are again
389 hash-references. The keys of those hash-references is the cookie 405 hash-references. Each key of those hash-references is a cookie name,
390 name, and the value, you guessed it, is another hash-reference, this 406 and the value, you guessed it, is another hash-reference, this time
391 time with the key-value pairs from the cookie, except for "expires" 407 with the key-value pairs from the cookie, except for "expires" and
392 and "max-age", which have been replaced by a "_expires" key that 408 "max-age", which have been replaced by a "_expires" key that
393 contains the cookie expiry timestamp. 409 contains the cookie expiry timestamp. Session cookies are indicated
410 by not having an "_expires" key.
394 411
395 Here is an example of a cookie jar with a single cookie, so you have 412 Here is an example of a cookie jar with a single cookie, so you have
396 a chance of understanding the above paragraph: 413 a chance of understanding the above paragraph:
397 414
398 { 415 {
419 436
420 $AnyEvent::HTTP::MAX_RECURSE 437 $AnyEvent::HTTP::MAX_RECURSE
421 The default value for the "recurse" request parameter (default: 10). 438 The default value for the "recurse" request parameter (default: 10).
422 439
423 $AnyEvent::HTTP::TIMEOUT 440 $AnyEvent::HTTP::TIMEOUT
424 The default timeout for conenction operations (default: 300). 441 The default timeout for connection operations (default: 300).
425 442
426 $AnyEvent::HTTP::USERAGENT 443 $AnyEvent::HTTP::USERAGENT
427 The default value for the "User-Agent" header (the default is 444 The default value for the "User-Agent" header (the default is
428 "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; 445 "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION;
429 +http://software.schmorp.de/pkg/AnyEvent)"). 446 +http://software.schmorp.de/pkg/AnyEvent)").
437 454
438 The default value for this is 4, and it is highly advisable to not 455 The default value for this is 4, and it is highly advisable to not
439 increase it much. 456 increase it much.
440 457
441 For comparison: the RFCs recommend 4 non-persistent or 2 persistent 458 For comparison: the RFCs recommend 4 non-persistent or 2 persistent
442 connections, older browsers used 2, newers (such as firefox 3) 459 connections, older browsers used 2, newer ones (such as firefox 3)
443 typically use 6, and Opera uses 8 because like, they have the 460 typically use 6, and Opera uses 8 because like, they have the
444 fastest browser and give a shit for everybody else on the planet. 461 fastest browser and give a shit for everybody else on the planet.
445 462
446 $AnyEvent::HTTP::PERSISTENT_TIMEOUT 463 $AnyEvent::HTTP::PERSISTENT_TIMEOUT
447 The time after which idle persistent conenctions get closed by 464 The time after which idle persistent connections get closed by
448 AnyEvent::HTTP (default: 3). 465 AnyEvent::HTTP (default: 3).
449 466
450 $AnyEvent::HTTP::ACTIVE 467 $AnyEvent::HTTP::ACTIVE
451 The number of active connections. This is not the number of 468 The number of active connections. This is not the number of
452 currently running requests, but the number of currently open and 469 currently running requests, but the number of currently open and
453 non-idle TCP connections. This number can be useful for 470 non-idle TCP connections. This number can be useful for
454 load-leveling. 471 load-leveling.
455 472
456 SHOWCASE 473 SHOWCASE
457 This section contaisn some more elaborate "real-world" examples or code 474 This section contains some more elaborate "real-world" examples or code
458 snippets. 475 snippets.
459 476
460 HTTP/1.1 FILE DOWNLOAD 477 HTTP/1.1 FILE DOWNLOAD
461 Downloading files with HTTP cna be quite tricky, especially when 478 Downloading files with HTTP can be quite tricky, especially when
462 something goes wrong and you want tor esume. 479 something goes wrong and you want to resume.
463 480
464 Here is a function that initiates and resumes a download. It uses the 481 Here is a function that initiates and resumes a download. It uses the
465 last modified time to check for file content changes, and works with 482 last modified time to check for file content changes, and works with
466 many HTTP/1.0 servers as well, and usually falls back to a complete 483 many HTTP/1.0 servers as well, and usually falls back to a complete
467 re-download on older servers. 484 re-download on older servers.
468 485
469 It calls the completion callback with either "undef", which means a 486 It calls the completion callback with either "undef", which means a
470 nonretryable error occured, 0 when the download was partial and should 487 nonretryable error occurred, 0 when the download was partial and should
471 be retried, and 1 if it was successful. 488 be retried, and 1 if it was successful.
472 489
473 use AnyEvent::HTTP; 490 use AnyEvent::HTTP;
474 491
475 sub download($$$) { 492 sub download($$$) {
483 500
484 warn stat $fh; 501 warn stat $fh;
485 warn -s _; 502 warn -s _;
486 if (stat $fh and -s _) { 503 if (stat $fh and -s _) {
487 $ofs = -s _; 504 $ofs = -s _;
488 warn "-s is ", $ofs;#d# 505 warn "-s is ", $ofs;
489 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9]; 506 $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9];
490 $hdr{"range"} = "bytes=$ofs-"; 507 $hdr{"range"} = "bytes=$ofs-";
491 } 508 }
492 509
493 http_get $url, 510 http_get $url,
613 630
614AUTHOR 631AUTHOR
615 Marc Lehmann <schmorp@schmorp.de> 632 Marc Lehmann <schmorp@schmorp.de>
616 http://home.schmorp.de/ 633 http://home.schmorp.de/
617 634
618 With many thanks to Дмитрий Шалашов, who provided 635 With many thanks to Дмитрий Шалашов, who provided countless testcases
619 countless testcases and bugreports. 636 and bugreports.
620 637