NAME
    AnyEvent::HTTP - simple but non-blocking HTTP/HTTPS client

SYNOPSIS
       use AnyEvent::HTTP;

       http_get "http://www.nethype.de/", sub { print $_[1] };

       # ... do something else here

DESCRIPTION
    This module is an AnyEvent user; you need to make sure that you use and
    run a supported event loop.

    This module implements a simple, stateless and non-blocking HTTP client.
    It supports GET, POST and other request methods, cookies and more, all
    on a very low level. It can follow redirects, supports proxies, and
    automatically limits the number of connections to the values specified
    in the RFC.

    It should generally be a "good client" that is sufficient for most HTTP
    tasks. Simple tasks should be simple, but complex tasks should still be
    possible as the user retains control over request and response headers.

    The caller is responsible for authentication management, cookies (if the
    simplistic implementation in this module doesn't suffice), referer and
    other high-level protocol details for which this module offers only
    limited support.

METHODS
    http_get $url, key => value..., $cb->($data, $headers)
        Executes an HTTP-GET request. See the http_request function for
        details on additional parameters and the return value.

    http_head $url, key => value..., $cb->($data, $headers)
        Executes an HTTP-HEAD request. See the http_request function for
        details on additional parameters and the return value.

    http_post $url, $body, key => value..., $cb->($data, $headers)
        Executes an HTTP-POST request with a request body of $body. See the
        http_request function for details on additional parameters and the
        return value.

    http_request $method => $url, key => value..., $cb->($data, $headers)
        Executes an HTTP request of type $method (e.g. "GET", "POST"). The
        URL must be an absolute http or https URL.

        When called in void context, nothing is returned. In other contexts,
        "http_request" returns a "cancellation guard" - you have to keep the
        object alive at least until the callback gets called. If the object
        gets destroyed before the callback is called, the request will be
        cancelled.

        The callback will be called with the response body data as first
        argument (or "undef" if an error occurred), and a hash-ref with
        response headers (and trailers) as second argument.

        All the headers in that hash are lowercased. In addition to the
        response headers, the "pseudo-headers" (starting with an uppercase
        letter to avoid clashing with possible response headers)
        "HTTPVersion", "Status" and "Reason" contain the three parts of the
        HTTP Status-Line of the same name. If an error occurs during the
        body phase of a request, then the original "Status" and "Reason"
        values from the header are available as "OrigStatus" and
        "OrigReason".

        The pseudo-header "URL" contains the actual URL (which can differ
        from the requested URL when following redirects - for example, you
        might get an error that your URL scheme is not supported even though
        your URL is a valid http URL because it redirected to an ftp URL, in
        which case you can look at the URL pseudo-header).

        The pseudo-header "Redirect" only exists when the request was a
        result of an internal redirect. In that case it is an array
        reference with the "($data, $headers)" from the redirect response.
        Note that this response could in turn be the result of a redirect
        itself, and "$headers->{Redirect}[1]{Redirect}" will then contain
        the original response, and so on.

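The nesting of "Redirect" pseudo-headers can be unwound with a short loop. The following is a minimal sketch, not part of the module: "redirect_chain" is a hypothetical helper, and the header hashes are built by hand to stand in for a real two-hop redirect.

```perl
use strict;
use warnings;

# Hypothetical helper: describe every response in a redirect chain,
# final response first, original response last.
sub redirect_chain {
   my ($headers) = @_;

   my @chain;

   while ($headers) {
      push @chain, "$headers->{Status} $headers->{URL}";

      # "Redirect" holds [$data, $headers] of the response that led here
      $headers = $headers->{Redirect} ? $headers->{Redirect}[1] : undef;
   }

   @chain
}

# hand-built stand-in for the $headers of a request that was redirected twice
my $final = {
   Status   => 200,
   URL      => "http://example.com/c",
   Redirect => [undef, {
      Status   => 302,
      URL      => "http://example.com/b",
      Redirect => [undef, { Status => 301, URL => "http://example.com/a" }],
   }],
};

print "$_\n" for redirect_chain ($final);
```

In a real completion callback, the second callback argument would be passed to the helper instead of the hand-built hash.
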
        If the server sends a header multiple times, then its values will
        be joined together with a comma (","), as per the HTTP spec.

        If an internal error occurs, such as not being able to resolve a
        hostname, then $data will be "undef", "$headers->{Status}" will be
        in the range 590-599 and the "Reason" pseudo-header will contain an
        error message. Currently the following status codes are used:

        595 - errors during connection establishment, proxy handshake.
        596 - errors during TLS negotiation, request sending and header
              processing.
        597 - errors during body receiving or processing.
        598 - user aborted request via "on_header" or "on_body".
        599 - other, usually nonretryable, errors (garbled URL etc.).

        A typical callback might look like this:

           sub {
              my ($body, $hdr) = @_;

              if ($hdr->{Status} =~ /^2/) {
                 # ... everything should be ok
              } else {
                 print "error, $hdr->{Status} $hdr->{Reason}\n";
              }
           }

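The status codes listed above lend themselves to a small dispatch helper. The sketch below is illustrative only: "classify_response" is a hypothetical name, and treating 595-597 as worth retrying is an assumption of this example, not something the module prescribes.

```perl
use strict;
use warnings;

# Hypothetical helper: map a response header hash to a coarse outcome.
sub classify_response {
   my ($hdr) = @_;
   my $status = $hdr->{Status};

   return "ok"      if $status =~ /^2/;
   return "aborted" if $status == 598;                   # on_header/on_body returned false
   return "retry"   if $status >= 595 && $status <= 597; # assumption: transient failures
   return "fatal"   if $status =~ /^59/;                 # 599 and any other internal error

   "http-error" # the server itself answered with a non-2xx status
}

for my $status (200, 404, 596, 598, 599) {
   printf "%d => %s\n", $status, classify_response ({ Status => $status });
}
```

A completion callback could call this helper on its second argument and branch on the returned string.
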
        Additional parameters are key-value pairs, and are fully optional.
        They include:

        recurse => $count (default: $MAX_RECURSE)
            Whether to recurse requests or not, e.g. on redirects,
            authentication and other retries and so on, and how often to do
            so.

            Only redirects to http and https URLs are supported. While most
            common redirection forms are handled entirely within this
            module, some require the use of the optional URI module. If it
            is required but missing, then the request will fail with an
            error.

        headers => hashref
            The request headers to use. Currently, "http_request" may
            provide its own "Host:", "Content-Length:", "Connection:" and
            "Cookie:" headers and will provide defaults at least for "TE:",
            "Referer:" and "User-Agent:" (this can be suppressed by using
            "undef" for these headers in which case they won't be sent at
            all).

            You really should provide your own "User-Agent:" header value
            that is appropriate for your program - I wouldn't be surprised
            if the default AnyEvent string gets blocked by webservers sooner
            or later.

            Also, make sure that your header names and values do not
            contain any embedded newlines.

        timeout => $seconds
            The time-out to use for various stages - each connect attempt
            will reset the timeout, as will read or write activity, i.e.
            this is not an overall timeout.

            Default timeout is 5 minutes.

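Because "timeout" is reset by activity, an overall deadline has to be added by the caller. The following is a minimal sketch of one way to do that with an AnyEvent timer and the cancellation guard. "http_get_deadline" is a hypothetical wrapper, and the 599 status it reports on expiry is made up by this example to mimic the module's error convention.

```perl
use strict;
use warnings;

use AnyEvent;
use AnyEvent::HTTP;

# Hypothetical wrapper: like http_get, but gives up after $deadline seconds
# in total, regardless of read or write activity.
sub http_get_deadline {
   my ($url, $deadline, $cb) = @_;

   my ($guard, $timer, $done);

   $guard = http_get $url, sub {
      return if $done++; # the deadline already fired
      undef $timer;      # request finished first, disarm the deadline
      $cb->(@_);
   };

   # some errors (e.g. an unsupported URL scheme) are reported before
   # http_get returns, in which case no timer is needed
   return if $done;

   $timer = AE::timer $deadline, 0, sub {
      return if $done++;
      undef $guard; # destroying the guard cancels the request
      $cb->(undef, { Status => 599, Reason => "overall deadline exceeded", URL => $url });
   };

   return;
}

# usage, with an event loop running:
#    http_get_deadline "http://www.nethype.de/", 10, sub {
#       my ($body, $hdr) = @_;
#       print "$hdr->{Status} $hdr->{Reason}\n";
#    };
```

The wrapper keeps the guard alive itself, so unlike "http_get" it can be called in void context; the price is that the caller cannot cancel the request early.
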
        proxy => [$host, $port[, $scheme]] or undef
            Use the given http proxy for all requests, or no proxy if
            "undef" is used.

            $scheme must be either missing or must be "http" for HTTP.

            If not specified, then the default proxy is used (see
            "AnyEvent::HTTP::set_proxy").

            Currently, if your proxy requires authorization, you have to
            specify an appropriate "Proxy-Authorization" header in every
            request.

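As an illustration of the last point, the sketch below builds a "Proxy-Authorization" header value and bundles it with a "proxy" parameter. It assumes the proxy accepts HTTP Basic authentication; "proxy_basic_auth" is a hypothetical helper, and the proxy host, port and credentials are placeholders.

```perl
use strict;
use warnings;

use MIME::Base64 ();

# Hypothetical helper: value for a Basic "Proxy-Authorization" header.
sub proxy_basic_auth {
   my ($user, $pass) = @_;

   # the second argument "" suppresses the trailing newline, which must
   # not end up inside a header value
   "Basic " . MIME::Base64::encode_base64 ("$user:$pass", "")
}

# parameters for one proxied request; all values are placeholders
my @proxy_args = (
   proxy   => ["proxy.example.com", 3128],
   headers => { "proxy-authorization" => proxy_basic_auth ("user", "secret") },
);

print "$proxy_args[3]{'proxy-authorization'}\n";

# usage:
#    http_get "http://www.nethype.de/", @proxy_args, sub { ... };
```

Since the header has to be repeated in every request, keeping the parameters in one list like this avoids having to rebuild them at each call site.
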
        body => $string
            The request body, usually empty. Will be sent as-is (future
            versions of this module might offer more options).

        cookie_jar => $hash_ref
            Passing this parameter enables (simplified) cookie-processing,
            loosely based on the original Netscape specification.

            The $hash_ref must be an (initially empty) hash reference which
            will get updated automatically. It is possible to save the
            cookie jar to persistent storage with something like JSON or
            Storable - see the "AnyEvent::HTTP::cookie_jar_expire" function
            if you wish to remove expired or session-only cookies, and also
            for documentation on the format of the cookie jar.

            Note that this cookie implementation is not meant to be
            complete. If you want complete cookie management you have to do
            that on your own. "cookie_jar" is meant as a quick fix to get
            most cookie-using sites working. Cookies are a privacy disaster;
            do not use them unless required to.

            When cookie processing is enabled, the "Cookie:" and
            "Set-Cookie:" headers will be set and handled by this module,
            otherwise they will be left untouched.

        tls_ctx => $scheme | $tls_ctx
            Specifies the AnyEvent::TLS context to be used for https
            connections. This parameter follows the same rules as the
            "tls_ctx" parameter to AnyEvent::Handle, but additionally, the
            two strings "low" or "high" can be specified, which give you a
            predefined low-security (no verification, highest compatibility)
            and high-security (CA and common-name verification) TLS context.

            The default for this option is "low", which could be interpreted
            as "give me the page, no matter what".

            See also the "sessionid" parameter.

        session => $string
            The module might reuse connections to the same host internally.
            Sometimes (e.g. when using TLS), you do not want to reuse
            connections from other sessions. This can be achieved by setting
            this parameter to some unique ID (such as the address of an
            object storing your state data, or the TLS context) - only
            connections using the same unique ID will be reused.

        on_prepare => $callback->($fh)
            In rare cases you need to "tune" the socket before it is used to
            connect (for example, to bind it on a given IP address). This
            parameter overrides the prepare callback passed to
            "AnyEvent::Socket::tcp_connect" and behaves exactly the same way
            (e.g. it has to provide a timeout). See the description for the
            $prepare_cb argument of "AnyEvent::Socket::tcp_connect" for
            details.

        tcp_connect => $callback->($host, $service, $connect_cb,
        $prepare_cb)
            In even rarer cases you want total control over how
            AnyEvent::HTTP establishes connections. Normally it uses
            AnyEvent::Socket::tcp_connect to do this, but you can provide
            your own "tcp_connect" function - obviously, it has to follow
            the same calling conventions, except that it may always return a
            connection guard object.

            There are probably lots of weird uses for this function,
            starting from tracing the hosts "http_request" actually tries to
            connect to, to (inexact but fast) host => IP address caching or
            even socks protocol support.

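The tracing use case mentioned above needs very little code. The following is a minimal sketch, assuming the calling convention shown in the parameter heading; "tracing_connect" is a hypothetical name.

```perl
use strict;
use warnings;

use AnyEvent::Socket ();

# Hypothetical replacement for tcp_connect: logs every connection attempt,
# then delegates to the real AnyEvent::Socket::tcp_connect.
sub tracing_connect {
   my ($host, $service, $connect_cb, $prepare_cb) = @_;

   warn "connecting to $host:$service\n";

   # last statement, so the caller's context (and thus the decision whether
   # a guard object is returned) is passed through unchanged
   AnyEvent::Socket::tcp_connect ($host, $service, $connect_cb, $prepare_cb)
}

# usage:
#    http_get "http://www.nethype.de/", tcp_connect => \&tracing_connect, sub { ... };
```

The same wrapper shape can serve as a starting point for the other uses, since everything that happens before the delegated call is under the caller's control.
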
        on_header => $callback->($headers)
            When specified, this callback will be called with the header
            hash as soon as headers have been successfully received from the
            remote server (not on locally-generated errors).

            It has to return either true (in which case AnyEvent::HTTP will
            continue), or false, in which case AnyEvent::HTTP will cancel
            the download (and call the completion callback with an error
            code of 598).

            This callback is useful, among other things, to quickly reject
            unwanted content, which, if it is supposed to be rare, can be
            faster than first doing a "HEAD" request.

            The downside is that cancelling the request makes it impossible
            to re-use the connection. Also, the "on_header" callback will
            not receive any trailer (headers sent after the response body).

            Example: cancel the request unless the content-type is
            "text/html".

               on_header => sub {
                  $_[0]{"content-type"} =~ /^text\/html\s*(?:;|$)/
               },

        on_body => $callback->($partial_body, $headers)
            When specified, all body data will be passed to this callback
            instead of to the completion callback. The completion callback
            will get the empty string instead of the body data.

            It has to return either true (in which case AnyEvent::HTTP will
            continue), or false, in which case AnyEvent::HTTP will cancel
            the download (and call the completion callback with an error
            code of 598).

            The downside to cancelling the request is that it makes it
            impossible to re-use the connection.

            This callback is useful when the data is too large to be held in
            memory (so the callback writes it to a file) or when only some
            information should be extracted, or when the body should be
            processed incrementally.

            It is usually preferred over doing your own body handling via
            "want_body_handle", but in case of streaming APIs, where HTTP is
            only used to create a connection, "want_body_handle" is the
            better alternative, as it allows you to install your own event
            handler, reducing resource usage.

        want_body_handle => $enable
            When enabled (default is disabled), the behaviour of
            AnyEvent::HTTP changes considerably: after parsing the headers,
            and instead of downloading the body (if any), the completion
            callback will be called. Instead of the $body argument
            containing the body data, the callback will receive the
            AnyEvent::Handle object associated with the connection. In error
            cases, "undef" will be passed. When there is no body (e.g.
            status 304), the empty string will be passed.

            The handle object might or might not be in TLS mode, might be
            connected to a proxy, be a persistent connection, use chunked
            transfer encoding etc., and be configured in unspecified ways.
            The user is responsible for this handle (it will not be used by
            this module anymore).

            This is useful with some push-type services, where, after the
            initial headers, an interactive protocol is used (a typical
            example would be the push-style Twitter API which starts a
            JSON/XML stream).

            If you think you need this, first have a look at "on_body", to
            see if that doesn't solve your problem in a better way.

        persistent => $boolean
            Try to create/reuse a persistent connection. When this flag is
            set (default: true for idempotent requests, false for all
            others), then "http_request" tries to re-use an existing
            (previously-created) persistent connection to the host and,
            failing that, tries to create a new one.

            Requests failing in certain ways will be automatically retried
            once, which is dangerous for non-idempotent requests, which is
            why it defaults to off for them. The reason for this is that
            the bozos who designed HTTP/1.1 made it impossible to
            distinguish between a fatal error and a normal connection
            timeout, so you never know whether there was a problem with your
            request or not.

            When reusing an existing connection, many parameters (such as
            TLS context) will be ignored. See the "session" parameter for a
            workaround.

        keepalive => $boolean
            Only used when "persistent" is also true. This parameter decides
            whether "http_request" tries to handshake an HTTP/1.0-style
            keep-alive connection (as opposed to only an HTTP/1.1 persistent
            connection).

            The default is true, except when using a proxy, in which case it
            defaults to false, as HTTP/1.0 proxies cannot support this in a
            meaningful way.

        handle_params => { key => value ... }
            The key-value pairs in this hash will be passed to any
            AnyEvent::Handle constructor that is called - not all requests
            will create a handle, and sometimes more than one is created, so
            this parameter is only good for setting hints.

            Example: set the maximum read size to 4096, to potentially
            conserve memory at the cost of speed.

               handle_params => {
                  max_read_size => 4096,
               },

        Example: do a simple HTTP GET request for http://www.nethype.de/ and
        print the response body.

           http_request GET => "http://www.nethype.de/", sub {
              my ($body, $hdr) = @_;
              print "$body\n";
           };

        Example: do an HTTP HEAD request on https://www.google.com/, using
        a custom user agent and a timeout of 30 seconds.

           http_request
              HEAD    => "https://www.google.com",
              headers => { "user-agent" => "MySearchClient 1.0" },
              timeout => 30,
              sub {
                 my ($body, $hdr) = @_;
                 use Data::Dumper;
                 print Dumper $hdr;
              }
           ;

        Example: do another simple HTTP GET request, but immediately try to
        cancel it.

           my $request = http_request GET => "http://www.nethype.de/", sub {
              my ($body, $hdr) = @_;
              print "$body\n";
           };

           undef $request;

DNS CACHING
    AnyEvent::HTTP uses the AnyEvent::Socket::tcp_connect function for the
    actual connection, which in turn uses AnyEvent::DNS to resolve
    hostnames. The latter is a simple stub resolver and does no caching on
    its own. If you want DNS caching, you currently have to provide your own
    default resolver (by storing a suitable resolver object in
    $AnyEvent::DNS::RESOLVER) or your own "tcp_connect" callback.

GLOBAL FUNCTIONS AND VARIABLES
    AnyEvent::HTTP::set_proxy "proxy-url"
        Sets the default proxy server to use. The proxy-url must begin with
        a string of the form "http://host:port"; the function croaks
        otherwise.

        To clear an already-set proxy, use "undef".

        When AnyEvent::HTTP is loaded for the first time it will query the
        default proxy from the operating system, currently by looking at
        "$ENV{http_proxy}".

    AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
        Remove all cookies from the cookie jar that have expired. If
        $session_end is given and true, then additionally remove all session
        cookies.

        You should call this function (with a true $session_end) before you
        save cookies to disk, and you should call this function after
        loading them again. If you have a long-running program you can
        additionally call this function from time to time.

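The save/load advice above could be followed along these lines. This is a minimal sketch using Storable, one of the two storage options the "cookie_jar" parameter description mentions; "save_cookie_jar" and "load_cookie_jar" are hypothetical helper names.

```perl
use strict;
use warnings;

use AnyEvent::HTTP;
use Storable ();

# Hypothetical helper: write the jar to $path without stale cookies.
sub save_cookie_jar {
   my ($jar, $path) = @_;

   # a true second argument also drops session-only cookies
   AnyEvent::HTTP::cookie_jar_expire ($jar, 1);
   Storable::nstore ($jar, $path);
}

# Hypothetical helper: read the jar from $path, or start with an empty one.
sub load_cookie_jar {
   my ($path) = @_;

   my $jar = -e $path ? Storable::retrieve ($path) : {};

   # cookies may have expired while the jar sat on disk
   AnyEvent::HTTP::cookie_jar_expire ($jar);

   $jar
}

# usage:
#    my $jar = load_cookie_jar "/tmp/cookies.sto";
#    http_get "http://www.nethype.de/", cookie_jar => $jar, sub {
#       save_cookie_jar $jar, "/tmp/cookies.sto";
#    };
```

Note that "save_cookie_jar" modifies the jar it is given, so session cookies are gone from the in-memory copy as well; copy the jar first if the program keeps running and still needs them.
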
        A cookie jar is initially an empty hash-reference that is managed by
        this module. Its format is subject to change, but currently it is as
        follows:

        The key "version" has to contain 1, otherwise the hash gets emptied.
        All other keys are hostnames or IP addresses pointing to
        hash-references. The key for these inner hash references is the
        server path for which this cookie is meant, and the values are again
        hash-references. Each key of those hash-references is a cookie name,
        and the value, you guessed it, is another hash-reference, this time
        with the key-value pairs from the cookie, except for "expires" and
        "max-age", which have been replaced by a "_expires" key that
        contains the cookie expiry timestamp. Session cookies are indicated
        by not having an "_expires" key.

        Here is an example of a cookie jar with a single cookie, so you have
        a chance of understanding the above paragraph:

           {
              version    => 1,
              "10.0.0.1" => {
                 "/" => {
                    "mythweb_id" => {
                       _expires => 1293917923,
                       value    => "ooRung9dThee3ooyXooM1Ohm",
                    },
                 },
              },
           }

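To make the three levels of nesting concrete, the sketch below walks a jar in the format described above and prints one line per cookie. "list_cookies" is a hypothetical helper; since the jar format is subject to change, treat it as an illustration of the current layout only.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a cookie jar into one line per cookie.
sub list_cookies {
   my ($jar) = @_;

   my @lines;

   for my $host (sort keys %$jar) {
      next if $host eq "version"; # the only key that is not a host

      my $paths = $jar->{$host};

      for my $path (sort keys %$paths) {
         my $cookies = $paths->{$path};

         for my $name (sort keys %$cookies) {
            my $kv   = $cookies->{$name};
            my $kind = exists $kv->{_expires} ? "expires $kv->{_expires}" : "session";

            push @lines, "$host$path $name=$kv->{value} ($kind)";
         }
      }
   }

   @lines
}

# the example jar from above
my $jar = {
   version    => 1,
   "10.0.0.1" => {
      "/" => {
         "mythweb_id" => {
            _expires => 1293917923,
            value    => "ooRung9dThee3ooyXooM1Ohm",
         },
      },
   },
};

print "$_\n" for list_cookies ($jar);
```
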
    $date = AnyEvent::HTTP::format_date $timestamp
        Takes a POSIX timestamp (seconds since the epoch) and formats it as
        an HTTP Date (RFC 2616).

    $timestamp = AnyEvent::HTTP::parse_date $date
        Takes an HTTP Date (RFC 2616) or a Cookie date (Netscape cookie
        spec) or a bunch of minor variations of those, and returns the
        corresponding POSIX timestamp, or "undef" if the date cannot be
        parsed.

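The two functions are inverses of each other for valid input, which the following short sketch demonstrates. The timestamp is an arbitrary example value (the one used in the cookie jar example).

```perl
use strict;
use warnings;

use AnyEvent::HTTP;

my $timestamp = 1293917923; # arbitrary example value

# format as an HTTP date, then parse the result back into a timestamp
my $date = AnyEvent::HTTP::format_date ($timestamp);
my $back = AnyEvent::HTTP::parse_date ($date);

print "$timestamp -> $date -> $back\n";

# input that cannot be parsed yields undef rather than an exception
my $bad = AnyEvent::HTTP::parse_date ("not a date");
print "unparseable input gives undef\n" unless defined $bad;
```

Because failure is signalled by "undef", check the result with "defined" rather than for truth: the epoch itself is timestamp 0.
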
    $AnyEvent::HTTP::MAX_RECURSE
        The default value for the "recurse" request parameter (default: 10).

    $AnyEvent::HTTP::TIMEOUT
        The default timeout for connection operations (default: 300
        seconds).

    $AnyEvent::HTTP::USERAGENT
        The default value for the "User-Agent" header (the default is
        "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION;
        +http://software.schmorp.de/pkg/AnyEvent)").

    $AnyEvent::HTTP::MAX_PER_HOST
        The maximum number of concurrent connections to the same host
        (identified by the hostname). If the limit is exceeded, then
        additional requests are queued until previous connections are
        closed. Both persistent and non-persistent connections are counted
        in this limit.

        The default value for this is 4, and it is highly advisable to not
        increase it much.

        For comparison: the RFCs recommend 4 non-persistent or 2 persistent
        connections, older browsers used 2, newer ones (such as Firefox 3)
        typically use 6, and Opera uses 8 because like, they have the
        fastest browser and give a shit for everybody else on the planet.

    $AnyEvent::HTTP::PERSISTENT_TIMEOUT
        The time after which idle persistent connections get closed by
        AnyEvent::HTTP (default: 3 seconds).

    $AnyEvent::HTTP::ACTIVE
        The number of active connections. This is not the number of
        currently running requests, but the number of currently open and
        non-idle TCP connections. This number can be useful for
        load-leveling.

SHOWCASE
    This section contains some more elaborate "real-world" examples or code
    snippets.

HTTP/1.1 FILE DOWNLOAD
    Downloading files with HTTP can be quite tricky, especially when
    something goes wrong and you want to resume.

    Here is a function that initiates and resumes a download. It uses the
    last modified time to check for file content changes, and works with
    many HTTP/1.0 servers as well, and usually falls back to a complete
    re-download on older servers.

    It calls the completion callback with "undef" when a nonretryable error
    occurred, 0 when the download was partial and should be retried, or 1
    when it was successful.

       use Fcntl;
       use AnyEvent::HTTP;

       sub download($$$) {
          my ($url, $file, $cb) = @_;

          # open for read/write without truncating, creating the file if needed
          sysopen my $fh, $file, O_RDWR | O_CREAT
             or die "$file: $!";

          my %hdr;
          my $ofs = 0;

          if (stat $fh and -s _) {
             # partial file exists: request the rest, but only if unchanged
             $ofs = -s _;
             $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date +(stat _)[9];
             $hdr{"range"} = "bytes=$ofs-";
          }

          http_get $url,
             headers   => \%hdr,
             on_header => sub {
                my ($hdr) = @_;

                if ($hdr->{Status} == 200 && $ofs) {
                   # resume failed
                   truncate $fh, $ofs = 0;
                }

                sysseek $fh, $ofs, 0;

                1
             },
             on_body   => sub {
                my ($data, $hdr) = @_;

                if ($hdr->{Status} =~ /^2/) {
                   length $data == syswrite $fh, $data
                      or return; # abort on write errors
                }

                1
             },
             sub {
                my (undef, $hdr) = @_;

                my $status = $hdr->{Status};

                if (my $time = AnyEvent::HTTP::parse_date $hdr->{"last-modified"}) {
                   # utime takes the two timestamps first, then the files
                   utime $time, $time, $fh;
                }

                if ($status == 200 || $status == 206 || $status == 416) {
                   # download ok || resume ok || file already fully downloaded
                   $cb->(1, $hdr);

                } elsif ($status == 412) {
                   # file has changed while resuming, delete and retry
                   unlink $file;
                   $cb->(0, $hdr);

                } elsif ($status == 500 or $status == 503 or $status =~ /^59/) {
                   # retry later
                   $cb->(0, $hdr);

                } else {
                   $cb->(undef, $hdr);
                }
             }
          ;
       }

       download "http://server/somelargefile", "/tmp/somelargefile", sub {
          if ($_[0]) {
             print "OK!\n";
          } elsif (defined $_[0]) {
             print "please retry later\n";
          } else {
             print "ERROR\n";
          }
       };

SOCKS PROXIES
    Socks proxies are not directly supported by AnyEvent::HTTP. You can
    compile your perl to support socks, or use an external program such as
    socksify (dante) or tsocks to make your program use a socks proxy
    transparently.

    Alternatively, for AnyEvent::HTTP only, you can use your own
    "tcp_connect" function that does the proxy handshake - here is an
    example that works with socks4a proxies:

       use Errno;
       use AnyEvent::Util;
       use AnyEvent::Socket;
       use AnyEvent::Handle;
       use AnyEvent::HTTP;

       # host, port and username of/for your socks4a proxy
       my $socks_host = "10.0.0.23";
       my $socks_port = 9050;
       my $socks_user = "";

       sub socks4a_connect {
          my ($host, $port, $connect_cb, $prepare_cb) = @_;

          my $hdl = AnyEvent::Handle->new (
             connect    => [$socks_host, $socks_port],
             on_prepare => sub { $prepare_cb->($_[0]{fh}) },
             on_error   => sub { $connect_cb->() },
          );

          # socks4a request: version 4, command 1 (connect), port, the
          # invalid IP 0.0.0.1, user name, then the host name to resolve
          $hdl->push_write (pack "CCnNZ*Z*", 4, 1, $port, 1, $socks_user, $host);

          $hdl->push_read (chunk => 8, sub {
             my ($hdl, $chunk) = @_;
             my ($status, $port, $ipn) = unpack "xCna4", $chunk;

             if ($status == 0x5a) {
                # request granted, hand the connected socket to the caller
                $connect_cb->($hdl->{fh}, (format_address $ipn) . ":$port");
             } else {
                $! = Errno::ENXIO;
                $connect_cb->();
             }
          });

          $hdl
       }

    Use "socks4a_connect" instead of "tcp_connect" when doing
    "http_request"s, possibly after switching off other proxy types:

       AnyEvent::HTTP::set_proxy undef; # usually you do not want other proxies

       http_get 'http://www.google.com', tcp_connect => \&socks4a_connect, sub {
          my ($data, $headers) = @_;
          ...
       };

SEE ALSO
    AnyEvent.

AUTHOR
       Marc Lehmann <schmorp@schmorp.de>
       http://home.schmorp.de/

    With many thanks to Дмитрий Шалашов, who provided countless test cases
    and bug reports.