NAME
    AnyEvent::HTTP - simple but non-blocking HTTP/HTTPS client

SYNOPSIS
       use AnyEvent::HTTP;

       http_get "http://www.nethype.de/", sub { print $_[1] };

       # ... do something else here

DESCRIPTION
    This module is an AnyEvent user; you need to make sure that you use and
    run a supported event loop.

    This module implements a simple, stateless and non-blocking HTTP client.
    It supports GET, POST and other request methods, cookies and more, all
    on a very low level. It can follow redirects, supports proxies, and
    automatically limits the number of connections to the values specified
    in the RFC.

    It should generally be a "good client" that is enough for most HTTP
    tasks. Simple tasks should be simple, but complex tasks should still be
    possible as the user retains control over request and response headers.

    The caller is responsible for authentication management, cookies (if the
    simplistic implementation in this module doesn't suffice), referer and
    other high-level protocol details for which this module offers only
    limited support.

METHODS
    http_get $url, key => value..., $cb->($data, $headers)
        Executes an HTTP-GET request. See the http_request function for
        details on additional parameters and the return value.

    http_head $url, key => value..., $cb->($data, $headers)
        Executes an HTTP-HEAD request. See the http_request function for
        details on additional parameters and the return value.

    http_post $url, $body, key => value..., $cb->($data, $headers)
        Executes an HTTP-POST request with a request body of $body. See the
        http_request function for details on additional parameters and the
        return value.

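Because the request body is passed on as given, form data has to be encoded by the caller before it is handed to http_post. The following is a minimal sketch of one way to do that. The helper name form_encode is hypothetical and not part of AnyEvent::HTTP, and the sketch assumes its input is a list of character strings.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::HTTP): turn a list of
# key/value pairs into an application/x-www-form-urlencoded string.
sub form_encode {
    my @pairs = @_;
    my @out;

    while (my ($key, $value) = splice @pairs, 0, 2) {
        for ($key, $value) {
            $_ = "" unless defined;   # a missing value becomes empty
            utf8::encode($_);         # characters -> UTF-8 bytes
            # percent-encode everything except unreserved characters
            s/([^A-Za-z0-9\-._~])/sprintf "%%%02X", ord $1/ge;
        }
        push @out, "$key=$value";
    }

    return join "&", @out;
}

my $body = form_encode(q => "a b&c", lang => "en");
print "$body\n";
```

The resulting string could then be passed as $body to http_post, together with a matching Content-Type request header supplied through the "headers" parameter.
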
    http_request $method => $url, key => value..., $cb->($data, $headers)
        Executes an HTTP request of type $method (e.g. "GET", "POST"). The
        URL must be an absolute http or https URL.

        When called in void context, nothing is returned. In other contexts,
        "http_request" returns a "cancellation guard" - you have to keep the
        object alive at least until the callback gets called. If the object
        gets destroyed before the callback is called, the request will be
        cancelled.

        The callback will be called with the response body data as first
        argument (or "undef" if an error occurred), and a hash-ref with
        response headers as second argument.

        All the headers in that hash are lowercased. In addition to the
        response headers, the "pseudo-headers" "HTTPVersion", "Status" and
        "Reason" contain the three parts of the HTTP Status-Line of the same
        name. The pseudo-header "URL" contains the URL that was actually
        retrieved (which can differ from the requested URL when following
        redirects).

        If the server sends a header multiple times, then their contents
        will be joined together with a comma (","), as per the HTTP spec.

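When the individual values of such a joined header are needed again, the caller has to split them. The sketch below shows a naive way to do so. The helper split_joined_header is hypothetical, and it assumes that no value contains a comma of its own, which does not hold for quoted strings or for dates such as those in cookie headers.

```perl
use strict;
use warnings;

# Hypothetical helper: split a comma-joined header value back into its
# parts. Naive on purpose: commas inside quoted strings or dates are
# NOT handled.
sub split_joined_header {
    my ($value) = @_;
    return unless defined $value;

    $value =~ s/^\s+//;
    $value =~ s/\s+$//;

    # drop empty elements such as the one produced by "a,,b"
    return grep { length } split /\s*,\s*/, $value;
}

# Stand-in for a header hash as passed to the callback.
my $hdr = { "vary" => "accept-encoding, user-agent" };
print "$_\n" for split_joined_header($hdr->{"vary"});
```
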
        If an internal error occurs, such as not being able to resolve a
        hostname, then $data will be "undef", "$headers->{Status}" will be
        "59x" (usually 599) and the "Reason" pseudo-header will contain an
        error message.

        A typical callback might look like this:

           sub {
              my ($body, $hdr) = @_;

              if ($hdr->{Status} =~ /^2/) {
                 # ... everything should be ok
              } else {
                 print "error, $hdr->{Status} $hdr->{Reason}\n";
              }
           }

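A callback sometimes needs to tell locally generated errors apart from error responses sent by the server. The following sketch does this using only the "Status" pseudo-header. The helper classify_response and its category names are hypothetical. It treats 598 separately because the on_header and on_body descriptions in this document associate that code with a cancelled download, and it assumes the server itself never sends a 59x status.

```perl
use strict;
use warnings;

# Hypothetical helper: sort a completed request into coarse categories,
# based on the "Status" pseudo-header only.
sub classify_response {
    my ($body, $hdr) = @_;
    my $status = $hdr->{Status};

    return "ok"        if $status =~ /^2/;
    return "cancelled" if $status eq "598";   # on_header/on_body returned false
    return "internal"  if $status =~ /^59/;   # local error, details in Reason
    return "other";                           # any other status from the server
}

# Stand-in arguments, shaped like those the callback receives.
print classify_response(undef, { Status => 599, Reason => "example error" }), "\n";
```
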
        Additional parameters are key-value pairs, and are fully optional.
        They include:

        recurse => $count (default: $MAX_RECURSE)
            Whether to recurse requests or not, e.g. on redirects,
            authentication retries and so on, and how often to do so.

        headers => hashref
            The request headers to use. Currently, "http_request" may
            provide its own "Host:", "Content-Length:", "Connection:" and
            "Cookie:" headers and will provide defaults for "User-Agent:"
            and "Referer:".

        timeout => $seconds
            The timeout to use for various stages - each connect attempt
            will reset the timeout, as will read or write activity. Default
            timeout is 5 minutes.

        proxy => [$host, $port[, $scheme]] or undef
            Use the given http proxy for all requests. If not specified,
            then the default proxy (as specified by $ENV{http_proxy}) is
            used.

            $scheme must be either missing or "http" for HTTP, or "https"
            for HTTPS.

        body => $string
            The request body, usually empty. Will be sent as-is (future
            versions of this module might offer more options).

        cookie_jar => $hash_ref
            Passing this parameter enables (simplified) cookie-processing,
            loosely based on the original Netscape specification.

            The $hash_ref must be an (initially empty) hash reference which
            will get updated automatically. It is possible to save the
            cookie_jar to persistent storage with something like JSON or
            Storable, but this is not recommended, as expiry times are
            currently being ignored.

            Note that this cookie implementation is not of very high
            quality, nor meant to be complete. If you want complete cookie
            management you have to do that on your own. "cookie_jar" is
            meant as a quick fix to get some cookie-using sites working.
            Cookies are a privacy disaster; do not use them unless required
            to.

        tls_ctx => $scheme | $tls_ctx
            Specifies the AnyEvent::TLS context to be used for https
            connections. This parameter follows the same rules as the
            "tls_ctx" parameter to AnyEvent::Handle, but additionally, the
            two strings "low" or "high" can be specified, which give you a
            predefined low-security (no verification, highest compatibility)
            and high-security (CA and common-name verification) TLS context.

            The default for this option is "low", which could be interpreted
            as "give me the page, no matter what".

        on_header => $callback->($headers)
            When specified, this callback will be called with the header
            hash as soon as headers have been successfully received from the
            remote server (not on locally-generated errors).

            It has to return either true (in which case AnyEvent::HTTP will
            continue), or false, in which case AnyEvent::HTTP will cancel
            the download (and call the completion callback with an error
            code of 598).

            This callback is useful, among other things, to quickly reject
            unwanted content, which, if it is supposed to be rare, can be
            faster than first doing a "HEAD" request.

            Example: cancel the request unless the content-type is
            "text/html".

               on_header => sub {
                  $_[0]{"content-type"} =~ /^text\/html\s*(?:;|$)/
               },

        on_body => $callback->($partial_body, $headers)
            When specified, all body data will be passed to this callback
            instead of to the completion callback. The completion callback
            will get the empty string instead of the body data.

            It has to return either true (in which case AnyEvent::HTTP will
            continue), or false, in which case AnyEvent::HTTP will cancel
            the download (and call the completion callback with an error
            code of 598).

            This callback is useful when the data is too large to be held in
            memory (so the callback writes it to a file) or when only some
            information should be extracted, or when the body should be
            processed incrementally.

            It is usually preferred over doing your own body handling via
            "want_body_handle".

        want_body_handle => $enable
            When enabled (default is disabled), the behaviour of
            AnyEvent::HTTP changes considerably: after parsing the headers,
            and instead of downloading the body (if any), the completion
            callback will be called. Instead of the $body argument
            containing the body data, the callback will receive the
            AnyEvent::Handle object associated with the connection. In error
            cases, "undef" will be passed. When there is no body (e.g.
            status 304), the empty string will be passed.

            The handle object might or might not be in TLS mode, might be
            connected to a proxy, might be a persistent connection etc., and
            might be configured in unspecified ways. The user is responsible
            for this handle (it will not be used by this module anymore).

            This is useful with some push-type services, where, after the
            initial headers, an interactive protocol is used (a typical
            example would be the push-style Twitter API which starts a
            JSON/XML stream).

            If you think you need this, first have a look at "on_body", to
            see if that doesn't solve your problem in a better way.

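The "on_body" parameter described above mentions writing large bodies to a file. The following is a minimal sketch of such a callback, under stated assumptions: make_body_writer is a hypothetical helper, and the byte limit is an illustrative extra that is not required by AnyEvent::HTTP. The returned closure follows the documented contract of returning true to continue and false to cancel.

```perl
use strict;
use warnings;

# Hypothetical helper: build an on_body callback that appends every
# chunk to $fh and cancels once more than $max_bytes have arrived.
sub make_body_writer {
    my ($fh, $max_bytes) = @_;
    my $seen = 0;

    return sub {
        my ($partial_body, $headers) = @_;

        $seen += length $partial_body;
        return 0 if $seen > $max_bytes;          # false: cancel the download

        print {$fh} $partial_body or return 0;   # write error: cancel as well
        return 1;                                # true: keep going
    };
}

# Stand-alone demonstration with an in-memory file handle.
my $buffer = "";
open my $out, ">", \$buffer or die "cannot open in-memory handle: $!";
my $on_body = make_body_writer($out, 1024);
$on_body->("first chunk, ", {});
$on_body->("second chunk", {});
close $out;
print "$buffer\n";
```

In a real request the closure would be passed as on_body => make_body_writer($fh, $limit), and the completion callback would then check for status 598 to detect a cancelled download.
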
        Example: make a simple HTTP GET request for http://www.nethype.de/

           http_request GET => "http://www.nethype.de/", sub {
              my ($body, $hdr) = @_;
              print "$body\n";
           };

        Example: make an HTTP HEAD request on https://www.google.com/, use a
        timeout of 30 seconds.

           http_request
              HEAD    => "https://www.google.com",
              timeout => 30,
              sub {
                 my ($body, $hdr) = @_;
                 use Data::Dumper;
                 print Dumper $hdr;
              }
           ;

        Example: make another simple HTTP GET request, but immediately try
        to cancel it.

           my $request = http_request GET => "http://www.nethype.de/", sub {
              my ($body, $hdr) = @_;
              print "$body\n";
           };

           undef $request;

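The "cookie_jar" description above notes that the jar can be saved with something like JSON, while advising against it because expiry times are ignored. For cases where that trade-off is acceptable, the following sketch shows one possible way to do it with the core JSON::PP module. The helpers save_cookie_jar and load_cookie_jar are hypothetical, and the sketch assumes the jar contains only plain hashes, arrays and strings, which this document does not guarantee.

```perl
use strict;
use warnings;
use JSON::PP ();

# One shared encoder: UTF-8 bytes on disk, stable key order.
my $json = JSON::PP->new->utf8->canonical;

# Hypothetical helper: write the jar hash to $path as JSON.
sub save_cookie_jar {
    my ($jar, $path) = @_;
    open my $fh, ">", $path or die "cannot write $path: $!";
    print {$fh} $json->encode($jar) or die "cannot write $path: $!";
    close $fh or die "cannot close $path: $!";
}

# Hypothetical helper: read the jar back, or start with an empty one.
sub load_cookie_jar {
    my ($path) = @_;
    return {} unless -e $path;      # first run: empty jar
    open my $fh, "<", $path or die "cannot read $path: $!";
    local $/;                       # slurp the whole file
    my $text = <$fh>;
    return $json->decode($text);
}
```

The loaded hash reference would be passed as cookie_jar => $jar on each request and saved again once the requests have completed.
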
GLOBAL FUNCTIONS AND VARIABLES
    AnyEvent::HTTP::set_proxy "proxy-url"
        Sets the default proxy server to use. The proxy-url must begin with
        a string of the form "http://host:port" (optionally "https:...").

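The per-request "proxy" parameter takes an array reference rather than a URL. Where both forms are in use, a small conversion helper can be handy. The sketch below is one way to write it. proxy_from_url is a hypothetical helper, not part of AnyEvent::HTTP, and it only handles plain host names and IPv4 addresses with an explicit port.

```perl
use strict;
use warnings;

# Hypothetical helper: split a proxy URL of the form "http://host:port"
# into the [$host, $port, $scheme] triple used by the "proxy" parameter.
# Dies on anything it does not recognise instead of guessing.
sub proxy_from_url {
    my ($url) = @_;

    my ($scheme, $host, $port) = $url =~ m{^(https?)://([^:/]+):(\d+)}i
        or die "unrecognised proxy URL: $url\n";

    return [$host, $port + 0, lc $scheme];
}

my $proxy = proxy_from_url("http://proxy.example:3128/");
print join(" ", @$proxy), "\n";
```

The returned array reference could then be passed as proxy => $proxy to an individual request.
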
    $AnyEvent::HTTP::MAX_RECURSE
        The default value for the "recurse" request parameter (default: 10).

    $AnyEvent::HTTP::USERAGENT
        The default value for the "User-Agent" header (the default is
        "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION;
        +http://software.schmorp.de/pkg/AnyEvent)").

    $AnyEvent::HTTP::MAX_PER_HOST
        The maximum number of concurrent connections to the same host
        (identified by the hostname). If the limit is exceeded, then the
        additional requests are queued until previous connections are
        closed.

        The default value for this is 4, and it is highly advisable to not
        increase it.

    $AnyEvent::HTTP::ACTIVE
        The number of active connections. This is not the number of
        currently running requests, but the number of currently open and
        non-idle TCP connections. This number can be useful for
        load-leveling.

SEE ALSO
    AnyEvent.

AUTHOR
       Marc Lehmann <schmorp@schmorp.de>
       http://home.schmorp.de/

    With many thanks to Дмитрий Шалашов, who provided
    countless test cases and bug reports.
