… | |
… | |
49 | "http_request" returns a "cancellation guard" - you have to keep the |
49 | "http_request" returns a "cancellation guard" - you have to keep the |
50 | object alive at least until the callback gets called. If the object |
50 | object alive at least until the callback gets called. If the object |
51 | gets destroyed before the callback is called, the request will be |
51 | gets destroyed before the callback is called, the request will be |
52 | cancelled. |
52 | cancelled. |
53 | |
53 | |
54 | The callback will be called with the response data as first argument |
54 | The callback will be called with the response body data as first |
55 | (or "undef" if it wasn't available due to errors), and a hash-ref |
55 | argument (or "undef" if an error occurred), and a hash-ref with |
56 | with response headers as second argument. |
56 | response headers as second argument. |
57 | |
57 | |
58 | All the headers in that hash are lowercased. In addition to the |
58 | All the headers in that hash are lowercased. In addition to the |
59 | response headers, the "pseudo-headers" "HTTPVersion", "Status" and |
59 | response headers, the "pseudo-headers" (uppercase to avoid clashing |
|
|
60 | with possible response headers) "HTTPVersion", "Status" and "Reason" |
60 | "Reason" contain the three parts of the HTTP Status-Line of the same |
61 | contain the three parts of the HTTP Status-Line of the same name. |
|
|
62 | |
61 | name. The pseudo-header "URL" contains the original URL (which can |
63 | The pseudo-header "URL" contains the actual URL (which can differ |
62 | differ from the requested URL when following redirects). |
64 | from the requested URL when following redirects - for example, you |
|
|
65 | might get an error that your URL scheme is not supported even though |
|
|
66 | your URL is a valid http URL because it redirected to an ftp URL, in |
|
|
67 | which case you can look at the URL pseudo header). |
63 | |
68 | |
|
|
69 | The pseudo-header "Redirect" only exists when the request was a |
|
|
70 | result of an internal redirect. In that case it is an array |
|
|
71 | reference with the "($data, $headers)" from the redirect response. |
|
|
72 | Note that this response could in turn be the result of a redirect |
|
|
73 | itself, and "$headers->{Redirect}[1]{Redirect}" will then contain |
|
|
74 | the original response, and so on. |
|
|
75 | |
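
The nesting of the "Redirect" pseudo-header described above can be walked with a small loop. The following is only a sketch: "redirect_chain" is a hypothetical helper, not part of the module, and the header hashes are made-up stand-ins for what a completion callback would receive after two internal redirects.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::HTTP): collect the header
# hashes of every response in a redirect chain, final response first,
# by following the "Redirect" pseudo-header.
sub redirect_chain {
    my ($hdr) = @_;
    my @chain = ($hdr);
    while (my $prev = $hdr->{Redirect}) {
        $hdr = $prev->[1];    # $prev is [ $data, $headers ] of the earlier response
        push @chain, $hdr;
    }
    return @chain;
}

# Simulated header hashes with made-up values and URLs.
my $first  = { Status => 301, URL => "http://example.com/a" };
my $second = { Status => 302, URL => "http://example.com/b", Redirect => [ "", $first ] };
my $final  = { Status => 200, URL => "http://example.com/c", Redirect => [ "", $second ] };

my @chain = redirect_chain($final);
print join(" <- ", map { "$_->{Status} $_->{URL}" } @chain), "\n";
```

Inside a real completion callback, the second argument ($hdr) would take the place of $final.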
64 | If the server sends a header multiple lines, then their contents |
76 | If the server sends a header multiple times, then their contents |
65 | will be joined together with "\x00". |
77 | will be joined together with a comma (","), as per the HTTP spec. |
66 | |
78 | |
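
Because repeated headers arrive as a single comma-joined string (in the newer version of the text), a caller that wants the individual values has to split them again. A minimal sketch follows, assuming a header whose values never contain commas themselves; "header_values" is a hypothetical helper and the header hash is simulated. Headers such as "set-cookie" can legitimately contain commas, so this naive split is not safe for them.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::HTTP): return the individual
# values of a comma-joined header. Header names in the hash are lowercased,
# so the lookup key is lowercased too.
sub header_values {
    my ($hdr, $name) = @_;
    my $value = $hdr->{lc $name};
    return () unless defined $value;
    return grep { length } split /\s*,\s*/, $value;
}

# Simulated response headers with made-up values.
my $hdr = { vary => "Accept-Encoding, User-Agent" };

my @vary = header_values($hdr, "Vary");
print scalar(@vary), " values: @vary\n";
```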
67 | If an internal error occurs, such as not being able to resolve a |
79 | If an internal error occurs, such as not being able to resolve a |
68 | hostname, then $data will be "undef", "$headers->{Status}" will be |
80 | hostname, then $data will be "undef", "$headers->{Status}" will be |
69 | 599 and the "Reason" pseudo-header will contain an error message. |
81 | "59x" (usually 599) and the "Reason" pseudo-header will contain an |
|
|
82 | error message. |
70 | |
83 | |
71 | A typical callback might look like this: |
84 | A typical callback might look like this: |
72 | |
85 | |
73 | sub { |
86 | sub { |
74 | my ($body, $hdr) = @_; |
87 | my ($body, $hdr) = @_; |
… | |
… | |
89 | |
102 | |
90 | headers => hashref |
103 | headers => hashref |
91 | The request headers to use. Currently, "http_request" may |
104 | The request headers to use. Currently, "http_request" may |
92 | provide its own "Host:", "Content-Length:", "Connection:" and |
105 | provide its own "Host:", "Content-Length:", "Connection:" and |
93 | "Cookie:" headers and will provide defaults for "User-Agent:" |
106 | "Cookie:" headers and will provide defaults for "User-Agent:" |
94 | and "Referer:". |
107 | and "Referer:" (this can be suppressed by using "undef" for |
|
|
108 | these headers in which case they won't be sent at all). |
95 | |
109 | |
96 | timeout => $seconds |
110 | timeout => $seconds |
97 | The time-out to use for various stages - each connect attempt |
111 | The time-out to use for various stages - each connect attempt |
98 | will reset the timeout, as will read or write activity. Default |
112 | will reset the timeout, as will read or write activity, i.e. |
|
|
113 | this is not an overall timeout. |
|
|
114 | |
99 | timeout is 5 minutes. |
115 | Default timeout is 5 minutes. |
100 | |
116 | |
101 | proxy => [$host, $port[, $scheme]] or undef |
117 | proxy => [$host, $port[, $scheme]] or undef |
102 | Use the given http proxy for all requests. If not specified, |
118 | Use the given http proxy for all requests. If not specified, |
103 | then the default proxy (as specified by $ENV{http_proxy}) is |
119 | then the default proxy (as specified by $ENV{http_proxy}) is |
104 | used. |
120 | used. |
105 | |
121 | |
106 | $scheme must be either missing or "http" for HTTP, or "https" |
122 | $scheme must be either missing, "http" for HTTP or "https" for |
107 | for HTTPS. |
123 | HTTPS. |
108 | |
124 | |
109 | body => $string |
125 | body => $string |
110 | The request body, usually empty. Will be-sent as-is (future |
110 | The request body, usually empty. Will be sent as-is (future |
111 | versions of this module might offer more options). |
127 | versions of this module might offer more options). |
112 | |
128 | |
… | |
… | |
115 | loosely based on the original netscape specification. |
131 | loosely based on the original netscape specification. |
116 | |
132 | |
117 | The $hash_ref must be an (initially empty) hash reference which |
133 | The $hash_ref must be an (initially empty) hash reference which |
118 | will get updated automatically. It is possible to save the |
134 | will get updated automatically. It is possible to save the |
119 | cookie_jar to persistent storage with something like JSON or |
135 | cookie_jar to persistent storage with something like JSON or |
120 | Storable, but this is not recommended, as expire times are |
136 | Storable, but this is not recommended, as expiry times are |
121 | currently being ignored. |
137 | currently being ignored. |
122 | |
138 | |
123 | Note that this cookie implementation is not of very high |
139 | Note that this cookie implementation is not of very high |
124 | quality, nor meant to be complete. If you want complete cookie |
140 | quality, nor meant to be complete. If you want complete cookie |
125 | management you have to do that on your own. "cookie_jar" is |
141 | management you have to do that on your own. "cookie_jar" is |
126 | meant as a quick fix to get some cookie-using sites working. |
142 | meant as a quick fix to get some cookie-using sites working. |
127 | Cookies are a privacy disaster, do not use them unless required |
143 | Cookies are a privacy disaster, do not use them unless required |
128 | to. |
144 | to. |
|
|
145 | |
|
|
146 | tls_ctx => $scheme | $tls_ctx |
|
|
147 | Specifies the AnyEvent::TLS context to be used for https |
|
|
148 | connections. This parameter follows the same rules as the |
|
|
149 | "tls_ctx" parameter to AnyEvent::Handle, but additionally, the |
|
|
150 | two strings "low" or "high" can be specified, which give you a |
|
|
151 | predefined low-security (no verification, highest compatibility) |
|
|
152 | and high-security (CA and common-name verification) TLS context. |
|
|
153 | |
|
|
154 | The default for this option is "low", which could be interpreted |
|
|
155 | as "give me the page, no matter what". |
|
|
156 | |
|
|
157 | on_prepare => $callback->($fh) |
|
|
158 | In rare cases you need to "tune" the socket before it is used to |
|
|
159 | connect (for example, to bind it on a given IP address). This |
|
|
160 | parameter overrides the prepare callback passed to |
|
|
161 | "AnyEvent::Socket::tcp_connect" and behaves exactly the same way |
|
|
162 | (e.g. it has to provide a timeout). See the description for the |
|
|
163 | $prepare_cb argument of "AnyEvent::Socket::tcp_connect" for |
|
|
164 | details. |
|
|
165 | |
|
|
166 | on_header => $callback->($headers) |
|
|
167 | When specified, this callback will be called with the header |
|
|
168 | hash as soon as headers have been successfully received from the |
|
|
169 | remote server (not on locally-generated errors). |
|
|
170 | |
|
|
171 | It has to return either true (in which case AnyEvent::HTTP will |
|
|
172 | continue), or false, in which case AnyEvent::HTTP will cancel |
|
|
173 | the download (and call the finish callback with an error code of |
|
|
174 | 598). |
|
|
175 | |
|
|
176 | This callback is useful, among other things, to quickly reject |
|
|
177 | unwanted content, which, if it is supposed to be rare, can be |
|
|
178 | faster than first doing a "HEAD" request. |
|
|
179 | |
|
|
180 | Example: cancel the request unless the content-type is |
|
|
181 | "text/html". |
|
|
182 | |
|
|
183 | on_header => sub { |
|
|
184 | $_[0]{"content-type"} =~ /^text\/html\s*(?:;|$)/ |
|
|
185 | }, |
|
|
186 | |
|
|
187 | on_body => $callback->($partial_body, $headers) |
|
|
188 | When specified, all body data will be passed to this callback |
|
|
189 | instead of to the completion callback. The completion callback |
|
|
190 | will get the empty string instead of the body data. |
|
|
191 | |
|
|
192 | It has to return either true (in which case AnyEvent::HTTP will |
|
|
193 | continue), or false, in which case AnyEvent::HTTP will cancel |
|
|
194 | the download (and call the completion callback with an error |
|
|
195 | code of 598). |
|
|
196 | |
|
|
197 | This callback is useful when the data is too large to be held in |
|
|
198 | memory (so the callback writes it to a file) or when only some |
|
|
199 | information should be extracted, or when the body should be |
|
|
200 | processed incrementally. |
|
|
201 | |
|
|
202 | It is usually preferred over doing your own body handling via |
|
|
203 | "want_body_handle", but in case of streaming APIs, where HTTP is |
|
|
204 | only used to create a connection, "want_body_handle" is the |
|
|
205 | better alternative, as it allows you to install your own event |
|
|
206 | handler, reducing resource usage. |
|
|
207 | |
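
The true/false return convention shared by "on_header" and "on_body" can be tried without a network connection by driving the callbacks by hand. This is only a sketch: the size limit, the chunks and the header hash are invented for illustration, and in real use the two code references would be passed as the "on_header" and "on_body" parameters of "http_request".

```perl
use strict;
use warnings;

my $max_bytes = 10;    # hypothetical limit, chosen only for this illustration
my $received  = 0;
my $collected = "";

# on_header-style callback: continue only for text/html responses.
my $on_header = sub {
    my ($hdr) = @_;
    return (($hdr->{"content-type"} // "") =~ m{^text/html\s*(?:;|$)}) ? 1 : 0;
};

# on_body-style callback: keep data until the limit is exceeded, then
# return false, which asks for the download to be cancelled.
my $on_body = sub {
    my ($chunk, $hdr) = @_;
    $received += length $chunk;
    return 0 if $received > $max_bytes;
    $collected .= $chunk;
    return 1;
};

# Simulated response: a made-up header hash and three body chunks.
my $hdr = { "content-type" => "text/html; charset=utf-8", Status => 200 };
if ($on_header->($hdr)) {
    for my $chunk ("hello", "world", "!!!") {
        last unless $on_body->($chunk, $hdr);
    }
}
print "kept: $collected ($received bytes seen)\n";
```

With the real module, a false return from either callback would lead to the completion callback being called with status 598, as described above.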
|
|
208 | want_body_handle => $enable |
|
|
209 | When enabled (default is disabled), the behaviour of |
|
|
210 | AnyEvent::HTTP changes considerably: after parsing the headers, |
|
|
211 | and instead of downloading the body (if any), the completion |
|
|
212 | callback will be called. Instead of the $body argument |
|
|
213 | containing the body data, the callback will receive the |
|
|
214 | AnyEvent::Handle object associated with the connection. In error |
|
|
215 | cases, "undef" will be passed. When there is no body (e.g. |
|
|
216 | status 304), the empty string will be passed. |
|
|
217 | |
|
|
218 | The handle object might or might not be in TLS mode, might be |
|
|
219 | connected to a proxy, be a persistent connection etc., and |
|
|
220 | configured in unspecified ways. The user is responsible for this |
|
|
221 | handle (it will not be used by this module anymore). |
|
|
222 | |
|
|
223 | This is useful with some push-type services, where, after the |
|
|
224 | initial headers, an interactive protocol is used (typical |
|
|
225 | example would be the push-style twitter API which starts a |
|
|
226 | JSON/XML stream). |
|
|
227 | |
|
|
228 | If you think you need this, first have a look at "on_body", to |
|
|
229 | see if that doesn't solve your problem in a better way. |
129 | |
230 | |
130 | Example: make a simple HTTP GET request for http://www.nethype.de/ |
231 | Example: make a simple HTTP GET request for http://www.nethype.de/ |
131 | |
232 | |
132 | http_request GET => "http://www.nethype.de/", sub { |
233 | http_request GET => "http://www.nethype.de/", sub { |
133 | my ($body, $hdr) = @_; |
234 | my ($body, $hdr) = @_; |
… | |
… | |
155 | print "$body\n"; |
256 | print "$body\n"; |
156 | }; |
257 | }; |
157 | |
258 | |
158 | undef $request; |
259 | undef $request; |
159 | |
260 | |
|
|
261 | DNS CACHING |
|
|
262 | AnyEvent::HTTP uses the AnyEvent::Socket::tcp_connect function for the |
|
|
263 | actual connection, which in turn uses AnyEvent::DNS to resolve |
|
|
264 | hostnames. The latter is a simple stub resolver and does no caching on |
|
|
265 | its own. If you want DNS caching, you currently have to provide your own |
|
|
266 | default resolver (by storing a suitable resolver object in |
|
|
267 | $AnyEvent::DNS::RESOLVER). |
|
|
268 | |
160 | GLOBAL FUNCTIONS AND VARIABLES |
269 | GLOBAL FUNCTIONS AND VARIABLES |
161 | AnyEvent::HTTP::set_proxy "proxy-url" |
270 | AnyEvent::HTTP::set_proxy "proxy-url" |
162 | Sets the default proxy server to use. The proxy-url must begin with |
271 | Sets the default proxy server to use. The proxy-url must begin with |
163 | a string of the form "http://host:port" (optionally "https:..."). |
272 | a string of the form "http://host:port" (optionally "https:..."), |
|
|
273 | croaks otherwise. |
|
|
274 | |
|
|
275 | To clear an already-set proxy, use "undef". |
164 | |
276 | |
165 | $AnyEvent::HTTP::MAX_RECURSE |
277 | $AnyEvent::HTTP::MAX_RECURSE |
166 | The default value for the "recurse" request parameter (default: 10). |
278 | The default value for the "recurse" request parameter (default: 10). |
167 | |
279 | |
168 | $AnyEvent::HTTP::USERAGENT |
280 | $AnyEvent::HTTP::USERAGENT |
169 | The default value for the "User-Agent" header (the default is |
281 | The default value for the "User-Agent" header (the default is |
170 | "Mozilla/5.0 (compatible; AnyEvent::HTTP/$VERSION; |
282 | "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; |
171 | +http://software.schmorp.de/pkg/AnyEvent)"). |
283 | +http://software.schmorp.de/pkg/AnyEvent)"). |
172 | |
284 | |
173 | $AnyEvent::HTTP::MAX_PERSISTENT |
285 | $AnyEvent::HTTP::MAX_PER_HOST |
174 | The maximum number of persistent connections to keep open (default: |
286 | The maximum number of concurrent connections to the same host |
175 | 8). |
287 | (identified by the hostname). If the limit is exceeded, then the |
|
|
288 | additional requests are queued until previous connections are |
|
|
289 | closed. |
176 | |
290 | |
177 | Not implemented currently. |
291 | The default value for this is 4, and it is highly advisable to not |
178 | |
292 | increase it. |
179 | $AnyEvent::HTTP::PERSISTENT_TIMEOUT |
|
|
180 | The maximum time to cache a persistent connection, in seconds |
|
|
181 | (default: 2). |
|
|
182 | |
|
|
183 | Not implemented currently. |
|
|
184 | |
293 | |
185 | $AnyEvent::HTTP::ACTIVE |
294 | $AnyEvent::HTTP::ACTIVE |
186 | The number of active connections. This is not the number of |
295 | The number of active connections. This is not the number of |
187 | currently running requests, but the number of currently open and |
296 | currently running requests, but the number of currently open and |
188 | non-idle TCP connections. This number can be useful for |
297 | non-idle TCP connections. This number of can be useful for |
… | |
… | |
193 | |
302 | |
194 | AUTHOR |
303 | AUTHOR |
195 | Marc Lehmann <schmorp@schmorp.de> |
304 | Marc Lehmann <schmorp@schmorp.de> |
196 | http://home.schmorp.de/ |
305 | http://home.schmorp.de/ |
197 | |
306 | |
|
|
307 | With many thanks to Дмитрий Шалашов, who provided |
|
|
308 | countless testcases and bugreports. |
|
|
309 | |