ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/AnyEvent-HTTP/HTTP.pm
(Generate patch)

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.29 by root, Wed Oct 22 23:28:11 2008 UTC vs.
Revision 1.40 by root, Sun Jul 5 01:45:01 2009 UTC

41use strict; 41use strict;
42no warnings; 42no warnings;
43 43
44use Carp; 44use Carp;
45 45
46use AnyEvent (); 46use AnyEvent 4.452 ();
47use AnyEvent::Util (); 47use AnyEvent::Util ();
48use AnyEvent::Socket (); 48use AnyEvent::Socket ();
49use AnyEvent::Handle (); 49use AnyEvent::Handle ();
50 50
51use base Exporter::; 51use base Exporter::;
52 52
53our $VERSION = '1.05'; 53our $VERSION = '1.12';
54 54
55our @EXPORT = qw(http_get http_post http_head http_request); 55our @EXPORT = qw(http_get http_post http_head http_request);
56 56
57our $USERAGENT = "Mozilla/5.0 (compatible; AnyEvent::HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)"; 57our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)";
58our $MAX_RECURSE = 10; 58our $MAX_RECURSE = 10;
59our $MAX_PERSISTENT = 8; 59our $MAX_PERSISTENT = 8;
60our $PERSISTENT_TIMEOUT = 2; 60our $PERSISTENT_TIMEOUT = 2;
61our $TIMEOUT = 300; 61our $TIMEOUT = 300;
62 62
104headers, the "pseudo-headers" C<HTTPVersion>, C<Status> and C<Reason> 104headers, the "pseudo-headers" C<HTTPVersion>, C<Status> and C<Reason>
105contain the three parts of the HTTP Status-Line of the same name. The 105contain the three parts of the HTTP Status-Line of the same name. The
106pseudo-header C<URL> contains the original URL (which can differ from the 106pseudo-header C<URL> contains the original URL (which can differ from the
107requested URL when following redirects). 107requested URL when following redirects).
108 108
109If the server sends a header multiple lines, then their contents will be 109If the server sends a header multiple times, then their contents will be
110joined together with C<\x00>. 110joined together with a comma (C<,>), as per the HTTP spec.
111 111
112If an internal error occurs, such as not being able to resolve a hostname, 112If an internal error occurs, such as not being able to resolve a hostname,
113then C<$data> will be C<undef>, C<< $headers->{Status} >> will be C<599> 113then C<$data> will be C<undef>, C<< $headers->{Status} >> will be C<599>
114and the C<Reason> pseudo-header will contain an error message. 114and the C<Reason> pseudo-header will contain an error message.
115 115
165based on the original netscape specification. 165based on the original netscape specification.
166 166
167The C<$hash_ref> must be an (initially empty) hash reference which will 167The C<$hash_ref> must be an (initially empty) hash reference which will
168get updated automatically. It is possible to save the cookie_jar to 168get updated automatically. It is possible to save the cookie_jar to
169persistent storage with something like JSON or Storable, but this is not 169persistent storage with something like JSON or Storable, but this is not
170recommended, as expire times are currently being ignored. 170recommended, as expiry times are currently being ignored.
171 171
172Note that this cookie implementation is not of very high quality, nor 172Note that this cookie implementation is not of very high quality, nor
173meant to be complete. If you want complete cookie management you have to 173meant to be complete. If you want complete cookie management you have to
174do that on your own. C<cookie_jar> is meant as a quick fix to get some 174do that on your own. C<cookie_jar> is meant as a quick fix to get some
175cookie-using sites working. Cookies are a privacy disaster, do not use 175cookie-using sites working. Cookies are a privacy disaster, do not use
176them unless required to. 176them unless required to.
177
178=item tls_ctx => $scheme | $tls_ctx
179
180Specifies the AnyEvent::TLS context to be used for https connections. This
181parameter follows the same rules as the C<tls_ctx> parameter to
182L<AnyEvent::Handle>, but additionally, the two strings C<low> or
183C<high> can be specified, which give you a predefined low-security (no
184verification, highest compatibility) and high-security (CA and common-name
185verification) TLS context.
186
187The default for this option is C<low>, which could be interpreted as "give
188me the page, no matter what".
177 189
178=back 190=back
179 191
180Example: make a simple HTTP GET request for http://www.nethype.de/ 192Example: make a simple HTTP GET request for http://www.nethype.de/
181 193
237 push @{ $CO_SLOT{$_[0]}[1] }, $_[1]; 249 push @{ $CO_SLOT{$_[0]}[1] }, $_[1];
238 250
239 _slot_schedule $_[0]; 251 _slot_schedule $_[0];
240} 252}
241 253
254our $qr_nl = qr<\015?\012>;
255our $qr_nlnl = qr<\015?\012\015?\012>;
256
257our $TLS_CTX_LOW = { cache => 1, dh => undef, sslv2 => 1 };
258our $TLS_CTX_HIGH = { cache => 1, verify => 1, verify_cn => "https", dh => "skip4096" };
259
242sub http_request($$@) { 260sub http_request($$@) {
243 my $cb = pop; 261 my $cb = pop;
244 my ($method, $url, %arg) = @_; 262 my ($method, $url, %arg) = @_;
245 263
246 my %hdr; 264 my %hdr;
265
266 $arg{tls_ctx} = $TLS_CTX_LOW if $arg{tls_ctx} eq "low" || !exists $arg{tls_ctx};
267 $arg{tls_ctx} = $TLS_CTX_HIGH if $arg{tls_ctx} eq "high";
247 268
248 $method = uc $method; 269 $method = uc $method;
249 270
250 if (my $hdr = $arg{headers}) { 271 if (my $hdr = $arg{headers}) {
251 while (my ($k, $v) = each %$hdr) { 272 while (my ($k, $v) = each %$hdr) {
253 } 274 }
254 } 275 }
255 276
256 my $recurse = exists $arg{recurse} ? delete $arg{recurse} : $MAX_RECURSE; 277 my $recurse = exists $arg{recurse} ? delete $arg{recurse} : $MAX_RECURSE;
257 278
258 return $cb->(undef, { Status => 599, Reason => "recursion limit reached", URL => $url }) 279 return $cb->(undef, { Status => 599, Reason => "Too many redirections", URL => $url })
259 if $recurse < 0; 280 if $recurse < 0;
260 281
261 my $proxy = $arg{proxy} || $PROXY; 282 my $proxy = $arg{proxy} || $PROXY;
262 my $timeout = $arg{timeout} || $TIMEOUT; 283 my $timeout = $arg{timeout} || $TIMEOUT;
263 284
264 $hdr{"user-agent"} ||= $USERAGENT; 285 $hdr{"user-agent"} ||= $USERAGENT;
265 286
266 my ($scheme, $authority, $upath, $query, $fragment) = 287 my ($uscheme, $uauthority, $upath, $query, $fragment) =
267 $url =~ m|(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|; 288 $url =~ m|(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|;
268 289
269 $scheme = lc $scheme; 290 $uscheme = lc $uscheme;
270 291
271 my $uport = $scheme eq "http" ? 80 292 my $uport = $uscheme eq "http" ? 80
272 : $scheme eq "https" ? 443 293 : $uscheme eq "https" ? 443
273 : return $cb->(undef, { Status => 599, Reason => "only http and https URL schemes supported", URL => $url }); 294 : return $cb->(undef, { Status => 599, Reason => "Only http and https URL schemes supported (not '$uscheme')", URL => $url });
274 295
275 $hdr{referer} ||= "$scheme://$authority$upath"; # leave out fragment and query string, just a heuristic 296 $hdr{referer} ||= "$uscheme://$uauthority$upath"; # leave out fragment and query string, just a heuristic
276 297
277 $authority =~ /^(?: .*\@ )? ([^\@:]+) (?: : (\d+) )?$/x 298 $uauthority =~ /^(?: .*\@ )? ([^\@:]+) (?: : (\d+) )?$/x
278 or return $cb->(undef, { Status => 599, Reason => "unparsable URL", URL => $url }); 299 or return $cb->(undef, { Status => 599, Reason => "Unparsable URL", URL => $url });
279 300
280 my $uhost = $1; 301 my $uhost = $1;
281 $uport = $2 if defined $2; 302 $uport = $2 if defined $2;
282 303
283 $uhost =~ s/^\[(.*)\]$/$1/; 304 $uhost =~ s/^\[(.*)\]$/$1/;
285 306
286 $upath =~ s%^/?%/%; 307 $upath =~ s%^/?%/%;
287 308
288 # cookie processing 309 # cookie processing
289 if (my $jar = $arg{cookie_jar}) { 310 if (my $jar = $arg{cookie_jar}) {
290 %$jar = () if $jar->{version} < 1; 311 %$jar = () if $jar->{version} != 1;
291 312
292 my @cookie; 313 my @cookie;
293 314
294 while (my ($chost, $v) = each %$jar) { 315 while (my ($chost, $v) = each %$jar) {
316 if ($chost =~ /^\./) {
295 next unless $chost eq substr $uhost, -length $chost; 317 next unless $chost eq substr $uhost, -length $chost;
296 next unless $chost =~ /^\./; 318 } elsif ($chost =~ /\./) {
319 next unless $chost eq $uhost;
320 } else {
321 next;
322 }
297 323
298 while (my ($cpath, $v) = each %$v) { 324 while (my ($cpath, $v) = each %$v) {
299 next unless $cpath eq substr $upath, 0, length $cpath; 325 next unless $cpath eq substr $upath, 0, length $cpath;
300 326
301 while (my ($k, $v) = each %$v) { 327 while (my ($k, $v) = each %$v) {
302 next if $scheme ne "https" && exists $v->{secure}; 328 next if $uscheme ne "https" && exists $v->{secure};
329 my $value = $v->{value};
330 $value =~ s/([\\"])/\\$1/g;
303 push @cookie, "$k=$v->{value}"; 331 push @cookie, "$k=\"$value\"";
304 } 332 }
305 } 333 }
306 } 334 }
307 335
308 $hdr{cookie} = join "; ", @cookie 336 $hdr{cookie} = join "; ", @cookie
309 if @cookie; 337 if @cookie;
310 } 338 }
311 339
312 my ($rhost, $rport, $rpath); # request host, port, path 340 my ($rhost, $rport, $rscheme, $rpath); # request host, port, path
313 341
314 if ($proxy) { 342 if ($proxy) {
315 ($rhost, $rport, $scheme) = @$proxy; 343 ($rpath, $rhost, $rport, $rscheme) = ($url, @$proxy);
316 $rpath = $url; 344
345 # don't support https requests over https-proxy transport,
346 # can't be done with tls as spec'ed, unless you double-encrypt.
347 $rscheme = "http" if $uscheme eq "https" && $rscheme eq "https";
317 } else { 348 } else {
318 ($rhost, $rport, $rpath) = ($uhost, $uport, $upath); 349 ($rhost, $rport, $rscheme, $rpath) = ($uhost, $uport, $uscheme, $upath);
319 $hdr{host} = $uhost;
320 } 350 }
321 351
352 $hdr{host} = $uhost;
322 $hdr{"content-length"} = length $arg{body}; 353 $hdr{"content-length"} = length $arg{body};
323 354
324 my %state = (connect_guard => 1); 355 my %state = (connect_guard => 1);
325 356
326 _get_slot $uhost, sub { 357 _get_slot $uhost, sub {
328 359
329 return unless $state{connect_guard}; 360 return unless $state{connect_guard};
330 361
331 $state{connect_guard} = AnyEvent::Socket::tcp_connect $rhost, $rport, sub { 362 $state{connect_guard} = AnyEvent::Socket::tcp_connect $rhost, $rport, sub {
332 $state{fh} = shift 363 $state{fh} = shift
333 or return $cb->(undef, { Status => 599, Reason => "$!", URL => $url }); 364 or return (%state = (), $cb->(undef, { Status => 599, Reason => "$!", URL => $url }));
365 pop; # free memory, save a tree
334 366
335 delete $state{connect_guard}; # reduce memory usage, save a tree 367 return unless delete $state{connect_guard};
336 368
337 # get handle 369 # get handle
338 $state{handle} = new AnyEvent::Handle 370 $state{handle} = new AnyEvent::Handle
339 fh => $state{fh}, 371 fh => $state{fh},
340 ($scheme eq "https" ? (tls => "connect") : ()); 372 timeout => $timeout,
373 peername => $rhost,
374 tls_ctx => $arg{tls_ctx};
341 375
342 # limit the number of persistent connections 376 # limit the number of persistent connections
377 # keepalive not yet supported
343 if ($KA_COUNT{$_[1]} < $MAX_PERSISTENT_PER_HOST) { 378 if ($KA_COUNT{$_[1]} < $MAX_PERSISTENT_PER_HOST) {
344 ++$KA_COUNT{$_[1]}; 379 ++$KA_COUNT{$_[1]};
345 $state{handle}{ka_count_guard} = AnyEvent::Util::guard { --$KA_COUNT{$_[1]} }; 380 $state{handle}{ka_count_guard} = AnyEvent::Util::guard {
381 --$KA_COUNT{$_[1]}
382 };
346 $hdr{connection} = "keep-alive"; 383 $hdr{connection} = "keep-alive";
347 delete $hdr{connection}; # keep-alive not yet supported
348 } else { 384 } else {
349 delete $hdr{connection}; 385 delete $hdr{connection};
350 } 386 }
351 387
352 # (re-)configure handle 388 # (re-)configure handle
353 $state{handle}->timeout ($timeout);
354 $state{handle}->on_error (sub { 389 $state{handle}->on_error (sub {
355 my $errno = "$!";
356 %state = (); 390 %state = ();
357 $cb->(undef, { Status => 599, Reason => $errno, URL => $url }); 391 $cb->(undef, { Status => 599, Reason => $_[2], URL => $url });
358 }); 392 });
359 $state{handle}->on_eof (sub { 393 $state{handle}->on_eof (sub {
360 %state = (); 394 %state = ();
361 $cb->(undef, { Status => 599, Reason => "unexpected end-of-file", URL => $url }); 395 $cb->(undef, { Status => 599, Reason => "Unexpected end-of-file", URL => $url });
362 }); 396 });
363 397
398 $state{handle}->starttls ("connect") if $rscheme eq "https";
399
400 # handle actual, non-tunneled, request
401 my $handle_actual_request = sub {
402 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls};
403
364 # send request 404 # send request
365 $state{handle}->push_write ( 405 $state{handle}->push_write (
366 "$method $rpath HTTP/1.0\015\012" 406 "$method $rpath HTTP/1.0\015\012"
367 . (join "", map "$_: $hdr{$_}\015\012", keys %hdr) 407 . (join "", map "\u$_: $hdr{$_}\015\012", keys %hdr)
368 . "\015\012" 408 . "\015\012"
369 . (delete $arg{body}) 409 . (delete $arg{body})
370 );
371
372 %hdr = (); # reduce memory usage, save a kitten
373
374 # status line
375 $state{handle}->push_read (line => qr/\015?\012/, sub {
376 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
377 or return (%state = (), $cb->(undef, { Status => 599, Reason => "invalid server response ($_[1])", URL => $url }));
378
379 my %hdr = ( # response headers
380 HTTPVersion => "\x00$1",
381 Status => "\x00$2",
382 Reason => "\x00$3",
383 URL => "\x00$url"
384 ); 410 );
385 411
412 %hdr = (); # reduce memory usage, save a kitten
413
414 # status line
415 $state{handle}->push_read (line => $qr_nl, sub {
416 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
417 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Invalid server response ($_[1])", URL => $url }));
418
419 my %hdr = ( # response headers
420 HTTPVersion => ",$1",
421 Status => ",$2",
422 Reason => ",$3",
423 URL => ",$url"
424 );
425
386 # headers, could be optimized a bit 426 # headers, could be optimized a bit
387 $state{handle}->unshift_read (line => qr/\015?\012\015?\012/, sub { 427 $state{handle}->unshift_read (line => $qr_nlnl, sub {
388 for ("$_[1]\012") { 428 for ("$_[1]\012") {
389 # we support spaces in field names, as lotus domino 429 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
390 # creates them. 430
431 # things seen, not parsed:
432 # p3pP="NON CUR OTPi OUR NOR UNI"
433
391 $hdr{lc $1} .= "\x00$2" 434 $hdr{lc $1} .= ",$2"
392 while /\G 435 while /\G
393 ([^:\000-\037]+): 436 ([^:\000-\037]+):
394 [\011\040]* 437 [\011\040]*
395 ((?: [^\015\012]+ | \015?\012[\011\040] )*) 438 ((?: [^\012]+ | \012[\011\040] )*)
396 \015?\012 439 \012
397 /gxc; 440 /gxc;
398 441
399 /\G$/ 442 /\G$/
400 or return (%state = (), $cb->(undef, { Status => 599, Reason => "garbled response headers", URL => $url })); 443 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Garbled response headers", URL => $url }));
401 } 444 }
402 445
403 substr $_, 0, 1, "" 446 substr $_, 0, 1, ""
404 for values %hdr; 447 for values %hdr;
405 448
406 my $finish = sub { 449 my $finish = sub {
450 $state{handle}->destroy;
407 %state = (); 451 %state = ();
408 452
409 # set-cookie processing 453 # set-cookie processing
410 if ($arg{cookie_jar} && exists $hdr{"set-cookie"}) { 454 if ($arg{cookie_jar}) {
411 for (split /\x00/, $hdr{"set-cookie"}) { 455 for ($hdr{"set-cookie"}) {
412 my ($cookie, @arg) = split /;\s*/; 456 # parse NAME=VALUE
413 my ($name, $value) = split /=/, $cookie, 2; 457 my @kv;
414 my %kv = (value => $value, map { split /=/, $_, 2 } @arg); 458
415 459 while (/\G\s* ([^=;,[:space:]]+) \s*=\s* (?: "((?:[^\\"]+|\\.)*)" | ([^=;,[:space:]]*) )/gcxs) {
416 my $cdom = (delete $kv{domain}) || $uhost; 460 my $name = $1;
461 my $value = $3;
462
463 unless ($value) {
464 $value = $2;
465 $value =~ s/\\(.)/$1/gs;
466 }
467
468 push @kv, $name => $value;
469
470 last unless /\G\s*;/gc;
471 }
472
473 last unless @kv;
474
475 my $name = shift @kv;
476 my %kv = (value => shift @kv, @kv);
477
478 my $cdom;
417 my $cpath = (delete $kv{path}) || "/"; 479 my $cpath = (delete $kv{path}) || "/";
418 480
481 if (exists $kv{domain}) {
482 $cdom = delete $kv{domain};
483
419 $cdom =~ s/^\.?/./; # make sure it starts with a "." 484 $cdom =~ s/^\.?/./; # make sure it starts with a "."
420 485
421 next if $cdom =~ /\.$/; 486 next if $cdom =~ /\.$/;
422 487
423 # this is not rfc-like and not netscape-like. go figure. 488 # this is not rfc-like and not netscape-like. go figure.
424 my $ndots = $cdom =~ y/.//; 489 my $ndots = $cdom =~ y/.//;
425 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2); 490 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2);
426 491 } else {
492 $cdom = $uhost;
493 }
494
427 # store it 495 # store it
428 $arg{cookie_jar}{version} = 1; 496 $arg{cookie_jar}{version} = 1;
429 $arg{cookie_jar}{$cdom}{$cpath}{$name} = \%kv; 497 $arg{cookie_jar}{$cdom}{$cpath}{$name} = \%kv;
498
499 redo if /\G\s*,/gc;
500 }
501 }
502
503 # microsoft and other shitheads don't give a shit for following standards,
504 # try to support some common forms of broken Location headers.
505 if ($_[1]{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
506 $_[1]{location} =~ s/^\.\/+//;
507
508 my $url = "$rscheme://$uhost:$uport";
509
510 unless ($_[1]{location} =~ s/^\///) {
511 $url .= $upath;
512 $url =~ s/\/[^\/]*$//;
513 }
514
515 $_[1]{location} = "$url/$_[1]{location}";
516 }
517
518 if ($_[1]{Status} =~ /^30[12]$/ && $recurse && $method ne "POST") {
519 # apparently, mozilla et al. just change POST to GET here
520 # more research is needed before we do the same
521 http_request ($method => $_[1]{location}, %arg, recurse => $recurse - 1, $cb);
522 } elsif ($_[1]{Status} == 303 && $recurse) {
523 # even http/1.1 is unclear on how to mutate the method
524 $method = "GET" unless $method eq "HEAD";
525 http_request ($method => $_[1]{location}, %arg, recurse => $recurse - 1, $cb);
526 } elsif ($_[1]{Status} == 307 && $recurse && $method =~ /^(?:GET|HEAD)$/) {
527 http_request ($method => $_[1]{location}, %arg, recurse => $recurse - 1, $cb);
528 } else {
529 $cb->($_[0], $_[1]);
530 }
531 };
532
533 if ($hdr{Status} =~ /^(?:1..|204|304)$/ or $method eq "HEAD") {
534 $finish->(undef, \%hdr);
535 } else {
536 if (exists $hdr{"content-length"}) {
537 $_[0]->unshift_read (chunk => $hdr{"content-length"}, sub {
538 # could cache persistent connection now
539 if ($hdr{connection} =~ /\bkeep-alive\b/i) {
540 # but we don't, due to misdesigns, this is annoyingly complex
541 };
542
543 $finish->($_[1], \%hdr);
544 });
545 } else {
546 # too bad, need to read until we get an error or EOF,
547 # no way to detect winged data.
548 $_[0]->on_error (sub {
549 # delete ought to be more efficient, as we would have to make
550 # a copy otherwise as $_[0] gets destroyed.
551 $finish->(delete $_[0]{rbuf}, \%hdr);
552 });
553 $_[0]->on_eof (undef);
554 $_[0]->on_read (sub { });
430 } 555 }
431 } 556 }
432
433 # microsoft and other shitheads don't give a shit for following standards,
434 # try to support some common forms of broken Location headers.
435 if ($_[1]{location} !~ /^(?: $ | [^:\/?\#]+ : )/x) {
436 $_[1]{location} =~ s/^\.\/+//;
437
438 my $url = "$scheme://$uhost:$uport";
439
440 unless ($_[1]{location} =~ s/^\///) {
441 $url .= $upath;
442 $url =~ s/\/[^\/]*$//;
443 }
444
445 $_[1]{location} = "$url/$_[1]{location}";
446 }
447
448 if ($_[1]{Status} =~ /^30[12]$/ && $recurse && $method ne "POST") {
449 # apparently, mozilla et al. just change POST to GET here
450 # more research is needed before we do the same
451 http_request ($method, $_[1]{location}, %arg, recurse => $recurse - 1, $cb);
452 } elsif ($_[1]{Status} == 303 && $recurse) {
453 # even http/1.1 is unlear on how to mutate the method
454 $method = "GET" unless $method eq "HEAD";
455 http_request ($method => $_[1]{location}, %arg, recurse => $recurse - 1, $cb);
456 } elsif ($_[1]{Status} == 307 && $recurse && $method =~ /^(?:GET|HEAD)$/) {
457 http_request ($method => $_[1]{location}, %arg, recurse => $recurse - 1, $cb);
458 } else {
459 $cb->($_[0], $_[1]);
460 }
461 }; 557 });
558 });
559 };
462 560
463 if ($hdr{Status} =~ /^(?:1..|204|304)$/ or $method eq "HEAD") { 561 # now handle proxy-CONNECT method
464 $finish->(undef, \%hdr); 562 if ($proxy && $uscheme eq "https") {
563 # oh dear, we have to wrap it into a connect request
564
565 # maybe re-use $uauthority with patched port?
566 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012Host: $uhost\015\012\015\012");
567 $state{handle}->push_read (line => $qr_nlnl, sub {
568 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
569 or return (%state = (), $cb->(undef, { Status => 599, Reason => "Invalid proxy connect response ($_[1])", URL => $url }));
570
571 if ($2 == 200) {
572 $rpath = $upath;
573 &$handle_actual_request;
465 } else { 574 } else {
466 if (exists $hdr{"content-length"}) { 575 %state = ();
467 $_[0]->unshift_read (chunk => $hdr{"content-length"}, sub { 576 $cb->(undef, { Status => $2, Reason => $3, URL => $url });
468 # could cache persistent connection now
469 if ($hdr{connection} =~ /\bkeep-alive\b/i) {
470 # but we don't, due to misdesigns, this is annoyingly complex
471 };
472
473 $finish->($_[1], \%hdr);
474 });
475 } else {
476 # too bad, need to read until we get an error or EOF,
477 # no way to detect winged data.
478 $_[0]->on_error (sub {
479 $finish->($_[0]{rbuf}, \%hdr);
480 });
481 $_[0]->on_eof (undef);
482 $_[0]->on_read (sub { });
483 }
484 } 577 }
485 }); 578 });
579 } else {
580 &$handle_actual_request;
486 }); 581 }
582
487 }, sub { 583 }, sub {
488 $timeout 584 $timeout
489 }; 585 };
490 }; 586 };
491 587
524The default value for the C<recurse> request parameter (default: C<10>). 620The default value for the C<recurse> request parameter (default: C<10>).
525 621
526=item $AnyEvent::HTTP::USERAGENT 622=item $AnyEvent::HTTP::USERAGENT
527 623
528The default value for the C<User-Agent> header (the default is 624The default value for the C<User-Agent> header (the default is
529C<Mozilla/5.0 (compatible; AnyEvent::HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>). 625C<Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>).
530 626
531=item $AnyEvent::HTTP::MAX_PERSISTENT 627=item $AnyEvent::HTTP::MAX_PERSISTENT
532 628
533The maximum number of persistent connections to keep open (default: 8). 629The maximum number of persistent connections to keep open (default: 8).
534 630
564=head1 AUTHOR 660=head1 AUTHOR
565 661
566 Marc Lehmann <schmorp@schmorp.de> 662 Marc Lehmann <schmorp@schmorp.de>
567 http://home.schmorp.de/ 663 http://home.schmorp.de/
568 664
665With many thanks to Дмитрий Шалашов, who provided countless
666testcases and bugreports.
667
569=cut 668=cut
570 669
5711 6701
572 671

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines