ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/AnyEvent-HTTP/HTTP.pm
(Generate patch)

Comparing AnyEvent-HTTP/HTTP.pm (file contents):
Revision 1.91 by root, Mon Jan 3 01:03:29 2011 UTC vs.
Revision 1.92 by root, Tue Jan 4 08:17:59 2011 UTC

46use AnyEvent::Util (); 46use AnyEvent::Util ();
47use AnyEvent::Handle (); 47use AnyEvent::Handle ();
48 48
49use base Exporter::; 49use base Exporter::;
50 50
51our $VERSION = '1.5'; 51our $VERSION = '2.0';
52 52
53our @EXPORT = qw(http_get http_post http_head http_request); 53our @EXPORT = qw(http_get http_post http_head http_request);
54 54
55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)"; 55our $USERAGENT = "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)";
56our $MAX_RECURSE = 10; 56our $MAX_RECURSE = 10;
57our $MAX_PERSISTENT = 8;
58our $PERSISTENT_TIMEOUT = 2; 57our $PERSISTENT_TIMEOUT = 3;
59our $TIMEOUT = 300; 58our $TIMEOUT = 300;
60 59our $MAX_PER_HOST = 4; # changing this is evil
61# changing these is evil
62our $MAX_PERSISTENT_PER_HOST = 2;
63our $MAX_PER_HOST = 4;
64 60
65our $PROXY; 61our $PROXY;
66our $ACTIVE = 0; 62our $ACTIVE = 0;
67 63
68my %KA_COUNT; # number of open keep-alive connections per host 64my %KA_CACHE; # indexed by uhost currently, points to [$handle...] array
69my %CO_SLOT; # number of open connections, and wait queue, per host 65my %CO_SLOT; # number of open connections, and wait queue, per host
70 66
71=item http_get $url, key => value..., $cb->($data, $headers) 67=item http_get $url, key => value..., $cb->($data, $headers)
72 68
73Executes an HTTP-GET request. See the http_request function for details on 69Executes an HTTP-GET request. See the http_request function for details on
186=item proxy => [$host, $port[, $scheme]] or undef 182=item proxy => [$host, $port[, $scheme]] or undef
187 183
188Use the given http proxy for all requests. If not specified, then the 184Use the given http proxy for all requests. If not specified, then the
189default proxy (as specified by C<$ENV{http_proxy}>) is used. 185default proxy (as specified by C<$ENV{http_proxy}>) is used.
190 186
191C<$scheme> must be either missing, C<http> for HTTP or C<https> for 187C<$scheme> must be either missing or must be C<http> for HTTP.
192HTTPS.
193 188
194=item body => $string 189=item body => $string
195 190
196The request body, usually empty. Will be sent as-is (future versions of 191The request body, usually empty. Will be sent as-is (future versions of
197this module might offer more options). 192this module might offer more options).
227verification, highest compatibility) and high-security (CA and common-name 222verification, highest compatibility) and high-security (CA and common-name
228verification) TLS context. 223verification) TLS context.
229 224
230The default for this option is C<low>, which could be interpreted as "give 225The default for this option is C<low>, which could be interpreted as "give
231me the page, no matter what". 226me the page, no matter what".
227
228See also the C<sessionid> parameter.
229
230=item session => $string
231
232The module might reuse connections to the same host internally. Sometimes
233(e.g. when using TLS), you do not want to reuse connections from other
234sessions. This can be achieved by setting this parameter to some unique
235ID (such as the address of an object storing your state data, or the TLS
236context) - only connections using the same unique ID will be reused.
232 237
233=item on_prepare => $callback->($fh) 238=item on_prepare => $callback->($fh)
234 239
235In rare cases you need to "tune" the socket before it is used to 240In rare cases you need to "tune" the socket before it is used to
236connect (for exmaple, to bind it on a given IP address). This parameter 241connect (for exmaple, to bind it on a given IP address). This parameter
306called. Instead of the C<$body> argument containing the body data, the 311called. Instead of the C<$body> argument containing the body data, the
307callback will receive the L<AnyEvent::Handle> object associated with the 312callback will receive the L<AnyEvent::Handle> object associated with the
308connection. In error cases, C<undef> will be passed. When there is no body 313connection. In error cases, C<undef> will be passed. When there is no body
309(e.g. status C<304>), the empty string will be passed. 314(e.g. status C<304>), the empty string will be passed.
310 315
311The handle object might or might not be in TLS mode, might be connected to 316The handle object might or might not be in TLS mode, might be connected
312a proxy, be a persistent connection etc., and configured in unspecified 317to a proxy, be a persistent connection, use chunked transfer encoding
313ways. The user is responsible for this handle (it will not be used by this 318etc., and configured in unspecified ways. The user is responsible for this
314module anymore). 319handle (it will not be used by this module anymore).
315 320
316This is useful with some push-type services, where, after the initial 321This is useful with some push-type services, where, after the initial
317headers, an interactive protocol is used (typical example would be the 322headers, an interactive protocol is used (typical example would be the
318push-style twitter API which starts a JSON/XML stream). 323push-style twitter API which starts a JSON/XML stream).
319 324
320If you think you need this, first have a look at C<on_body>, to see if 325If you think you need this, first have a look at C<on_body>, to see if
321that doesn't solve your problem in a better way. 326that doesn't solve your problem in a better way.
327
328=item persistent => $boolean
329
330Try to create/reuse a persistent connection. When this flag is set
331(default: true for idempotent requests, false for all others), then
332C<http_request> tries to re-use an existing (previously-created)
333persistent connection to the host and, failing that, tries to create a new
334one.
335
336Requests failing in certain ways will be automatically retried once, which
337is dangerous for non-idempotent requests, which is why it defaults to off
338for them. The reason for this is because the bozos who designed HTTP/1.1
339made it impossible to distinguish between a fatal error and a normal
340connection timeout, so you never know whether there was a problem with
341your request or not.
342
343When reusing an existent connection, many parameters (such as TLS context)
344will be ignored. See the C<session> parameter for a workaround.
345
346=item keepalive => $boolean
347
348Only used when C<persistent> is also true. This parameter decides whether
349C<http_request> tries to handshake a HTTP/1.0-style keep-alive connection
350(as opposed to only a HTTP/1.1 persistent connection).
351
352The default is true, except when using a proxy, in which case it defaults
353to false, as HTTP/1.0 proxies cannot support this in a meaningful way.
354
355=item handle_params => { key => value ... }
356
357The key-value pairs in this hash will be passed to any L<AnyEvent::Handle>
358constructor that is called - not all requests will create a handle, and
359sometimes more than one is created, so this parameter is only good for
360setting hints.
361
362Example: set the maximum read size to 4096, to potentially conserve memory
363at the cost of speed.
364
365 handle_params => {
366 max_read_size => 4096,
367 },
322 368
323=back 369=back
324 370
325Example: do a simple HTTP GET request for http://www.nethype.de/ and print 371Example: do a simple HTTP GET request for http://www.nethype.de/ and print
326the response body. 372the response body.
353 }; 399 };
354 400
355 undef $request; 401 undef $request;
356 402
357=cut 403=cut
404
405#############################################################################
406# wait queue/slots
358 407
359sub _slot_schedule; 408sub _slot_schedule;
360sub _slot_schedule($) { 409sub _slot_schedule($) {
361 my $host = shift; 410 my $host = shift;
362 411
385 434
386 _slot_schedule $_[0]; 435 _slot_schedule $_[0];
387} 436}
388 437
389############################################################################# 438#############################################################################
439# cookie handling
390 440
391# expire cookies 441# expire cookies
392sub cookie_jar_expire($;$) { 442sub cookie_jar_expire($;$) {
393 my ($jar, $session_end) = @_; 443 my ($jar, $session_end) = @_;
394 444
418 } 468 }
419} 469}
420 470
421# extract cookies from jar 471# extract cookies from jar
422sub cookie_jar_extract($$$$) { 472sub cookie_jar_extract($$$$) {
423 my ($jar, $uscheme, $uhost, $upath) = @_; 473 my ($jar, $scheme, $host, $path) = @_;
424 474
425 %$jar = () if $jar->{version} != 1; 475 %$jar = () if $jar->{version} != 1;
426 476
427 my @cookies; 477 my @cookies;
428 478
429 while (my ($chost, $paths) = each %$jar) { 479 while (my ($chost, $paths) = each %$jar) {
430 next unless ref $paths; 480 next unless ref $paths;
431 481
432 if ($chost =~ /^\./) { 482 if ($chost =~ /^\./) {
433 next unless $chost eq substr $uhost, -length $chost; 483 next unless $chost eq substr $host, -length $chost;
434 } elsif ($chost =~ /\./) { 484 } elsif ($chost =~ /\./) {
435 next unless $chost eq $uhost; 485 next unless $chost eq $host;
436 } else { 486 } else {
437 next; 487 next;
438 } 488 }
439 489
440 while (my ($cpath, $cookies) = each %$paths) { 490 while (my ($cpath, $cookies) = each %$paths) {
441 next unless $cpath eq substr $upath, 0, length $cpath; 491 next unless $cpath eq substr $path, 0, length $cpath;
442 492
443 while (my ($cookie, $kv) = each %$cookies) { 493 while (my ($cookie, $kv) = each %$cookies) {
444 next if $uscheme ne "https" && exists $kv->{secure}; 494 next if $scheme ne "https" && exists $kv->{secure};
445 495
446 if (exists $kv->{_expires} and AE::now > $kv->{_expires}) { 496 if (exists $kv->{_expires} and AE::now > $kv->{_expires}) {
447 delete $cookies->{$cookie}; 497 delete $cookies->{$cookie};
448 next; 498 next;
449 } 499 }
463 \@cookies 513 \@cookies
464} 514}
465 515
466# parse set_cookie header into jar 516# parse set_cookie header into jar
467sub cookie_jar_set_cookie($$$$) { 517sub cookie_jar_set_cookie($$$$) {
468 my ($jar, $set_cookie, $uhost, $date) = @_; 518 my ($jar, $set_cookie, $host, $date) = @_;
469 519
470 my $anow = int AE::now; 520 my $anow = int AE::now;
471 my $snow; # server-now 521 my $snow; # server-now
472 522
473 for ($set_cookie) { 523 for ($set_cookie) {
529 579
530 # this is not rfc-like and not netscape-like. go figure. 580 # this is not rfc-like and not netscape-like. go figure.
531 my $ndots = $cdom =~ y/.//; 581 my $ndots = $cdom =~ y/.//;
532 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2); 582 next if $ndots < ($cdom =~ /\.[^.][^.]\.[^.][^.]$/ ? 3 : 2);
533 } else { 583 } else {
534 $cdom = $uhost; 584 $cdom = $host;
535 } 585 }
536 586
537 # store it 587 # store it
538 $jar->{version} = 1; 588 $jar->{version} = 1;
539 $jar->{lc $cdom}{$cpath}{$name} = \%kv; 589 $jar->{lc $cdom}{$cpath}{$name} = \%kv;
540 590
541 redo if /\G\s*,/gc; 591 redo if /\G\s*,/gc;
542 } 592 }
543} 593}
544 594
595#############################################################################
596# keepalive/persistent connection cache
597
598# fetch a connection from the keepalive cache
599sub ka_fetch($) {
600 my $ka_key = shift;
601
602 my $hdl = pop @{ $KA_CACHE{$ka_key} }; # currently we reuse the MOST RECENTLY USED connection
603 delete $KA_CACHE{$ka_key}
604 unless @{ $KA_CACHE{$ka_key} };
605
606 $hdl
607}
608
609sub ka_store($$) {
610 my ($ka_key, $hdl) = @_;
611
612 my $kaa = $KA_CACHE{$ka_key} ||= [];
613
614 my $destroy = sub {
615 my @ka = grep $_ != $hdl, @{ $KA_CACHE{$ka_key} };
616
617 $hdl->destroy;
618
619 @ka
620 ? $KA_CACHE{$ka_key} = \@ka
621 : delete $KA_CACHE{$ka_key};
622 };
623
624 # on error etc., destroy
625 $hdl->on_error ($destroy);
626 $hdl->on_eof ($destroy);
627 $hdl->on_read ($destroy);
628 $hdl->timeout ($PERSISTENT_TIMEOUT);
629
630 push @$kaa, $hdl;
631 shift @$kaa while @$kaa > $MAX_PER_HOST;
632}
633
634#############################################################################
635# utilities
636
545# continue to parse $_ for headers and place them into the arg 637# continue to parse $_ for headers and place them into the arg
546sub parse_hdr() { 638sub _parse_hdr() {
547 my %hdr; 639 my %hdr;
548 640
549 # things seen, not parsed: 641 # things seen, not parsed:
550 # p3pP="NON CUR OTPi OUR NOR UNI" 642 # p3pP="NON CUR OTPi OUR NOR UNI"
551 643
565 for values %hdr; 657 for values %hdr;
566 658
567 \%hdr 659 \%hdr
568} 660}
569 661
662#############################################################################
663# http_get
664
570our $qr_nlnl = qr{(?<![^\012])\015?\012}; 665our $qr_nlnl = qr{(?<![^\012])\015?\012};
571 666
572our $TLS_CTX_LOW = { cache => 1, sslv2 => 1 }; 667our $TLS_CTX_LOW = { cache => 1, sslv2 => 1 };
573our $TLS_CTX_HIGH = { cache => 1, verify => 1, verify_peername => "https" }; 668our $TLS_CTX_HIGH = { cache => 1, verify => 1, verify_peername => "https" };
669
670# maybe it should just become a normal object :/
671
672sub _destroy_state(\%) {
673 my ($state) = @_;
674
675 $state->{handle}->destroy if $state->{handle};
676 %$state = ();
677}
678
679sub _error(\%$$) {
680 my ($state, $cb, $hdr) = @_;
681
682 &_destroy_state ($state);
683
684 $cb->(undef, $hdr);
685 ()
686}
574 687
575sub http_request($$@) { 688sub http_request($$@) {
576 my $cb = pop; 689 my $cb = pop;
577 my ($method, $url, %arg) = @_; 690 my ($method, $url, %arg) = @_;
578 691
599 if $recurse < 0; 712 if $recurse < 0;
600 713
601 my $proxy = $arg{proxy} || $PROXY; 714 my $proxy = $arg{proxy} || $PROXY;
602 my $timeout = $arg{timeout} || $TIMEOUT; 715 my $timeout = $arg{timeout} || $TIMEOUT;
603 716
604 my ($uscheme, $uauthority, $upath, $query, $fragment) = 717 my ($uscheme, $uauthority, $upath, $query, undef) = # ignore fragment
605 $url =~ m|(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:(\?[^#]*))?(?:#(.*))?|; 718 $url =~ m|(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:(\?[^#]*))?(?:#(.*))?|;
606 719
607 $uscheme = lc $uscheme; 720 $uscheme = lc $uscheme;
608 721
609 my $uport = $uscheme eq "http" ? 80 722 my $uport = $uscheme eq "http" ? 80
657 if length $arg{body} || $method ne "GET"; 770 if length $arg{body} || $method ne "GET";
658 771
659 my $idempotent = $method =~ /^(?:GET|HEAD|PUT|DELETE|OPTIONS|TRACE)$/; 772 my $idempotent = $method =~ /^(?:GET|HEAD|PUT|DELETE|OPTIONS|TRACE)$/;
660 773
661 # default value for keepalive is true iff the request is for an idempotent method 774 # default value for keepalive is true iff the request is for an idempotent method
662 my $keepalive = exists $arg{keepalive} 775 my $keepalive = exists $arg{keepalive} ? !!$arg{keepalive} : $idempotent;
663 ? $arg{keepalive}*1 776 my $keepalive10 = exists $arg{keepalive10} ? $arg{keepalive10} : !$proxy;
664 : $idempotent ? $PERSISTENT_TIMEOUT : 0; 777 my $keptalive; # true if this is actually a recycled connection
665 778
779 # the key to use in the keepalive cache
780 my $ka_key = "$uhost\x00$arg{sessionid}";
781
666 $hdr{connection} = ($keepalive ? "" : "close ") . "Te"; #1.1 782 $hdr{connection} = ($keepalive ? $keepalive10 ? "keep-alive " : "" : "close ") . "Te"; #1.1
667 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1 783 $hdr{te} = "trailers" unless exists $hdr{te}; #1.1
668 784
669 my %state = (connect_guard => 1); 785 my %state = (connect_guard => 1);
670 786
671 my $ae_error = 595; # connecting 787 my $ae_error = 595; # connecting
672 788
673 # handle actual, non-tunneled, request 789 # handle actual, non-tunneled, request
674 my $handle_actual_request = sub { 790 my $handle_actual_request = sub {
675 $ae_error = 596; # request phase 791 $ae_error = 596; # request phase
676 792
793 my $hdl = $state{handle};
794
677 $state{handle}->starttls ("connect") if $uscheme eq "https" && !exists $state{handle}{tls}; 795 $hdl->starttls ("connect") if $uscheme eq "https" && !exists $hdl->{tls};
678 796
679 # send request 797 # send request
680 $state{handle}->push_write ( 798 $hdl->push_write (
681 "$method $rpath HTTP/1.1\015\012" 799 "$method $rpath HTTP/1.1\015\012"
682 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr) 800 . (join "", map "\u$_: $hdr{$_}\015\012", grep defined $hdr{$_}, keys %hdr)
683 . "\015\012" 801 . "\015\012"
684 . (delete $arg{body}) 802 . (delete $arg{body})
685 ); 803 );
690 # reduce memory usage, save a kitten, also re-use it for the response headers. 808 # reduce memory usage, save a kitten, also re-use it for the response headers.
691 %hdr = (); 809 %hdr = ();
692 810
693 # status line and headers 811 # status line and headers
694 $state{read_response} = sub { 812 $state{read_response} = sub {
813 return unless %state;
814
695 for ("$_[1]") { 815 for ("$_[1]") {
696 y/\015//d; # weed out any \015, as they show up in the weirdest of places. 816 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
697 817
698 /^HTTP\/0*([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/gxci 818 /^HTTP\/0*([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\012]*) )? \012/gxci
699 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid server response" })); 819 or return _error %state, $cb, { @pseudo, Status => 599, Reason => "Invalid server response" };
700 820
701 # 100 Continue handling 821 # 100 Continue handling
702 # should not happen as we don't send expect: 100-continue, 822 # should not happen as we don't send expect: 100-continue,
703 # but we handle it just in case. 823 # but we handle it just in case.
704 # since we send the request body regardless, if we get an error 824 # since we send the request body regardless, if we get an error
710 HTTPVersion => $1, 830 HTTPVersion => $1,
711 Status => $2, 831 Status => $2,
712 Reason => $3, 832 Reason => $3,
713 ; 833 ;
714 834
715 my $hdr = parse_hdr 835 my $hdr = _parse_hdr
716 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Garbled response headers" })); 836 or return _error %state, $cb, { @pseudo, Status => 599, Reason => "Garbled response headers" };
717 837
718 %hdr = (%$hdr, @pseudo); 838 %hdr = (%$hdr, @pseudo);
719 } 839 }
720 840
721 # redirect handling 841 # redirect handling
752 $redirect = 1; 872 $redirect = 1;
753 } 873 }
754 } 874 }
755 875
756 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive]) 876 my $finish = sub { # ($data, $err_status, $err_reason[, $keepalive])
757 my $may_keep_alive = $_[3]; 877 if ($state{handle}) {
758 878 # handle keepalive
879 if (
880 $keepalive
881 && $_[3]
882 && ($hdr{HTTPVersion} < 1.1
883 ? $hdr{connection} =~ /\bkeep-?alive\b/i
884 : $hdr{connection} !~ /\bclose\b/i)
885 ) {
886 ka_store $ka_key, delete $state{handle};
887 } else {
888 # no keepalive, destroy the handle
759 $state{handle}->destroy if $state{handle}; 889 $state{handle}->destroy;
890 }
891 }
892
760 %state = (); 893 %state = ();
761 894
762 if (defined $_[1]) { 895 if (defined $_[1]) {
763 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1]; 896 $hdr{OrigStatus} = $hdr{Status}; $hdr{Status} = $_[1];
764 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2]; 897 $hdr{OrigReason} = $hdr{Reason}; $hdr{Reason} = $_[2];
776 http_request ( 909 http_request (
777 $method => $hdr{location}, 910 $method => $hdr{location},
778 %arg, 911 %arg,
779 recurse => $recurse - 1, 912 recurse => $recurse - 1,
780 Redirect => [$_[0], \%hdr], 913 Redirect => [$_[0], \%hdr],
781 $cb); 914 $cb
915 );
782 } else { 916 } else {
783 $cb->($_[0], \%hdr); 917 $cb->($_[0], \%hdr);
784 } 918 }
785 }; 919 };
786 920
813 947
814 $finish->(delete $state{handle}); 948 $finish->(delete $state{handle});
815 949
816 } elsif ($chunked) { 950 } elsif ($chunked) {
817 my $cl = 0; 951 my $cl = 0;
818 my $body = undef; 952 my $body = "";
819 my $on_body = $arg{on_body} || sub { $body .= shift; 1 }; 953 my $on_body = $arg{on_body} || sub { $body .= shift; 1 };
820 954
821 $state{read_chunk} = sub { 955 $state{read_chunk} = sub {
822 $_[1] =~ /^([0-9a-fA-F]+)/ 956 $_[1] =~ /^([0-9a-fA-F]+)/
823 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding"); 957 or $finish->(undef, $ae_error => "Garbled chunked transfer encoding");
843 $_[0]->push_read (line => $qr_nlnl, sub { 977 $_[0]->push_read (line => $qr_nlnl, sub {
844 if (length $_[1]) { 978 if (length $_[1]) {
845 for ("$_[1]") { 979 for ("$_[1]") {
846 y/\015//d; # weed out any \015, as they show up in the weirdest of places. 980 y/\015//d; # weed out any \015, as they show up in the weirdest of places.
847 981
848 my $hdr = parse_hdr 982 my $hdr = _parse_hdr
849 or return $finish->(undef, $ae_error => "Garbled response trailers"); 983 or return $finish->(undef, $ae_error => "Garbled response trailers");
850 984
851 %hdr = (%hdr, %$hdr); 985 %hdr = (%hdr, %$hdr);
852 } 986 }
853 } 987 }
896 $_[0]->on_read (sub { }); 1030 $_[0]->on_read (sub { });
897 } 1031 }
898 } 1032 }
899 }; 1033 };
900 1034
1035 # if keepalive is enabled, then the server closing the connection
1036 # before a response can happen legally - we retry on idempotent methods.
1037 if ($keptalive && $idempotent) {
1038 my $old_eof = $hdl->{on_eof};
1039 $hdl->{on_eof} = sub {
1040 _destroy_state %state;
1041
1042 http_request (
1043 $method => $url,
1044 %arg,
1045 keepalive => 0,
1046 $cb
1047 );
1048 };
1049 $hdl->on_read (sub {
1050 return unless %state;
1051
1052 # as soon as we receive something, a connection close
1053 # once more becomes a hard error
1054 $hdl->{on_eof} = $old_eof;
1055 $hdl->push_read (line => $qr_nlnl, $state{read_response});
1056 });
1057 } else {
901 $state{handle}->push_read (line => $qr_nlnl, $state{read_response}); 1058 $hdl->push_read (line => $qr_nlnl, $state{read_response});
1059 }
902 }; 1060 };
903 1061
1062 my $prepare_handle = sub {
1063 my ($hdl) = $state{handle};
1064
1065 $hdl->timeout ($timeout);
1066 $hdl->on_error (sub {
1067 _error %state, $cb, { @pseudo, Status => $ae_error, Reason => $_[2] };
1068 });
1069 $hdl->on_eof (sub {
1070 _error %state, $cb, { @pseudo, Status => $ae_error, Reason => "Unexpected end-of-file" };
1071 });
1072 };
1073
1074 # connected to proxy (or origin server)
904 my $connect_cb = sub { 1075 my $connect_cb = sub {
905 $state{fh} = shift 1076 my $fh = shift
906 or do {
907 my $err = "$!";
908 %state = ();
909 return $cb->(undef, { @pseudo, Status => $ae_error, Reason => $err }); 1077 or return _error %state, $cb, { @pseudo, Status => $ae_error, Reason => "$!" };
910 };
911 1078
912 return unless delete $state{connect_guard}; 1079 return unless delete $state{connect_guard};
913 1080
914 # get handle 1081 # get handle
915 $state{handle} = new AnyEvent::Handle 1082 $state{handle} = new AnyEvent::Handle
1083 %{ $arg{handle_params} },
916 fh => $state{fh}, 1084 fh => $fh,
917 peername => $rhost, 1085 peername => $uhost,
918 tls_ctx => $arg{tls_ctx}, 1086 tls_ctx => $arg{tls_ctx},
919 # these need to be reconfigured on keepalive handles
920 timeout => $timeout,
921 on_error => sub {
922 %state = ();
923 $cb->(undef, { @pseudo, Status => $ae_error, Reason => $_[2] });
924 },
925 on_eof => sub {
926 %state = ();
927 $cb->(undef, { @pseudo, Status => $ae_error, Reason => "Unexpected end-of-file" });
928 },
929 ; 1087 ;
930 1088
931 # limit the number of persistent connections 1089 $prepare_handle->();
932 # keepalive not yet supported
933# if ($KA_COUNT{$_[1]} < $MAX_PERSISTENT_PER_HOST) {
934# ++$KA_COUNT{$_[1]};
935# $state{handle}{ka_count_guard} = AnyEvent::Util::guard {
936# --$KA_COUNT{$_[1]}
937# };
938# $hdr{connection} = "keep-alive";
939# }
940 1090
941 $state{handle}->starttls ("connect") if $rscheme eq "https"; 1091 #$state{handle}->starttls ("connect") if $rscheme eq "https";
942 1092
943 # now handle proxy-CONNECT method 1093 # now handle proxy-CONNECT method
944 if ($proxy && $uscheme eq "https") { 1094 if ($proxy && $uscheme eq "https") {
945 # oh dear, we have to wrap it into a connect request 1095 # oh dear, we have to wrap it into a connect request
946 1096
947 # maybe re-use $uauthority with patched port? 1097 # maybe re-use $uauthority with patched port?
948 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012\015\012"); 1098 $state{handle}->push_write ("CONNECT $uhost:$uport HTTP/1.0\015\012\015\012");
949 $state{handle}->push_read (line => $qr_nlnl, sub { 1099 $state{handle}->push_read (line => $qr_nlnl, sub {
950 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix 1100 $_[1] =~ /^HTTP\/([0-9\.]+) \s+ ([0-9]{3}) (?: \s+ ([^\015\012]*) )?/ix
951 or return (%state = (), $cb->(undef, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" })); 1101 or return _error %state, $cb, { @pseudo, Status => 599, Reason => "Invalid proxy connect response ($_[1])" };
952 1102
953 if ($2 == 200) { 1103 if ($2 == 200) {
954 $rpath = $upath; 1104 $rpath = $upath;
955 $handle_actual_request->(); 1105 $handle_actual_request->();
956 } else { 1106 } else {
957 %state = ();
958 $cb->(undef, { @pseudo, Status => $2, Reason => $3 }); 1107 _error %state, $cb, { @pseudo, Status => $2, Reason => $3 };
959 } 1108 }
960 }); 1109 });
961 } else { 1110 } else {
962 $handle_actual_request->(); 1111 $handle_actual_request->();
963 } 1112 }
966 _get_slot $uhost, sub { 1115 _get_slot $uhost, sub {
967 $state{slot_guard} = shift; 1116 $state{slot_guard} = shift;
968 1117
969 return unless $state{connect_guard}; 1118 return unless $state{connect_guard};
970 1119
1120 # try to use an existing keepalive connection, but only if we, ourselves, plan
1121 # on a keepalive request (in theory, this should be a separate config option).
1122 if ($keepalive && $KA_CACHE{$ka_key}) {
1123 $keptalive = 1;
1124 $state{handle} = ka_fetch $ka_key;
1125 $prepare_handle->();
1126 $handle_actual_request->();
1127
1128 } else {
971 my $tcp_connect = $arg{tcp_connect} 1129 my $tcp_connect = $arg{tcp_connect}
972 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect }; 1130 || do { require AnyEvent::Socket; \&AnyEvent::Socket::tcp_connect };
973 1131
974 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout }); 1132 $state{connect_guard} = $tcp_connect->($rhost, $rport, $connect_cb, $arg{on_prepare} || sub { $timeout });
1133 }
975 }; 1134 };
976 1135
977 defined wantarray && AnyEvent::Util::guard { %state = () } 1136 defined wantarray && AnyEvent::Util::guard { _destroy_state %state }
978} 1137}
979 1138
980sub http_get($@) { 1139sub http_get($@) {
981 unshift @_, "GET"; 1140 unshift @_, "GET";
982 &http_request 1141 &http_request
1000AnyEvent::HTTP uses the AnyEvent::Socket::tcp_connect function for 1159AnyEvent::HTTP uses the AnyEvent::Socket::tcp_connect function for
1001the actual connection, which in turn uses AnyEvent::DNS to resolve 1160the actual connection, which in turn uses AnyEvent::DNS to resolve
1002hostnames. The latter is a simple stub resolver and does no caching 1161hostnames. The latter is a simple stub resolver and does no caching
1003on its own. If you want DNS caching, you currently have to provide 1162on its own. If you want DNS caching, you currently have to provide
1004your own default resolver (by storing a suitable resolver object in 1163your own default resolver (by storing a suitable resolver object in
1005C<$AnyEvent::DNS::RESOLVER>). 1164C<$AnyEvent::DNS::RESOLVER>) or your own C<tcp_connect> callback.
1006 1165
1007=head2 GLOBAL FUNCTIONS AND VARIABLES 1166=head2 GLOBAL FUNCTIONS AND VARIABLES
1008 1167
1009=over 4 1168=over 4
1010 1169
1011=item AnyEvent::HTTP::set_proxy "proxy-url" 1170=item AnyEvent::HTTP::set_proxy "proxy-url"
1012 1171
1013Sets the default proxy server to use. The proxy-url must begin with a 1172Sets the default proxy server to use. The proxy-url must begin with a
1014string of the form C<http://host:port> (optionally C<https:...>), croaks 1173string of the form C<http://host:port>, croaks otherwise.
1015otherwise.
1016 1174
1017To clear an already-set proxy, use C<undef>. 1175To clear an already-set proxy, use C<undef>.
1018 1176
1019=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end] 1177=item AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
1020 1178
1068 1226
1069=item $AnyEvent::HTTP::MAX_RECURSE 1227=item $AnyEvent::HTTP::MAX_RECURSE
1070 1228
1071The default value for the C<recurse> request parameter (default: C<10>). 1229The default value for the C<recurse> request parameter (default: C<10>).
1072 1230
1231=item $AnyEvent::HTTP::TIMEOUT
1232
1233The default timeout for conenction operations (default: C<300>).
1234
1073=item $AnyEvent::HTTP::USERAGENT 1235=item $AnyEvent::HTTP::USERAGENT
1074 1236
1075The default value for the C<User-Agent> header (the default is 1237The default value for the C<User-Agent> header (the default is
1076C<Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>). 1238C<Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION; +http://software.schmorp.de/pkg/AnyEvent)>).
1077 1239
1078=item $AnyEvent::HTTP::MAX_PER_HOST 1240=item $AnyEvent::HTTP::MAX_PER_HOST
1079 1241
1080The maximum number of concurrent connections to the same host (identified 1242The maximum number of concurrent connections to the same host (identified
1081by the hostname). If the limit is exceeded, then the additional requests 1243by the hostname). If the limit is exceeded, then the additional requests
1082are queued until previous connections are closed. 1244are queued until previous connections are closed. Both persistent and
1245non-persistent connections are counted in this limit.
1083 1246
1084The default value for this is C<4>, and it is highly advisable to not 1247The default value for this is C<4>, and it is highly advisable to not
1085increase it. 1248increase it much.
1249
1250For comparison: the RFC's recommend 4 non-persistent or 2 persistent
1251connections, older browsers used 2, newers (such as firefox 3) typically
1252use 6, and Opera uses 8 because like, they have the fastest browser and
1253give a shit for everybody else on the planet.
1254
1255=item $AnyEvent::HTTP::PERSISTENT_TIMEOUT
1256
1257The time after which idle persistent conenctions get closed by
1258AnyEvent::HTTP (default: C<3>).
1086 1259
1087=item $AnyEvent::HTTP::ACTIVE 1260=item $AnyEvent::HTTP::ACTIVE
1088 1261
1089The number of active connections. This is not the number of currently 1262The number of active connections. This is not the number of currently
1090running requests, but the number of currently open and non-idle TCP 1263running requests, but the number of currently open and non-idle TCP
1091connections. This number of can be useful for load-leveling. 1264connections. This number can be useful for load-leveling.
1092 1265
1093=back 1266=back
1094 1267
1095=cut 1268=cut
1096 1269
1139 undef 1312 undef
1140} 1313}
1141 1314
1142sub set_proxy($) { 1315sub set_proxy($) {
1143 if (length $_[0]) { 1316 if (length $_[0]) {
1144 $_[0] =~ m%^(https?):// ([^:/]+) (?: : (\d*) )?%ix 1317 $_[0] =~ m%^(http):// ([^:/]+) (?: : (\d*) )?%ix
1145 or Carp::croak "$_[0]: invalid proxy URL"; 1318 or Carp::croak "$_[0]: invalid proxy URL";
1146 $PROXY = [$2, $3 || 3128, $1] 1319 $PROXY = [$2, $3 || 3128, $1]
1147 } else { 1320 } else {
1148 undef $PROXY; 1321 undef $PROXY;
1149 } 1322 }

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines