/cvs/AnyEvent/lib/AnyEvent/Intro.pod
Comparing AnyEvent/lib/AnyEvent/Intro.pod (file contents):
Revision 1.11 by root, Mon Jun 2 09:10:38 2008 UTC vs.
Revision 1.14 by root, Mon Jun 2 10:05:00 2008 UTC

754The C<GET ...> and the empty line were entered manually, the rest of the 754The C<GET ...> and the empty line were entered manually, the rest of the
755telnet output is Google's response, in this case a C<404 not found> one. 755telnet output is Google's response, in this case a C<404 not found> one.
756 756
757So, here is how you would do it with C<AnyEvent::Handle>: 757So, here is how you would do it with C<AnyEvent::Handle>:
758 758
759###TODO 759 sub http_get {
760 my ($host, $uri, $cb) = @_;
761
762 tcp_connect $host, "http", sub {
763 my ($fh) = @_
764 or return $cb->("HTTP/1.0 500 $!");
765
766 # store results here
767 my ($response, $header, $body);
768
769 my $handle; $handle = new AnyEvent::Handle
770 fh => $fh,
771 on_error => sub {
772 undef $handle;
773 $cb->("HTTP/1.0 500 $!");
774 },
775 on_eof => sub {
776 undef $handle; # keep it alive till eof
777 $cb->($response, $header, $body);
778 };
779
780 $handle->push_write ("GET $uri HTTP/1.0\015\012\015\012");
781
782 # now fetch response status line
783 $handle->push_read (line => sub {
784 my ($handle, $line) = @_;
785 $response = $line;
786 });
787
788 # then the headers
789 $handle->push_read (line => "\015\012\015\012", sub {
790 my ($handle, $line) = @_;
791 $header = $line;
792 });
793
794 # and finally handle any remaining data as body
795 $handle->on_read (sub {
796 $body .= $_[0]->rbuf;
797 $_[0]->rbuf = "";
798 });
799 };
800 }
760 801
761And now let's go through it step by step. First, as usual, the overall 802And now let's go through it step by step. First, as usual, the overall
762C<http_get> function structure: 803C<http_get> function structure:
763 804
764 sub http_get { 805 sub http_get {
860 901
861Reading the response is far more interesting: 902Reading the response is far more interesting:
862 903
863=head3 The read queue 904=head3 The read queue
864 905
865the response consists of three parts: a single line of response status, a 906The response consists of three parts: a single line of response status, a
866single paragraph of headers ended by an empty line, and the response body, 907single paragraph of headers ended by an empty line, and the response body,
867which is simply the remaining data on that connection. 908which is simply the remaining data on that connection.
868 909
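To make the three-part layout concrete, here is a minimal sketch in plain Perl that splits a complete, already-received response along the same delimiters. The sample response text is made up for illustration, and unlike the handle-based code in this section it does not work incrementally on a socket:

```perl
use strict;
use warnings;

# A made-up response as it might arrive on the wire (illustration only).
my $raw = join "",
   "HTTP/1.0 404 Not Found\015\012",
   "Content-Type: text/html\015\012",
   "Content-Length: 5\015\012",
   "\015\012",
   "hello";

# part 1: the status line ends at the first CR LF
my ($response, $rest) = split /\015\012/, $raw, 2;

# part 2: the header paragraph ends at the first empty line
# part 3: whatever remains is the body
my ($header, $body) = split /\015\012\015\012/, $rest, 2;

print "status: $response\n";
print "body:   $body\n";
```

The two delimiters used here are the same ones the read requests rely on: a single CR LF for the status line and CR LF CR LF for the end of the header paragraph.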
869For the first two, we push two read requests onto the read queue: 910For the first two, we push two read requests onto the read queue:
870 911
904 $_[0]->rbuf = ""; 945 $_[0]->rbuf = "";
905 }); 946 });
906 947
907This callback is invoked every time data arrives and the read queue is 948This callback is invoked every time data arrives and the read queue is
908empty - which in this example will only be the case when both response and 949empty - which in this example will only be the case when both response and
909header have been read. 950header have been read. The C<on_read> callback could actually have been
951specified when constructing the object, but doing it this way preserves
952logical ordering.
910 953
954The read callback simply adds the current read buffer to its C<$body>
955variable and, most importantly, I<empties> the buffer by assigning the empty
956string to it.
911 957
912############################################################################# 958After AnyEvent::Handle has been so instructed, it will now handle incoming
959data according to these instructions - if all goes well, the callback will
960be invoked with the response data, if not, it will get an error.
913 961
914Now let's start with something simple: a program that reads from standard 962In general, you get pipelining very easily with AnyEvent::Handle: If
915input in a non-blocking way, that is, in a way that lets your program do 963you have a protocol with a request/response structure, your request
916other things while it is waiting for input. 964methods/functions will all look like this (simplified):
917 965
918First, the full program listing: 966 sub request {
919 967
920 #!/usr/bin/perl 968 # send the request to the server
969 $handle->push_write (...);
921 970
922 use AnyEvent; 971 # push some response handlers
923 use AnyEvent::Handle; 972 $handle->push_read (...);
973 }
924 974
925 my $end_prog = AnyEvent->condvar; 975=head3 Using it
926 976
927 my $handle = 977And here is how you would use it:
928 AnyEvent::Handle->new ( 978
929 fh => \*STDIN, 979 http_get "www.google.com", "/", sub {
930 on_eof => sub { 980 my ($response, $header, $body) = @_;
931 print "received EOF, exiting...\n"; 981
932 $end_prog->broadcast; 982 print
933 }, 983 $response, "\n",
934 on_error => sub { 984 $body;
935 print "error while reading from STDIN: $!\n"; 985 };
936 $end_prog->broadcast; 986
987And of course, you can run as many of these requests in parallel as you
988want (and your memory supports).
989
990=head3 HTTPS
991
992Now, as promised, let's implement the same thing for HTTPS, or more
993correctly, let's change our C<http_get> function into a function that
994speaks HTTPS instead.
995
996HTTPS is, quite simply, a standard TLS connection (B<T>ransport B<L>ayer
997B<S>ecurity is the official name for what most people refer to as C<SSL>)
998that contains standard HTTP protocol exchanges. The other difference to
999HTTP is that it uses port C<443> instead of port C<80>.
1000
1001To implement these two differences we need two tiny changes. First, in the C<tcp_connect> call
1002we replace C<http> by C<https>:
1003
1004 tcp_connect $host, "https", sub { ...
1005
1006The other change deals with TLS, which is something L<AnyEvent::Handle>
1007does for us, as long as I<you> made sure that the L<Net::SSLeay> module is
1008around. To enable TLS with L<AnyEvent::Handle>, we simply pass an additional
1009C<tls> parameter to the call to C<AnyEvent::Handle::new>:
1010
1011 tls => "connect",
1012
1013Specifying C<tls> enables TLS, and the argument specifies whether
1014AnyEvent::Handle is the server side ("accept") or the client side
1015("connect") for the TLS connection, as unlike TCP, there is a clear
1016server/client relationship in TLS.
1017
1018That's all.
1019
1020Of course, all this should be handled transparently by C<http_get> after
1021parsing the URL. See the part about exercising your inspiration earlier in
1022this document.
1023
1024=head3 The read queue - revisited
1025
1026HTTP always uses the same structure in its responses, but many protocols
1027require parsing responses differently depending on the response itself.
1028
1029For example, in SMTP, you normally get a single response line:
1030
1031 220 mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1032
1033But SMTP also supports multi-line responses:
1034
1035 220-mail.example.net Neverusesendmail 8.8.8 <mailme@example.net>
1036 220-hey guys
1037 220 my response is longer than yours
1038
1039To handle this, we need C<unshift_read>. As the name (hopefully) implies,
1040C<unshift_read> will not append your read request to the end of the read
1041queue, but instead it will prepend it to the queue.
1042
1043This is useful in this situation: You push your response-line read
1044request when sending the SMTP command, and when handling it, you look at
1045the line to see if more is to come, and C<unshift_read> another reader,
1046like this:
1047
1048 my $response; # response lines end up in here
1049
1050 my $read_response; $read_response = sub {
1051 my ($handle, $line) = @_;
1052
1053 $response .= "$line\n";
1054
1055 # check for continuation lines ("-" as 4th character)
1056 if ($line =~ /^...-/) {
1057 # if yes, then unshift another line read
1058 $handle->unshift_read (line => $read_response);
1059
1060 } else {
1061 # otherwise we are done
1062
1063 # free callback
1064 undef $read_response;
937 } 1065
1066 print "we are done reading: $response\n";
938 ); 1067 }
1068 };
1069
1070 $handle->push_read (line => $read_response);
1071
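The continuation test can also be tried in isolation, without any handle. The following is only a sketch in plain Perl: it assumes the lines have already been split and stripped of their line endings, as the C<line> read type would deliver them, and it reuses the sample reply from above. The C<@lines> and C<$done> names are made up for illustration:

```perl
use strict;
use warnings;

# Lines of a multi-line reply, line endings already removed.
my @lines = (
   "220-mail.example.net Neverusesendmail 8.8.8 <mailme\@example.net>",
   "220-hey guys",
   "220 my response is longer than yours",
);

my $response = "";
my $done     = 0;

for my $line (@lines) {
   $response .= "$line\n";

   # a "-" as 4th character means more lines follow
   next if $line =~ /^...-/;

   # anything else ends the reply
   $done = 1;
   last;
}

print $response if $done;
```

In the event-driven version the loop disappears: each pass through the loop body corresponds to one invocation of C<$read_response>, and C<next> corresponds to the C<unshift_read> call.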
1072This recipe can be used for all similar parsing problems, for example in
1073NNTP, the response code to some commands indicates that more data will be
1074sent:
1075
1076 $handle->push_write ("article 42\015\012");
1077
1078 # read response line
1079 $handle->push_read (line => sub {
1080 my ($handle, $status) = @_;
1081
1082 # article data following?
1083 if ($status =~ /^2/) {
1084 # yes, read article body
1085
1086 $handle->unshift_read (line => "\012.\015\012", sub {
1087 my ($handle, $body) = @_;
1088
1089 $finish->($status, $body);
1090 });
1091
1092 } else {
1093 # some error occurred, no article data
1094
1095 $finish->($status);
1096 }
1097 });
1098
1099=head3 Your own read queue handler
1100
1101Sometimes, your protocol doesn't play nice and uses lines or chunks of data
1102that no predefined read type can parse, in which case you have to implement your own read parser.
1103
1104To make up a contorted example, imagine you are looking for an even
1105number of characters followed by a colon (":"). Also imagine that
1106AnyEvent::Handle had no C<regex> read type which could be used, so you'd
1107have to do it manually.
1108
1109To implement this, you would C<push_read> (or C<unshift_read>) just a
1110single code reference.
1111
1112This code reference will then be called each time there is (new) data
1113available in the read buffer, and is expected to either eat/consume some
1114of that data (and return true) or to return false to indicate that it
1115wants to be called again.
1116
1117If the code reference returns true, then it will be removed from the read
1118queue, otherwise it stays in front of it.
1119
1120The example above could be coded like this:
939 1121
940 $handle->push_read (sub { 1122 $handle->push_read (sub {
941 my ($handle) = @_; 1123 my ($handle) = @_;
942 1124
943 if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) { 1125 # check for even number of characters + ":"
944 print "got 'end', exiting...\n"; 1126 # and remove the data if a match is found.
945 $end_prog->broadcast; 1127 # if not, return false (actually nothing)
1128
1129 $handle->{rbuf} =~ s/^( (?:..)* ) ://x
946 return 1 1130 or return;
1131
1132 # we got some data in $1, pass it to whoever wants it
1133 $finish->($1);
1134
1135 # and return true to indicate we are done
947 } 1136 1
948
949 0
950 }); 1137 });
951 1138
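To see the "return true to be removed, return false to stay" rule at work without a network connection, here is a toy model in plain Perl. It is B<not> AnyEvent::Handle and makes no claim about how that module is implemented: the C<%handle> hash, the C<feed> helper and the C<@found> array are hypothetical names used only for this sketch. The reader itself is the same code reference as in the example above:

```perl
use strict;
use warnings;

# Toy stand-in for a handle: a read buffer plus a queue of callbacks.
# This only models the rule described in the text.
my %handle = (rbuf => "", queue => []);
my @found;

# the reader from the example, operating on the toy handle
push @{ $handle{queue} }, sub {
   my ($h) = @_;

   $h->{rbuf} =~ s/^( (?:..)* ) ://x
      or return;          # no match yet: stay at the front of the queue

   push @found, $1;       # consumed data ends up here
   1                      # done: remove this reader from the queue
};

# hypothetical helper: append new data, then run the first reader
# until it either declines or the queue is empty
sub feed {
   my ($h, $data) = @_;

   $h->{rbuf} .= $data;

   while (my $reader = $h->{queue}[0]) {
      $reader->($h) or last;
      shift @{ $h->{queue} };
   }
}

feed \%handle, $_ for "ab", "cd", ":xy";

print "found: @found, left in buffer: $handle{rbuf}\n";
```

The first two chunks contain no colon, so the reader returns false and stays queued. Once C<:xy> arrives, the substitution removes C<abcd:> from the buffer, the reader returns true and is dropped, and C<xy> remains for whatever reader comes next.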
952 $end_prog->recv;
953
954That's a mouthful, so let's go through it step by step:
955
956 #!/usr/bin/perl
957
958 use AnyEvent;
959 use AnyEvent::Handle;
960
961Nothing unexpected here, just load AnyEvent for the event functionality
962and AnyEvent::Handle for your file handling needs.
963
964 my $end_prog = AnyEvent->condvar;
965
966Here the program creates a so-called 'condition variable': Condition
967variables are a great way to signal the completion of some event, or to
968state that some condition became true (thus the name).
969
970This condition variable represents the condition that the program wants to
971terminate. Later in the program, we will 'recv' that condition (call the
972C<recv> method on it), which will wait until the condition gets signalled
973(which is done by calling the C<send> method on it).
974
975The next step is to create the handle object:
976
977 my $handle =
978 AnyEvent::Handle->new (
979 fh => \*STDIN,
980 on_eof => sub {
981 print "received EOF, exiting...\n";
982 $end_prog->broadcast;
983 },
984
985This handle object will read from standard input. Setting the C<on_eof>
986callback should be done for every file handle, as that is a condition that
987we always need to check for when working with file handles, to prevent
988reading or writing to a closed file handle, or getting stuck indefinitely
989in case of an error.
990
991Speaking of errors:
992
993 on_error => sub {
994 print "error while reading from STDIN: $!\n";
995 $end_prog->broadcast;
996 }
997 );
998
999The C<on_error> callback is also not required, but we set it here in case
1000any error happens when we read from the file handle. It is usually a good
1001idea to set this callback and at least print some diagnostic message: Even
1002in our small example an error can happen. More on this later...
1003
1004 $handle->push_read (sub {
1005
1006Next we push a general read callback on the read queue, which
1007will wait until we have received all the data we wanted to
1008receive. L<AnyEvent::Handle> has two queues per file handle, a read and a
1009write queue. The write queue queues pending data that waits to be written
1010to the file handle. And the read queue queues reading callbacks. For more
1011details see the documentation L<AnyEvent::Handle> about the READ QUEUE and
1012WRITE QUEUE.
1013
1014 my ($handle) = @_;
1015
1016 if ($handle->rbuf =~ s/^.*?\bend\b.*$//s) {
1017 print "got 'end', exiting...\n";
1018 $end_prog->broadcast;
1019 return 1
1020 }
1021
1022 0
1023 });
1024
1025The actual callback waits until the word 'end' has been seen in the data
1026received on standard input. Once we encounter the stop word 'end' we
1027remove everything from the read buffer and call the condition variable
1028we set up earlier, that signals our 'end of program' condition. And the
1029callback returns with a true value, that signals we are done with reading
1030all the data we were interested in (all data until the word 'end' has been
1031seen).
1032
1033In all other cases, when the stop word has not been seen yet, we just
1034return a false value, to indicate that we are not finished yet.
1035
1036The C<rbuf> method returns our read buffer, that we can directly modify as
1037lvalue. Alternatively we also could have written:
1038
1039 if ($handle->{rbuf} =~ s/^.*?\bend\b.*$//s) {
1040
1041The last line will wait for the condition that our program wants to exit:
1042
1043 $end_prog->recv;
1044
1045The call to C<recv> will set up an event loop for us and wait for IO, timer
1046or signal events and will handle them until the condition gets sent (by
1047calling its C<send> method).
1048
1049The key points to learn from this example are:
1050
1051=over 4
1052
1053=item * Condition variables are used to start an event loop.
1054
1055=item * How to register some basic callbacks on AnyEvent::Handle objects.
1056
1057=item * How to process data in the read buffer.
1058
1059=back
1060 1139
1061=head1 AUTHORS 1140=head1 AUTHORS
1062 1141
1063Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>. 1142Robin Redeker C<< <elmex at ta-sa.org> >>, Marc Lehmann <schmorp@schmorp.de>.
1064 1143
