/cvs/AnyEvent-MP/MP/Intro.pod
Revision: 1.24
Committed: Sat Aug 29 15:13:36 2009 UTC (14 years, 9 months ago) by root
Branch: MAIN
Changes since 1.23: +36 -29 lines

=head1 Message Passing for the Non-Blocked Mind

=head1 Introduction and Terminology

This is a tutorial about how to get the swing of the new L<AnyEvent::MP>
module, which allows programs to transparently pass messages within the
process and to other processes on the same or a different host.

What kind of messages? Basically a message here means a list of Perl
strings, numbers, hashes and arrays, anything that can be expressed as a
L<JSON> text (as JSON is used by default in the protocol). Here are two
examples:

   write_log => 1251555874, "action was successful.\n"
   123, ["a", "b", "c"], { foo => "bar" }

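To make the JSON remark concrete, the following sketch encodes the first
example message as a JSON array and decodes it again. It uses the core
L<JSON::PP> module purely for illustration; which encoder L<AnyEvent::MP>
uses internally, and what exactly goes over the wire, is not shown here.

   use strict;
   use warnings;
   use JSON::PP;

   # a message is just a Perl list; the first element acts as the tag
   my @msg = (write_log => 1251555874, "action was successful.\n");

   # encode the list as a JSON array and decode it again
   my $json = JSON::PP->new->encode (\@msg);
   my @copy = @{ JSON::PP->new->decode ($json) };

   print "$json\n";
   print "tag after round trip: $copy[0]\n";

Values that cannot be encoded this way, such as code references or file
handles, are not suitable message elements under this assumption.
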
When using L<AnyEvent::MP> it is customary to use a descriptive string as
the first element of a message, which indicates the type of the message.
This element is called a I<tag> in L<AnyEvent::MP>, as some API functions
(C<rcv>) support matching it directly.

Suppose you want to send a ping message with your current time to
somewhere; this is how such a message might look (in Perl syntax):

   ping => 1251381636

Now that we know what a message is, to which entities are those
messages being I<passed>? They are I<passed> to I<ports>. A I<port> is
a destination for messages but also a context to execute code: when
a runtime error occurs while executing code belonging to a port, the
exception will be raised on the port and can even travel to interested
parties on other nodes, which makes supervision of distributed processes
easy.

How do these ports relate to things you know? Each I<port> belongs
to a I<node>, and a I<node> is just the UNIX process that runs your
L<AnyEvent::MP> application.

Each I<node> is distinguished from other I<nodes> running on the same or
another host in a network by its I<node ID>. A I<node ID> is simply a
unique string chosen manually or assigned by L<AnyEvent::MP> in some way
(UNIX nodename, random string...).

Here is a diagram of how I<nodes>, I<ports> and UNIX processes relate
to each other. The setup consists of two nodes (more are of course
possible): node C<A> (in UNIX process 7066) with the ports C<ABC> and
C<DEF>, and node C<B> (in UNIX process 8321) with the ports C<FOO> and
C<BAR>.


   |- PID: 7066 -|                  |- PID: 8321 -|
   |             |                  |             |
   | Node ID: A  |                  | Node ID: B  |
   |             |                  |             |
   |   Port ABC =|= <----\ /-----> =|= Port FOO   |
   |             |        X         |             |
   |   Port DEF =|= <----/ \-----> =|= Port BAR   |
   |             |                  |             |
   |-------------|                  |-------------|

The strings for the I<port IDs> here are just for illustrative
purposes: even though I<ports> in L<AnyEvent::MP> are also identified by
strings, they can't be chosen manually and are assigned by the system
dynamically. These I<port IDs> are unique within a network and can also be
used to identify senders or as message tags, for instance.

The next sections will explain the API of L<AnyEvent::MP> by going through
a few simple examples. Later some more complex idioms are introduced,
which are hopefully useful to solve some real world problems.

=head1 Passing Your First Message

As a start, let's have a look at the messaging API. The following example
is just a demo to show the basic elements of message passing with
L<AnyEvent::MP>.

The example should print: C<Ending with: 123>, in a rather complicated
way, by passing some message to a port.

   use AnyEvent;
   use AnyEvent::MP;

   my $end_cv = AnyEvent->condvar;

   my $port = port;

   rcv $port, test => sub {
      my ($data) = @_;
      $end_cv->send ($data);
   };

   snd $port, test => 123;

   print "Ending with: " . $end_cv->recv . "\n";

It already uses most of the essential functions inside
L<AnyEvent::MP>: First there is the C<port> function which will create a
I<port> and will return its I<port ID>, a simple string.

This I<port ID> can be used to send messages to the port and install
handlers to receive messages on the port. Since it is a simple string
it can be safely passed to other I<nodes> in the network when you want
to refer to that specific port (usually used for RPC, where you need
to tell the other end which I<port> to send the reply to - messages in
L<AnyEvent::MP> have a destination, but no source).

The next function is C<rcv>:

   rcv $port, test => sub { ... };

It installs a receiver callback on the I<port> that is specified as the
first argument (it only works for "local" ports, i.e. ports created on the
same node). The next argument, in this example C<test>, specifies a I<tag>
to match. This means that whenever a message with the first element being
the string C<test> is received, the callback is called with the remaining
parts of that message.

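As a small sketch of tag matching, the following program installs two
callbacks with different tags on the same port. It assumes that a second
C<rcv> call for another tag leaves the first callback in place; the tags
C<add> and C<quit> are made up for this example.

   use AnyEvent;
   use AnyEvent::MP;

   my $done = AnyEvent->condvar;
   my $port = port;

   # called for messages of the form (add => $x, $y)
   rcv $port, add => sub {
      my ($x, $y) = @_;
      print "sum: ", $x + $y, "\n";
   };

   # called for messages that consist only of the tag "quit"
   rcv $port, quit => sub {
      $done->send;
   };

   snd $port, add => 1, 2;
   snd $port, "quit";

   $done->recv;

Each callback only sees the message elements after its tag, so the C<add>
callback receives the two numbers and the C<quit> callback receives
nothing.
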
Messages can be sent with the C<snd> function, which is used like this in
the example above:

   snd $port, test => 123;

This will send the message C<'test', 123> to the I<port> with the I<port
ID> stored in C<$port>. Since in this case the receiver has a I<tag> match
on C<test> it will call the callback with the first argument being the
number C<123>.

The callback is a typical AnyEvent idiom: the callback just passes
that number on to the I<condition variable> C<$end_cv> which will then
pass the value to the print. Condition variables are out of the scope
of this tutorial and not often used with ports, so please consult
L<AnyEvent::Intro> about them.

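Because messages carry no source, a request that expects an answer has to
include the I<port ID> to reply to. The following single-process sketch
shows that idiom with two local ports; the tags C<ping> and C<pong> and
the variable names are chosen for illustration only.

   use AnyEvent;
   use AnyEvent::MP;

   my $end_cv = AnyEvent->condvar;

   # the "server" port answers every ping on the port named in the message
   my $server = port;
   rcv $server, ping => sub {
      my ($timestamp, $reply_port) = @_;
      snd $reply_port, pong => $timestamp;
   };

   # the "client" port receives the answer
   my $client = port;
   rcv $client, pong => sub {
      my ($timestamp) = @_;
      $end_cv->send ($timestamp);
   };

   # the port ID is a plain string, so it can travel inside the message
   snd $server, ping => 1251381636, $client;

   print "Got pong for: " . $end_cv->recv . "\n";

The server never needs to know who sent the ping; it simply replies to
whatever port ID the message contains.
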
Passing messages inside just one process is boring. Before we can move on
and do interprocess message passing we first have to make sure some things
have been set up correctly for our nodes to talk to each other.

=head1 System Requirements and System Setup

Before we can start with real IPC we have to make sure some things work on
your system.

First we have to set up a I<shared secret>: for two L<AnyEvent::MP>
I<nodes> to be able to communicate with each other and authenticate each
other it is necessary to set up the same I<shared secret> for both of them
(or use TLS certificates).

The easiest way to set this up is to use the F<aemp> utility:

   aemp gensecret

This creates a F<$HOME/.perl-anyevent-mp> config file and generates a random
shared secret. You can copy this file to any other system and then communicate
over the network (via TCP) with it. You can also select your own shared secret
(F<aemp setsecret>) and for increased security requirements you can even create
a TLS certificate (F<aemp gencert>), causing connections to not just be
authenticated, but also to be encrypted.

Connections will only be successful when the I<nodes> that want to connect to
each other have the same I<shared secret> (or successfully verify the TLS
certificate of the other side).

B<If something does not work as expected, and for example tcpdump shows
that the connections are closed almost immediately, you should make sure
that F<~/.perl-anyevent-mp> is the same on all hosts/user accounts that
you try to connect with each other!>

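One way to check that the configuration really is identical is to compare
a checksum of the file on every host. The sketch below prints a SHA-256
digest of F<~/.perl-anyevent-mp> using the core L<Digest::SHA> module. It
is not part of L<AnyEvent::MP> or F<aemp>, merely a hypothetical helper
written for this tutorial.

   use strict;
   use warnings;
   use Digest::SHA;

   # path of the config file, as created by "aemp gensecret"
   my $file = "$ENV{HOME}/.perl-anyevent-mp";

   -r $file
      or die "$file: not readable, did you run aemp gensecret?\n";

   # hash the raw file contents and print the hex digest
   my $digest = Digest::SHA->new (256)->addfile ($file, "b")->hexdigest;

   print "$digest  $file\n";

Run it under each user account involved: if the printed digests differ,
the files differ, and the nodes are unlikely to authenticate each other.
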
That's all for now; there is more fiddling around with the C<aemp> utility
later.

=head1 Passing Messages Between Processes

=head2 The Receiver

Let's split the previous example up into two small programs. First the
receiver application:

   #!/opt/perl/bin/perl
   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_receiver";

   my $port = port;

   AnyEvent::MP::Global::register $port, "eg_receivers";

   rcv $port, test => sub {
      my ($data, $reply_port) = @_;

      print "Received data: " . $data . "\n";
   };

   AnyEvent->condvar->recv;

=head3 AnyEvent::MP::Global

Now, that wasn't too bad, was it? Ok, let's step through the new functions
and modules that have been used. For starters there is now an additional
module loaded: L<AnyEvent::MP::Global>.

That module provides us with a I<global registry>, which lets us share data
among all I<nodes> in a network. Why do we need it, you might ask?

The thing is that the I<port ids> are just random strings, assigned by
L<AnyEvent::MP>. We can't know those I<port ids> in advance, so we don't know
which I<port id> to send messages to if the message is to be passed between
I<nodes> (or UNIX processes). To find the right I<port> of another I<node> in
the network we will need to communicate that somehow to the sender. And
exactly that is what L<AnyEvent::MP::Global> provides.

=head3 initialise_node And The Network

Now, let's have a look at the next new thing, the C<initialise_node>
function:

   initialise_node "eg_simple_receiver";

Before we are able to send messages to other nodes we have to initialise
ourselves. The first argument, the string C<"eg_simple_receiver">, is called
the I<profile> of this node. A profile holds some information about the
application that is going to be a node in an L<AnyEvent::MP> network.

Most importantly the profile allows you to set the I<node id> that your
application will use. You can also set I<binds> in the profile, meaning that
you can define TCP ports that the application will listen on for incoming
connections from other nodes of the network.

Next you can configure I<seeds> in the profile. A I<seed> is just a TCP
endpoint which tells the application where to find other nodes of its
network. To explain this in a bit more detail we have to look at the
topology of an L<AnyEvent::MP> network. The topology is called a I<fully
connected mesh>; here is an example with 4 nodes:

   N1--N2
   | \/ |
   | /\ |
   N3--N4

Now imagine another I<node>, C<N5>, wants to connect itself to that network:

   N1--N2
   | \/ |    N5
   | /\ |
   N3--N4

The new node needs to know the I<binds> of all of those 4 already connected
nodes. And exactly this is what the I<seeds> are for. Now let's assume that
the new node C<N5> has as I<seed> the TCP endpoint of the node C<N2>.
It then connects to C<N2>:

   N1--N2____
   | \/ |    N5
   | /\ |
   N3--N4

C<N2> then tells C<N5> the I<binds> of the other nodes it is connected to,
and C<N5> builds up the rest of the connections:

    /--------\
   N1--N2____|
   | \/ |    N5
   | /\ |   /|
   N3--N4--- |
    \________/

Finished. C<N5> is now happily connected to the rest of the network.

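The joining procedure can be modelled with a few lines of plain Perl. This
is a toy simulation of the steps just described, not the actual
L<AnyEvent::MP> protocol: the C<%links> hash and the C<connect_nodes>
helper are inventions of this sketch.

   use strict;
   use warnings;

   # toy model: each node maps to the set of nodes it is connected to
   my %links;

   sub connect_nodes {
      my ($x, $y) = @_;
      $links{$x}{$y} = 1;
      $links{$y}{$x} = 1;
   }

   # the existing fully connected mesh of four nodes
   my @mesh = qw(N1 N2 N3 N4);
   for my $x (@mesh) {
      for my $y (@mesh) {
         connect_nodes ($x, $y) if $x lt $y;
      }
   }

   # N5 only knows its seed N2 and connects to it first
   connect_nodes ("N5", "N2");

   # N2 tells N5 about its other peers, and N5 connects to each of them
   for my $peer (grep $_ ne "N5", sort keys %{ $links{N2} }) {
      connect_nodes ("N5", $peer);
   }

   print "N5 is connected to: ", join (", ", sort keys %{ $links{N5} }), "\n";

A single reachable seed is enough in this model, because the seed already
knows every other member of the mesh.
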
=head3 Setting Up The Profiles

Ok, so much for the profile. Now let's set up the C<eg_simple_receiver>
I<profile> for later. For the receiver we just give the receiver a I<bind>:

   aemp profile eg_simple_receiver setbinds localhost:12266

And while we are at it, let's also set up the I<profile> for the sender in
the second part of this example. We will call the sender I<profile>
C<eg_simple_sender>. For the sender we will just set up a I<seed> pointing to
the receiver:

   aemp profile eg_simple_sender setseeds localhost:12266
   aemp profile eg_simple_sender setbinds

You might wonder why we set the I<binds> to be empty here. Well, there can be
exceptions to the I<fully> in the I<fully connected mesh> in L<AnyEvent::MP>.
If you don't configure a I<bind> for a node's profile it won't bind itself
anywhere. These kinds of I<nodes> will not be able to send messages to other
I<nodes> that also didn't I<bind> themselves to some TCP address. For this
example, as well as some cases in the real world, we can live with this
limitation.

=head3 Registering The Receiver

Ok, where were we? We have now discussed the basic purpose of
L<AnyEvent::MP::Global> and of C<initialise_node> with its relation to
profiles. We also set up our profiles for later use and now have to continue
talking about the receiver example.

Let's look at the next undiscussed line(s) of code:

   my $port = port;
   AnyEvent::MP::Global::register $port, "eg_receivers";

The C<port> function has already been discussed. It just creates a new
I<port> and gives us the I<port id>. Now to the C<register> function of
L<AnyEvent::MP::Global>: The first argument is a I<port id> that we want to
add to a I<global group>, and its second argument is the name of that
I<global group>.

You can choose the name of such a I<global group> freely, and its purpose is
to store a set of I<port ids>. That set is made available throughout the whole
L<AnyEvent::MP> network, so that each node can see which ports belong to that
group.

The sender will later look for the ports in that I<global group> and send
messages to them.

The last step in the example is to set up a receiver callback for those
messages, as discussed in the first example. We again match for the I<tag>
C<test>. The difference is just that we don't end the application after
receiving the first message. We just continue to look out for new messages
indefinitely.

=head2 The Sender

Ok, now let's take a look at the sender:

   #!/opt/perl/bin/perl
   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_sender";

   my $find_timer =
      AnyEvent->timer (after => 0, interval => 1, cb => sub {
         my $ports = AnyEvent::MP::Global::find "eg_receivers"
            or return;

         snd $_, test => time
            for @$ports;
      });

   AnyEvent->condvar->recv;

It's even less code. The C<initialise_node> call is already known from the
receiver above. As discussed in the section where we set up the profiles, we
configure this application to use the I<profile> C<eg_simple_sender>.

Next we set up a timer that repeatedly calls this chunk of code:

   my $ports = AnyEvent::MP::Global::find "eg_receivers"
      or return;

   snd $_, test => time
      for @$ports;

The new function here is the C<find> function of L<AnyEvent::MP::Global>. It
searches in the I<global group> named C<eg_receivers> for ports. If none are
found, C<undef> is returned and we wait for the next time the timer fires.

Once the receiver application has connected and the port it registered has
propagated to the sender, C<find> returns an array reference that contains
the I<port id> of the receiver I<port(s)>.

We then just send to every I<port> in the I<global group> a message consisting
of the I<tag> C<test> and the current time in the form of a UNIX timestamp.

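The receiver shown earlier already unpacks a second message element into
C<$reply_port>, which this sender never supplies. As a hedged sketch of how
the two programs could be extended, the sender below creates a port of its
own and passes its I<port id> along. The C<ack> tag is an invention of this
example, and the receiver would need an extra line such as
C<< snd $reply_port, ack => $data if defined $reply_port; >> in its callback.

   #!/opt/perl/bin/perl
   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_sender";

   # local port on which the receivers can answer us
   my $reply_port = port;

   rcv $reply_port, ack => sub {
      my ($data) = @_;

      print "Receiver acknowledged: " . $data . "\n";
   };

   my $find_timer =
      AnyEvent->timer (after => 0, interval => 1, cb => sub {
         my $ports = AnyEvent::MP::Global::find "eg_receivers"
            or return;

         # append our own port ID so the receiver knows where to reply
         snd $_, test => time, $reply_port
            for @$ports;
      });

   AnyEvent->condvar->recv;

This is the same reply-port idiom that the first section mentions for RPC,
applied across two processes.
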
And that's all.

=head1 SEE ALSO

L<AnyEvent>

L<AnyEvent::Handle>

L<AnyEvent::MP>

L<AnyEvent::MP::Global>

=head1 AUTHOR

Robin Redeker <elmex@ta-sa.org>