/cvs/AnyEvent-MP/MP/Intro.pod
Revision: 1.26
Committed: Sat Aug 29 16:08:03 2009 UTC (14 years, 11 months ago) by root
Branch: MAIN
Changes since 1.25: +56 -34 lines

=head1 Message Passing for the Non-Blocked Mind

=head1 Introduction and Terminology

This is a tutorial about how to get the hang of the new L<AnyEvent::MP>
module, which allows programs to transparently pass messages within the
process and to other processes on the same or a different host.

What kind of messages? Basically, a message here means a list of Perl
strings, numbers, hashes and arrays, anything that can be expressed as a
L<JSON> text (as JSON is used by default in the protocol). Here are two
examples:

   write_log => 1251555874, "action was successful.\n"
   123, ["a", "b", "c"], { foo => "bar" }
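
As a rough illustration of the "anything that can be expressed as JSON"
rule, the following sketch serialises the two example messages with the
core L<JSON::PP> module and reads them back. It only shows that such lists
survive a JSON round trip; it does not attempt to reproduce the framing
L<AnyEvent::MP> actually uses on the wire, and all variable names are made
up for this example.

```perl
use strict;
use warnings;
use JSON::PP;

# Each message is a plain Perl list; wrap it in an array reference to encode it.
my @messages = (
   [write_log => 1251555874, "action was successful.\n"],
   [123, ["a", "b", "c"], { foo => "bar" }],
);

my $json = JSON::PP->new->canonical;   # canonical: stable hash key order

my @decoded;
for my $msg (@messages) {
   my $text = $json->encode ($msg);        # Perl list -> JSON text
   print "$text\n";
   push @decoded, $json->decode ($text);   # JSON text -> Perl list again
}

# The tag is still the first element after the round trip.
print "tag of first message: $decoded[0][0]\n";
```

Anything that cannot be represented this way (code references, file
handles and so on) is therefore not a suitable message element.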

When using L<AnyEvent::MP> it is customary to use a descriptive string as
the first element of a message, which indicates the type of the message.
This element is called a I<tag> in L<AnyEvent::MP>, as some API functions
(C<rcv>) support matching it directly.

Suppose you want to send a ping message with your current time to
somewhere; this is how such a message might look (in Perl syntax):

   ping => 1251381636

Now that we know what a message is, to which entities are those
messages being I<passed>? They are I<passed> to I<ports>. A I<port> is
a destination for messages but also a context to execute code: when
a runtime error occurs while executing code belonging to a port, the
exception will be raised on the port and can even travel to interested
parties on other nodes, which makes supervision of distributed processes
easy.

How do these ports relate to things you know? Each I<port> belongs
to a I<node>, and a I<node> is just the UNIX process that runs your
L<AnyEvent::MP> application.

Each I<node> is distinguished from other I<nodes> running on the same or
another host in a network by its I<node ID>. A I<node ID> is simply a
unique string chosen manually or assigned by L<AnyEvent::MP> in some way
(UNIX nodename, random string...).

Here is a diagram showing how I<nodes>, I<ports> and UNIX processes relate
to each other. The setup consists of two nodes (more are of course
possible): node C<A> (in UNIX process 7066) with the ports C<ABC> and
C<DEF>, and node C<B> (in UNIX process 8321) with the ports C<FOO> and
C<BAR>.

   |- PID: 7066 -|                  |- PID: 8321 -|
   |             |                  |             |
   | Node ID: A  |                  | Node ID: B  |
   |             |                  |             |
   |   Port ABC =|= <----\ /-----> =|= Port FOO   |
   |             |        X         |             |
   |   Port DEF =|= <----/ \-----> =|= Port BAR   |
   |             |                  |             |
   |-------------|                  |-------------|

The strings for the I<port IDs> here are just for illustrative
purposes: even though I<ports> in L<AnyEvent::MP> are also identified by
strings, they can't be chosen manually and are assigned by the system
dynamically. These I<port IDs> are unique within a network and can also be
used to identify senders or as message tags, for instance.

The next sections will explain the API of L<AnyEvent::MP> by going through
a few simple examples. Later, some more complex idioms are introduced,
which are hopefully useful for solving some real-world problems.

=head1 Passing Your First Message

As a start, let's have a look at the messaging API. The following example
is just a demo to show the basic elements of message passing with
L<AnyEvent::MP>.

The example should print C<Ending with: 123>, in a rather complicated
way, by passing some message to a port.

   use AnyEvent;
   use AnyEvent::MP;

   my $end_cv = AnyEvent->condvar;

   my $port = port;

   rcv $port, test => sub {
      my ($data) = @_;
      $end_cv->send ($data);
   };

   snd $port, test => 123;

   print "Ending with: " . $end_cv->recv . "\n";

It already uses most of the essential functions inside
L<AnyEvent::MP>: first there is the C<port> function, which creates a
I<port> and returns its I<port ID>, a simple string.

This I<port ID> can be used to send messages to the port and install
handlers to receive messages on the port. Since it is a simple string
it can be safely passed to other I<nodes> in the network when you want
to refer to that specific port (usually used for RPC, where you need
to tell the other end which I<port> to send the reply to - messages in
L<AnyEvent::MP> have a destination, but no source).
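
Because messages carry no source, a reply is only possible if the request
itself names the port to answer to. The following is a minimal sketch of
that RPC idiom, using only the C<port>, C<rcv> and C<snd> functions shown
above and running inside a single process. The tags C<double> and
C<result> and the variable names are invented for this example.

```perl
use AnyEvent;
use AnyEvent::MP;

my $done = AnyEvent->condvar;

# "Server" port: answers each request on the port named in the message.
my $server = port;
rcv $server, double => sub {
   my ($number, $reply_port) = @_;
   snd $reply_port, result => $number * 2;
};

# "Client" port: receives the reply and hands it to the condition variable.
my $client = port;
rcv $client, result => sub {
   my ($value) = @_;
   $done->send ($value);
};

# The request carries the client's port ID as its last element.
snd $server, double => 21, $client;

my $reply = $done->recv;
print "Reply: $reply\n";
```

Since C<$client> is just a string, the same request would look identical
if the server port lived on another node.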

The next function is C<rcv>:

   rcv $port, test => sub { ... };

It installs a receiver callback on the I<port> that is specified as the
first argument (it only works for "local" ports, i.e. ports created on the
same node). The next argument, in this example C<test>, specifies a I<tag>
to match. This means that whenever a message with the first element being
the string C<test> is received, the callback is called with the remaining
parts of that message.
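
Tag matching becomes more useful once a port handles more than one kind of
message. The sketch below assumes that C<rcv> accepts several
C<< tag => callback >> pairs in one call, as the L<AnyEvent::MP> manual
page describes; the tags C<note> and C<quit> are made up for illustration.

```perl
use AnyEvent;
use AnyEvent::MP;

my $done = AnyEvent->condvar;
my @log;

my $port = port;

# One rcv call installs two tag matches on the same port.
rcv $port,
   note => sub {
      my ($text) = @_;          # the message without its tag
      push @log, "note: $text";
   },
   quit => sub {
      $done->send;              # a message may consist of the tag alone
   };

snd $port, note => "first";
snd $port, note => "second";
snd $port, "quit";

$done->recv;
print "$_\n" for @log;
```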

Messages can be sent with the C<snd> function, which is used like this in
the example above:

   snd $port, test => 123;

This will send the message C<'test', 123> to the I<port> with the I<port
ID> stored in C<$port>. Since in this case the receiver has a I<tag> match
on C<test> it will call the callback with the first argument being the
number C<123>.

The callback is a typical AnyEvent idiom: the callback just passes
that number on to the I<condition variable> C<$end_cv> which will then
pass the value to the print. Condition variables are out of the scope
of this tutorial and not often used with ports, so please consult the
L<AnyEvent::Intro> about them.
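
For readers who have not met condition variables yet, here is a small
sketch of the idiom on its own, without any ports. It uses only
C<< AnyEvent->condvar >> and C<< AnyEvent->timer >>; the 0.1 second delay
and the string passed to C<send> are arbitrary choices for this example.

```perl
use AnyEvent;

my $cv = AnyEvent->condvar;

# Deliver a value to the condition variable a little later, from the event loop.
my $timer = AnyEvent->timer (after => 0.1, cb => sub {
   $cv->send ("timer fired");
});

# recv runs the event loop until send has been called, then returns its value.
my $value = $cv->recv;
print "Got: $value\n";
```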

Passing messages inside just one process is boring. Before we can move on
and do interprocess message passing we first have to make sure some things
have been set up correctly for our nodes to talk to each other.

=head1 System Requirements and System Setup

Before we can start with real IPC we have to make sure some things work on
your system.

First we have to set up a I<shared secret>: for two L<AnyEvent::MP>
I<nodes> to be able to communicate with each other over the network it is
necessary to set up the same I<shared secret> for both of them, so they can
prove their trustworthiness to each other.

The easiest way to set this up is to use the F<aemp> utility:

   aemp gensecret

This creates a F<$HOME/.perl-anyevent-mp> config file and generates a
random shared secret. You can copy this file to any other system and
then communicate over the network (via TCP) with it. You can also select
your own shared secret (F<aemp setsecret>) and for increased security
requirements you can even create (or configure) a TLS certificate (F<aemp
gencert>), causing connections to not just be securely authenticated, but
also to be encrypted and protected against tampering.

Connections will only be successfully established when the I<nodes>
that want to connect to each other have the same I<shared secret> (or
successfully verify the TLS certificate of the other side, in which case
no shared secret is required).

B<If something does not work as expected, and for example tcpdump shows
that the connections are closed almost immediately, you should make sure
that F<~/.perl-anyevent-mp> is the same on all hosts/user accounts that
you try to connect with each other!>
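
One way to check and repair this is sketched below. It assumes C<ssh> and
C<scp> access to the other machine and uses the POSIX C<cksum> tool;
C<otherhost> is a placeholder for your own host name.

```shell
# Compare the config file on both machines; the two output lines should
# show the same checksum and size.
cksum < ~/.perl-anyevent-mp
ssh otherhost 'cksum < .perl-anyevent-mp'

# If they differ, copy the local file to the same user account on the other host.
scp ~/.perl-anyevent-mp otherhost:
```

The file contains the shared secret, so it should only be copied over
channels you trust and should not be readable by other users.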

That is all for now; you will find some more advanced fiddling with the
C<aemp> utility later.

=head1 Passing Messages Between Processes

=head2 The Receiver

Let's split the previous example up into two programs: one that contains
the sender and one for the receiver. First the receiver application, in
full:

   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_receiver";

   my $port = port;

   AnyEvent::MP::Global::register $port, "eg_receivers";

   rcv $port, test => sub {
      my ($data, $reply_port) = @_;

      print "Received data: " . $data . "\n";
   };

   AnyEvent->condvar->recv;

=head3 AnyEvent::MP::Global

Now, that wasn't too bad, was it? OK, let's step through the new functions
and modules that have been used.

For starters, there is now an additional module being
used: L<AnyEvent::MP::Global>. This module provides us with a I<global
registry>, which lets us register ports in groups that are visible on all
I<nodes> in a network.

What is this useful for? Well, the I<port IDs> are random-looking strings,
assigned by L<AnyEvent::MP>. We cannot know those I<port IDs> in advance,
so we don't know which I<port ID> to send messages to, especially when the
message is to be passed between different I<nodes> (or UNIX processes). To
find the right I<port> of another I<node> in the network we will need
to communicate this somehow to the sender. And that is exactly what
L<AnyEvent::MP::Global> provides.

Especially in larger, more anonymous networks this is handy: imagine you
have a few database backends, a few web frontends and some processing
distributed over a number of hosts: all of these would simply register
themselves in the appropriate group, and your web frontends can start to
find some database backend.

=head3 C<initialise_node> And The Network

Now, let's have a look at the new function, C<initialise_node>:

   initialise_node "eg_simple_receiver";

Before we are able to send messages to other nodes we have to initialise
ourselves to become a "distributed node". Initialising a node means naming
the node, optionally binding some TCP listeners so that other nodes can
contact it, and connecting to a predefined set of seed addresses so the
node can discover the existing network - and the existing network can
discover the node!

The first argument, the string C<"eg_simple_receiver">, is the so-called
I<profile> to use: a profile holds some information about the application
that is going to be a node in an L<AnyEvent::MP> network. Customarily you
don't specify a profile name at all: in this case, AnyEvent::MP will use
the POSIX nodename.

The profile allows you to set the I<node ID> that your application will
use (the node ID defaults to the profile name if not specified). You can
also set I<binds> in the profile, meaning that you can define TCP ports
that the application will listen on for incoming connections from other
nodes of the network.

You should also configure I<seeds> in the profile: a I<seed> is just a
TCP address of some other node in the network. To explain this in a bit
more detail we have to look at the topology of an L<AnyEvent::MP>
network. The topology is called a I<fully connected mesh>; here is an
example with 4 nodes:

   N1--N2
   | \/ |
   | /\ |
   N3--N4

Now imagine another I<node>, C<N5>, wants to connect itself to that network:

   N1--N2
   | \/ |    N5
   | /\ |
   N3--N4

The new node needs to know the I<binds> of all nodes already
connected. This is exactly what the I<seeds> are for: let's assume that
the new node (C<N5>) uses the TCP address of the node C<N2> as seed. This
causes it to connect to C<N2>:

   N1--N2____
   | \/ |    N5
   | /\ |
   N3--N4

C<N2> then tells C<N5> about the I<binds> of the other nodes it is
connected to, and C<N5> creates the rest of the connections:

    /--------\
   N1--N2____|
   | \/ |    N5
   | /\ |   /|
   N3--N4--- |
    \________/

All done: C<N5> is now happily connected to the rest of the network.
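
The joining procedure can be mimicked with a few lines of plain Perl. This
is only a toy model of the idea in the diagrams above, not the actual
L<AnyEvent::MP> protocol: it keeps the connections in a hash and lets a new
node contact its seed and then every node the seed already knows. The
function names C<connect_nodes> and C<join_mesh> are invented for this
sketch.

```perl
use strict;
use warnings;

# Toy model: %peers maps each node name to the set of nodes it is connected to.
my %peers;

sub connect_nodes {
   my ($x, $y) = @_;
   $peers{$x}{$y} = 1;
   $peers{$y}{$x} = 1;
}

# The existing fully connected mesh of four nodes.
my @mesh = qw(N1 N2 N3 N4);
for my $i (0 .. $#mesh) {
   connect_nodes ($mesh[$i], $mesh[$_]) for $i + 1 .. $#mesh;
}

# A new node joins by contacting its seed, then every node the seed knows.
sub join_mesh {
   my ($new, $seed) = @_;
   my @known = keys %{ $peers{$seed} };   # what the seed can tell us about
   connect_nodes ($new, $seed);
   connect_nodes ($new, $_) for @known;
}

join_mesh (N5 => "N2");

# Every node should now list the four other nodes as peers.
print "$_: ", join (" ", sort keys %{ $peers{$_} }), "\n"
   for sort keys %peers;
```

A single reachable seed is enough in this model, which is why the sender
profile later in this tutorial only needs one seed address.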

=head3 Setting Up The Profiles

OK, so much for the profile. Now let's set up the C<eg_simple_receiver>
I<profile> for later use. For the receiver we just give the receiver a
I<bind>:

   aemp profile eg_simple_receiver setbinds localhost:12266

We use C<localhost> in the example, but in the real world, you usually
want to use the "real" IP address of your node, so hosts can connect to
it. Of course, you can specify many binds, and it is also perfectly useful
to run multiple nodes on the same host. Just keep in mind that other nodes
will try to I<connect> to those addresses, and this had better succeed if
you want your network to be in good working condition.

While we are at it, we set up the I<profile> for the sender in the
second part of this example, too. We will call the sender I<profile>
C<eg_simple_sender>. For the sender we set up a I<seed> pointing to the
receiver:

   aemp profile eg_simple_sender setseeds localhost:12266
   aemp profile eg_simple_sender setbinds

You might wonder why we set up I<binds> to be empty here: actually, the
I<fully> in the I<fully connected mesh> is not the complete truth. If you
don't configure any I<binds> for a node profile it will parse and try to
resolve the node ID to find addresses to bind to. In this case we pretend
that we do not want this and explicitly specify an empty binds list, so
the node will not actually listen on any TCP ports.

Nodes without listeners will not be able to send messages to other nodes
without listeners, but they can still talk to all other nodes. For this
example, as well as in many cases in the real world, we can live with this
restriction, and this makes it easier to avoid DNS (assuming your setup is
broken, eliminating one potential problem :).
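
The restriction can be stated as a simple rule: two nodes can talk as soon
as at least one of them has a listener the other can connect to. The
following toy model spells that rule out; it does not use L<AnyEvent::MP>
at all, and the node names and the C<can_talk> helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical nodes: 1 means the node has at least one bind (TCP listener).
my %has_listener = (
   receiver => 1,
   sender_a => 0,
   sender_b => 0,
);

# A connection needs one side to listen, so a pair can talk
# as soon as at least one of the two nodes has a listener.
sub can_talk {
   my ($x, $y) = @_;
   return $has_listener{$x} || $has_listener{$y} ? 1 : 0;
}

my @nodes = sort keys %has_listener;
for my $i (0 .. $#nodes) {
   for my $j ($i + 1 .. $#nodes) {
      my ($x, $y) = @nodes[$i, $j];
      printf "%-8s <-> %-8s : %s\n", $x, $y,
         can_talk ($x, $y) ? "ok" : "no connection";
   }
}
```

In this model the two senders can both reach the receiver but not each
other, which matches the setup used in this tutorial.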

=head3 Registering The Receiver

OK, where were we? We have now discussed the basic purpose of
L<AnyEvent::MP::Global> and of C<initialise_node> and its relation to
profiles. We also set up our profiles for later use, and can now continue
with the receiver example.

Let's look at the next undiscussed lines of code:

   my $port = port;
   AnyEvent::MP::Global::register $port, "eg_receivers";

The C<port> function has already been discussed. It just creates a new
I<port> and gives us the I<port ID>. Now to the C<register> function of
L<AnyEvent::MP::Global>: the first argument is a I<port ID> that we want to
add to a I<global group>, and its second argument is the name of that
I<global group>.

You can choose the name of such a I<global group> freely, and its purpose
is to store a set of I<port IDs>. That set is made available throughout the
whole L<AnyEvent::MP> network, so that each node can see which ports belong
to that group.

The sender will later look for the ports in that I<global group> and send
messages to them.

The last step in the example is to set up a receiver callback for those
messages, as discussed in the first example. We again match on the I<tag>
C<test>. The only difference is that we don't end the application after
receiving the first message. We just keep waiting for new messages
indefinitely.

=head2 The Sender

OK, now let's take a look at the sender:

   #!/opt/perl/bin/perl
   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_sender";

   my $find_timer =
      AnyEvent->timer (after => 0, interval => 1, cb => sub {
         my $ports = AnyEvent::MP::Global::find "eg_receivers"
            or return;

         snd $_, test => time
            for @$ports;
      });

   AnyEvent->condvar->recv;

It's even less code. The C<initialise_node> function is known from the
receiver above. As discussed in the section where we set up the profiles,
we configure this application to use the I<profile> C<eg_simple_sender>.

Next we set up a timer that repeatedly calls this chunk of code:

   my $ports = AnyEvent::MP::Global::find "eg_receivers"
      or return;

   snd $_, test => time
      for @$ports;

The new function here is the C<find> function of L<AnyEvent::MP::Global>.
It searches the I<global group> named C<eg_receivers> for ports. If none
are found, C<undef> is returned and we wait for the next time the timer
fires.

Once the receiver application has connected and its newly registered port
has propagated to the sender, C<find> returns an array reference that
contains the I<port ID> of the receiver I<port(s)>.

We then just send a message to every I<port> in the I<global group>,
consisting of the I<tag> C<test> and the current time in the form of a UNIX
timestamp.

And that's all.

=head1 SEE ALSO

L<AnyEvent>

L<AnyEvent::Handle>

L<AnyEvent::MP>

L<AnyEvent::MP::Global>

=head1 AUTHOR

   Robin Redeker <elmex@ta-sa.org>