=head1 Message Passing for the Non-Blocked Mind

=head1 Introduction and Terminology

This is a tutorial about how to get the swing of the new L<AnyEvent::MP>
module, which allows programs to transparently pass messages within the
process and to other processes on the same or a different host.

What kind of messages? Basically a message here means a list of Perl
strings, numbers, hashes and arrays, anything that can be expressed as a
L<JSON> text (as JSON is used by default in the protocol). Here are two
examples:

   write_log => 1251555874, "action was successful.\n"
   123, ["a", "b", "c"], { foo => "bar" }

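To make the "anything that can be expressed as JSON" rule concrete, here
is a minimal sketch that round-trips the first example message through
the core L<JSON::PP> module. It only illustrates the data model; it makes
no claim about how L<AnyEvent::MP> frames messages on the wire, and the
variable names are made up for this illustration.

   use strict;
   use warnings;
   use JSON::PP;

   # a message is just a Perl list; the first element is the tag
   my @msg = (write_log => 1251555874, "action was successful.\n");

   # canonical only makes hash key order stable, it is not required
   my $json = JSON::PP->new->canonical;

   my $text = $json->encode (\@msg);   # the list as one JSON array
   my $copy = $json->decode ($text);   # and back into a Perl array

   print "as JSON: $text\n";
   print "tag:     $copy->[0]\n";

Anything that does not survive such a round trip (code references, file
handles, blessed objects) is not a good candidate for a message element.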
When using L<AnyEvent::MP> it is customary to use a descriptive string as
the first element of a message, which indicates the type of the message.
This element is called a I<tag> in L<AnyEvent::MP>, as some API functions
(C<rcv>) support matching it directly.

Suppose you want to send a ping message with your current time to
somewhere. This is how such a message might look (in Perl syntax):

   ping => 1251381636

Now that we know what a message is, to which entities are those
messages being I<passed>? They are I<passed> to I<ports>. A I<port> is
a destination for messages but also a context to execute code: when
a runtime error occurs while executing code belonging to a port, the
exception will be raised on the port and can even travel to interested
parties on other nodes, which makes supervision of distributed processes
easy.

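The following is a hedged sketch of what "the exception will be raised on
the port" can look like in practice. It uses the C<mon> function of
L<AnyEvent::MP>, which this tutorial does not otherwise cover, and it
assumes that your installed version exports it and reports the reason a
port died to the monitor callback. The tag C<crash> is made up for this
illustration.

   use AnyEvent;
   use AnyEvent::MP;

   my $done = AnyEvent->condvar;

   my $port = port;

   # a handler that fails on purpose
   rcv $port, crash => sub {
      die "something went wrong\n";
   };

   # an interested party: called with the reason when the port dies
   mon $port, sub {
      $done->send (join ", ", @_);
   };

   snd $port, "crash";

   print "port died: " . $done->recv;

The monitor here lives in the same process, but the paragraph above
describes the same idea for interested parties on other nodes.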
How do these ports relate to things you know? Each I<port> belongs
to a I<node>, and a I<node> is just the UNIX process that runs your
L<AnyEvent::MP> application.

Each I<node> is distinguished from other I<nodes> running on the same or
another host in a network by its I<node ID>. A I<node ID> is simply a
unique string chosen manually or assigned by L<AnyEvent::MP> in some way
(UNIX nodename, random string...).

Here is a diagram of how I<nodes>, I<ports> and UNIX processes relate
to each other. The setup consists of two nodes (more are of course
possible): node C<A> (in UNIX process 7066) with the ports C<ABC> and
C<DEF>, and node C<B> (in UNIX process 8321) with the ports C<FOO> and
C<BAR>.

   |- PID: 7066 -|                  |- PID: 8321 -|
   |             |                  |             |
   | Node ID: A  |                  | Node ID: B  |
   |             |                  |             |
   |   Port ABC =|= <----\ /-----> =|= Port FOO   |
   |             |        X         |             |
   |   Port DEF =|= <----/ \-----> =|= Port BAR   |
   |             |                  |             |
   |-------------|                  |-------------|

The strings for the I<port IDs> here are just for illustrative
purposes: even though I<ports> in L<AnyEvent::MP> are also identified by
strings, they can't be chosen manually and are assigned by the system
dynamically. These I<port IDs> are unique within a network and can also be
used to identify senders or as message tags, for instance.

The next sections will explain the API of L<AnyEvent::MP> by going through
a few simple examples. Later some more complex idioms are introduced,
which are hopefully useful to solve some real world problems.

=head1 Passing Your First Message

As a start, let's have a look at the messaging API. The following example
is just a demo to show the basic elements of message passing with
L<AnyEvent::MP>.

The example should print C<Ending with: 123>, in a rather complicated
way, by passing some message to a port.

   use AnyEvent;
   use AnyEvent::MP;

   my $end_cv = AnyEvent->condvar;

   # create a port, $port holds its port ID
   my $port = port;

   # handle messages tagged "test" on that port
   rcv $port, test => sub {
      my ($data) = @_;
      $end_cv->send ($data);
   };

   # send the message (test => 123) to the port
   snd $port, test => 123;

   print "Ending with: " . $end_cv->recv . "\n";

It already uses most of the essential functions inside
L<AnyEvent::MP>: first there is the C<port> function, which will create a
I<port> and will return its I<port ID>, a simple string.

This I<port ID> can be used to send messages to the port and install
handlers to receive messages on the port. Since it is a simple string
it can be safely passed to other I<nodes> in the network when you want
to refer to that specific port (usually used for RPC, where you need
to tell the other end which I<port> to send the reply to - messages in
L<AnyEvent::MP> have a destination, but no source).

The next function is C<rcv>:

   rcv $port, test => sub { ... };

It installs a receiver callback on the I<port> that is specified as the
first argument (it only works for "local" ports, i.e. ports created on the
same node). The next argument, in this example C<test>, specifies a I<tag>
to match. This means that whenever a message with the first element being
the string C<test> is received, the callback is called with the remaining
parts of that message.

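As a sketch of tag matching with more than one kind of message, the
following example installs two handlers on the same port. It assumes that
C<rcv> accepts several tag/callback pairs in one call, as documented for
L<AnyEvent::MP>; if your version does not, two separate C<rcv> calls
should behave the same. The tags C<add> and C<quit> are made up for this
illustration, and the C<snd> calls work exactly as in the full example
above.

   use AnyEvent;
   use AnyEvent::MP;

   my $done = AnyEvent->condvar;

   my $port = port;

   rcv $port,
      # called with (1, 2) for the message (add => 1, 2)
      add  => sub {
         my ($x, $y) = @_;
         print "sum: " . ($x + $y) . "\n";
      },
      # called without arguments for the message ("quit")
      quit => sub {
         $done->send;
      };

   snd $port, add => 1, 2;
   snd $port, "quit";

   $done->recv;

Each callback only ever sees the part of the message that follows its
tag.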
Messages can be sent with the C<snd> function, which is used like this in
the example above:

   snd $port, test => 123;

This will send the message C<'test', 123> to the I<port> with the I<port
ID> stored in C<$port>. Since in this case the receiver has a I<tag> match
on C<test> it will call the callback with the first argument being the
number C<123>.

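Because messages have a destination but no source, a request that expects
an answer has to carry the I<port ID> to reply to. The following is a
minimal sketch of that RPC idiom inside a single process, using only
C<port>, C<rcv> and C<snd>. The tags C<add> and C<sum> and the variable
names are made up for this illustration.

   use AnyEvent;
   use AnyEvent::MP;

   my $done = AnyEvent->condvar;

   # the "server": adds two numbers and replies to the given port
   my $server = port;
   rcv $server, add => sub {
      my ($x, $y, $reply_port) = @_;
      snd $reply_port, sum => $x + $y;
   };

   # the "client": a port of its own to receive the reply on
   my $client = port;
   rcv $client, sum => sub {
      my ($sum) = @_;
      $done->send ($sum);
   };

   # the last element of the request is the port to reply to
   snd $server, add => 20, 22, $client;

   print "Sum: " . $done->recv . "\n";

Since a I<port ID> is a plain string, the same pattern should carry over
unchanged when client and server live on different nodes.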
The callback is a typical AnyEvent idiom: the callback just passes
that number on to the I<condition variable> C<$end_cv>, which will then
pass the value to the C<print>. Condition variables are out of the scope
of this tutorial and not often used with ports, so please consult
L<AnyEvent::Intro> about them.

Passing messages inside just one process is boring. Before we can move on
and do interprocess message passing we first have to make sure some things
have been set up correctly for our nodes to talk to each other.

=head1 System Requirements and System Setup

Before we can start with real IPC we have to make sure some things work on
your system.

First we have to set up a I<shared secret>: for two L<AnyEvent::MP>
I<nodes> to be able to communicate with each other over the network it is
necessary to set up the same I<shared secret> for both of them, so they can
prove their trustworthiness to each other.

The easiest way to set this up is to use the F<aemp> utility:

   aemp gensecret

This creates a F<$HOME/.perl-anyevent-mp> config file and generates a
random shared secret. You can copy this file to any other system and
then communicate over the network (via TCP) with it. You can also select
your own shared secret (F<aemp setsecret>) and for increased security
requirements you can even create (or configure) a TLS certificate (F<aemp
gencert>), causing connections to not just be securely authenticated, but
also to be encrypted and protected against tampering.

Connections will only be successfully established when the I<nodes>
that want to connect to each other have the same I<shared secret> (or
successfully verify the TLS certificate of the other side, in which case
no shared secret is required).

B<If something does not work as expected, and for example tcpdump shows
that the connections are closed almost immediately, you should make sure
that F<~/.perl-anyevent-mp> is the same on all hosts/user accounts that
you try to connect with each other!>

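One simple way to check that the config file really is the same everywhere
is to compare a checksum of it on each host. The following sketch uses
only the core L<Digest::SHA> module; the helper name
C<config_fingerprint> is made up for this illustration and is not part of
L<AnyEvent::MP> or F<aemp>.

   use strict;
   use warnings;
   use Digest::SHA;

   # returns the SHA-256 hex digest of a file, or undef if unreadable
   sub config_fingerprint {
      my ($file) = @_;

      return undef unless -r $file;

      my $sha = Digest::SHA->new (256);
      $sha->addfile ($file);
      $sha->hexdigest
   }

   my $file = ($ENV{HOME} // ".") . "/.perl-anyevent-mp";

   if (defined (my $digest = config_fingerprint $file)) {
      print "$digest  $file\n";
   } else {
      print "cannot read $file, did you run 'aemp gensecret'?\n";
   }

Run it under every user account involved: if the printed digests differ,
the files differ, and the nodes will most likely refuse to talk to each
other.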
That is all for now, you will find some more advanced fiddling with the
C<aemp> utility later.

=head1 Passing Messages Between Processes

=head2 The Receiver

Let's split the previous example up into two programs: one that contains
the sender and one for the receiver. First the receiver application, in
full:

   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_receiver";

   my $port = port;

   AnyEvent::MP::Global::register $port, "eg_receivers";

   rcv $port, test => sub {
      # the sender in this example only sends one data element
      my ($data) = @_;

      print "Received data: " . $data . "\n";
   };

   # keep the process running and processing events
   AnyEvent->condvar->recv;

=head3 AnyEvent::MP::Global

Now, that wasn't too bad, was it? OK, let's step through the new functions
and modules that have been used.

For starters, there is now an additional module being
used: L<AnyEvent::MP::Global>. This module provides us with a I<global
registry>, which lets us register ports in groups that are visible on all
I<nodes> in a network.

What is this useful for? Well, the I<port IDs> are random-looking strings,
assigned by L<AnyEvent::MP>. We cannot know those I<port IDs> in advance,
so we don't know which I<port ID> to send messages to, especially when the
message is to be passed between different I<nodes> (or UNIX processes). To
find the right I<port> of another I<node> in the network we will need
to communicate this somehow to the sender. And exactly that is what
L<AnyEvent::MP::Global> provides.

Especially in larger, more anonymous networks this is handy: imagine you
have a few database backends, a few web frontends and some processing
distributed over a number of hosts: all of these would simply register
themselves in the appropriate group, and your web frontends can start to
find some database backend.

=head3 C<initialise_node> And The Network

Now, let's have a look at the new function, C<initialise_node>:

   initialise_node "eg_simple_receiver";

Before we are able to send messages to other nodes we have to initialise
ourselves to become a "distributed node". Initialising a node means naming
the node, optionally binding some TCP listeners so that other nodes can
contact it, and connecting to a predefined set of seed addresses so the
node can discover the existing network - and the existing network can
discover the node!

The first argument, the string C<"eg_simple_receiver">, is the so-called
I<profile> to use: a profile holds some information about the application
that is going to be a node in an L<AnyEvent::MP> network. Customarily you
don't specify a profile name at all: in this case, AnyEvent::MP will use
the POSIX nodename.

The profile allows you to set the I<node ID> that your application will
use (the node ID defaults to the profile name if not specified). You can
also set I<binds> in the profile, meaning that you can define TCP ports
that the application will listen on for incoming connections from other
nodes of the network.

You should also configure I<seeds> in the profile: a I<seed> is just a
TCP address of some other node in the network. To explain this in a bit
more detail we have to look at the topology of an L<AnyEvent::MP>
network. The topology is called a I<fully connected mesh>; here is an
example with 4 nodes:

   N1--N2
   | \/ |
   | /\ |
   N3--N4

Now imagine another I<node>, C<N5>, wants to connect itself to that network:

   N1--N2
   | \/ |    N5
   | /\ |
   N3--N4

The new node needs to know the I<binds> of all nodes already
connected. Exactly this is what the I<seeds> are for: let's assume that
the new node (C<N5>) uses the TCP address of the node C<N2> as seed. This
causes it to connect to C<N2>:

   N1--N2____
   | \/ |    N5
   | /\ |
   N3--N4

C<N2> then tells C<N5> about the I<binds> of the other nodes it is
connected to, and C<N5> creates the rest of the connections:

    /--------\
   N1--N2____|
   | \/ |    N5
   | /\ |   /|
   N3--N4--- |
    \________/

All done: C<N5> is now happily connected to the rest of the network.

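The discovery steps above can be played through in a few lines of plain
Perl. This is only a toy simulation of the idea, not how L<AnyEvent::MP>
implements it: no sockets are opened, and the bind addresses are invented
for this illustration.

   use strict;
   use warnings;

   # binds of the nodes that already form the mesh (hypothetical)
   my %binds = (
      N1 => "10.0.0.1:4040",
      N2 => "10.0.0.2:4040",
      N3 => "10.0.0.3:4040",
      N4 => "10.0.0.4:4040",
   );

   # in a full mesh, every node knows the binds of all other nodes
   my %knows;
   for my $node (keys %binds) {
      $knows{$node} = { map { $_ => $binds{$_} } grep { $_ ne $node } keys %binds };
   }

   # step 1: N5 only knows its seed, the address of N2
   my $seed      = "N2";
   my %connected = ($seed => $binds{$seed});

   # step 2: the seed tells N5 about the binds of its peers,
   # and N5 connects to each of them
   for my $peer (sort keys %{ $knows{$seed} }) {
      $connected{$peer} = $knows{$seed}{$peer};
   }

   print "N5 is connected to: " . (join " ", sort keys %connected) . "\n";

A single reachable seed is enough for C<N5> to end up with a connection
to every node, which is why one seed per profile is often sufficient.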
=head3 Setting Up The Profiles

OK, so much for the profile. Now let's set up the C<eg_simple_receiver>
I<profile> for later use. For the receiver we just give the receiver a
I<bind>:

   aemp profile eg_simple_receiver setbinds localhost:12266

We use C<localhost> in the example, but in the real world, you usually
want to use the "real" IP address of your node, so hosts can connect to
it. Of course, you can specify many binds, and it is also perfectly useful
to run multiple nodes on the same host. Just keep in mind that other nodes
will try to I<connect> to those addresses, and this had better succeed if
you want your network to be in good working condition.

While we are at it, we set up the I<profile> for the sender in the
second part of this example, too. We will call the sender I<profile>
C<eg_simple_sender>. For the sender we set up a I<seed> pointing to the
receiver:

   aemp profile eg_simple_sender setseeds localhost:12266
   aemp profile eg_simple_sender setbinds

You might wonder why we set up I<binds> to be empty here: actually, the
I<fully> in the I<fully connected mesh> is not the complete truth. If you
don't configure any I<binds> for a node profile it will parse and try to
resolve the node ID to find addresses to bind to. In this case we pretend
that we do not want this and explicitly specify an empty binds list, so
the node will not actually listen on any TCP ports.

Nodes without listeners will not be able to send messages to other nodes
without listeners, but they can still talk to all other nodes. For this
example, as well as in many cases in the real world, we can live with this
restriction, and this makes it easier to avoid DNS (assuming your setup is
broken, eliminating one potential problem :).

=head3 Registering The Receiver

OK, where were we? We have now discussed the basic purpose of
L<AnyEvent::MP::Global> and of C<initialise_node> with its relation to
profiles. We also set up our profiles for later use and now have to
continue talking about the receiver example.

Let's look at the next undiscussed lines of code:

   my $port = port;
   AnyEvent::MP::Global::register $port, "eg_receivers";

The C<port> function has already been discussed. It just creates a new
I<port> and gives us the I<port ID>. Now to the C<register> function of
L<AnyEvent::MP::Global>: the first argument is a I<port ID> that we want to
add to a I<global group>, and its second argument is the name of that
I<global group>.

You can choose the name of such a I<global group> freely, and its purpose
is to store a set of I<port IDs>. That set is made available throughout the
whole L<AnyEvent::MP> network, so that each node can see which ports belong
to that group.

The sender will later look for the ports in that I<global group> and send
messages to them.

The last step in the example is to set up a receiver callback for those
messages, as discussed in the first example. We again match for the
I<tag> C<test>. The difference is just that we don't end the application
after receiving the first message. We just continue to look out for new
messages indefinitely.

=head2 The Sender

OK, now let's take a look at the sender:

   #!/opt/perl/bin/perl
   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_sender";

   my $find_timer =
      AnyEvent->timer (after => 0, interval => 1, cb => sub {
         my $ports = AnyEvent::MP::Global::find "eg_receivers"
            or return;

         snd $_, test => time
            for @$ports;
      });

   AnyEvent->condvar->recv;

It's even less code. The C<initialise_node> function is known from the
receiver above. As discussed in the section where we set up the profiles,
we configure this application to use the I<profile> C<eg_simple_sender>.

Next we set up a timer that repeatedly calls this chunk of code:

   my $ports = AnyEvent::MP::Global::find "eg_receivers"
      or return;

   snd $_, test => time
      for @$ports;

The new function here is the C<find> function of L<AnyEvent::MP::Global>. It
searches in the I<global group> named C<eg_receivers> for ports. If none are
found, C<undef> is returned and we wait for the next time the timer fires.

Once the receiver application has connected and the port it registered
has propagated to the sender, C<find> returns an array reference
that contains the I<port IDs> of the receiver I<port(s)>.

We then just send to every I<port> in the I<global group> a message consisting
of the I<tag> C<test> and the current time in the form of a UNIX timestamp.

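Sending to every port in the group is not the only option. In the "web
frontend looks for some database backend" scenario mentioned earlier, a
sender would more likely pick just one member of the group. The following
variation of the sender is a sketch of that idea; it assumes the same
C<find> behaviour as described above and the C<eg_simple_sender> profile
set up earlier, and the random choice is just the simplest possible
selection strategy.

   use AnyEvent;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   initialise_node "eg_simple_sender";

   my $find_timer =
      AnyEvent->timer (after => 0, interval => 1, cb => sub {
         my $ports = AnyEvent::MP::Global::find ("eg_receivers")
            or return;

         # guard against an empty group
         @$ports
            or return;

         # pick one receiver at random instead of messaging all of them
         my $target = $ports->[rand @$ports];

         snd $target, test => time;
      });

   AnyEvent->condvar->recv;

With several receivers registered in C<eg_receivers>, each message then
ends up at only one of them.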
And that's all.

=head1 SEE ALSO

L<AnyEvent>

L<AnyEvent::Handle>

L<AnyEvent::MP>

L<AnyEvent::MP::Global>

=head1 AUTHOR

  Robin Redeker <elmex@ta-sa.org>
