/cvs/AnyEvent-MP/MP/Intro.pod
Revision: 1.47
Committed: Sun Mar 4 19:45:03 2012 UTC by root
Branch: MAIN

=head1 Message Passing for the Non-Blocked Mind

=head1 Introduction and Terminology

This is a tutorial about how to get the swing of the new L<AnyEvent::MP>
module, which allows programs to transparently pass messages within the
process and to other processes on the same or a different host.

What kind of messages? Basically a message here means a list of Perl
strings, numbers, hashes and arrays, anything that can be expressed as a
L<JSON> text (as JSON is the default serialiser in the protocol). Here are
two examples:

   write_log => 1251555874, "action was successful.\n"
   123, ["a", "b", "c"], { foo => "bar" }

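Since JSON is the default serialiser, such a message can be pictured as a
Perl list round-tripping through JSON. A minimal sketch using the core
L<JSON::PP> module (this only illustrates the idea; it is not the actual
AnyEvent::MP wire framing):

```perl
use JSON::PP;

# a message is just a list of JSON-representable values
my @msg = (write_log => 1251555874, "action was successful.\n");

# encode the message as a JSON array, as a transport might do
my $json = encode_json \@msg;

# decoding yields the same list again
my @copy = @{ decode_json $json };
print "$copy[0]\n";   # prints "write_log"
```
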
When using L<AnyEvent::MP> it is customary to use a descriptive string as
the first element of a message, indicating the type of the message. This
element is called a I<tag> in L<AnyEvent::MP>, as some API functions
(C<rcv>) support matching it directly.

Suppose you want to send a ping message with your current time to
somewhere; this is what such a message might look like (in Perl syntax):

   ping => 1251381636

Now that we know what a message is, to which entities are those
messages being I<passed>? They are I<passed> to I<ports>. A I<port> is
a destination for messages but also a context to execute code: when
a runtime error occurs while executing code belonging to a port, the
exception will be raised on the port and can even travel to interested
parties on other nodes, which makes supervision of distributed processes
easy.

How do these ports relate to things you know? Each I<port> belongs
to a I<node>, and a I<node> is just the UNIX process that runs your
L<AnyEvent::MP> application.

Each I<node> is distinguished from other I<nodes> running on the same or
another host in a network by its I<node ID>. A I<node ID> is simply a
unique string chosen manually or assigned by L<AnyEvent::MP> in some way
(UNIX nodename, random string...).

Here is a diagram of how I<nodes>, I<ports> and UNIX processes relate
to each other. The setup consists of two nodes (more are of course
possible): node C<A> (in UNIX process 7066) with the ports C<ABC> and
C<DEF>, and node C<B> (in UNIX process 8321) with the ports C<FOO> and
C<BAR>.

   |- PID: 7066 -|                  |- PID: 8321 -|
   |             |                  |             |
   | Node ID: A  |                  | Node ID: B  |
   |             |                  |             |
   |   Port ABC =|= <----\  /-----> =|= Port FOO  |
   |             |        X         |             |
   |   Port DEF =|= <----/  \-----> =|= Port BAR  |
   |             |                  |             |
   |-------------|                  |-------------|

The strings for the I<port IDs> here are just for illustrative
purposes: even though I<ports> in L<AnyEvent::MP> are also identified by
strings, they can't be chosen manually and are assigned by the system
dynamically. These I<port IDs> are unique within a network and can also be
used to identify senders, or even as message tags for instance.

The next sections will explain the API of L<AnyEvent::MP> by going through
a few simple examples. Later some more complex idioms are introduced,
which are hopefully useful to solve some real-world problems.

=head2 Passing Your First Message

For starters, let's have a look at the messaging API. The following
example is just a demo to show the basic elements of message passing with
L<AnyEvent::MP>.

The example should print C<Ending with: 123>, in a rather complicated
way, by passing some message to a port.

   use AnyEvent;
   use AnyEvent::MP;

   my $end_cv = AnyEvent->condvar;

   my $port = port;

   rcv $port, test => sub {
      my ($data) = @_;
      $end_cv->send ($data);
   };

   snd $port, test => 123;

   print "Ending with: " . $end_cv->recv . "\n";

It already uses most of the essential functions inside
L<AnyEvent::MP>: first there is the C<port> function, which creates a
I<port> and returns its I<port ID>, a simple string.

This I<port ID> can be used to send messages to the port and install
handlers to receive messages on the port. Since it is a simple string
it can be safely passed to other I<nodes> in the network when you want
to refer to that specific port (usually used for RPC, where you need
to tell the other end which I<port> to send the reply to - messages in
L<AnyEvent::MP> have a destination, but no source).

The next function is C<rcv>:

   rcv $port, test => sub { ... };

It installs a receiver callback on the I<port> that is specified as the first
argument (it only works for "local" ports, i.e. ports created on the same
node). The next argument, in this example C<test>, specifies a I<tag> to
match. This means that whenever a message with the first element being
the string C<test> is received, the callback is called with the remaining
parts of that message.
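
Conceptually, tag matching is just a dispatch on the first element of the
message. Here is a plain-Perl sketch of that idea (illustrative only -
C<%handlers> and C<dispatch> are made-up names, not AnyEvent::MP
internals):

```perl
# map tags to handler callbacks, roughly as rcv does per port
my %handlers = (
   test => sub { print "got test with @_\n" },
);

sub dispatch {
   my ($tag, @rest) = @_;   # a message: tag first, data after

   # look up the handler for the tag and pass the remaining parts
   if (my $cb = $handlers{$tag}) {
      $cb->(@rest);
   }
}

dispatch test => 123;   # prints "got test with 123"
```
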

Messages can be sent with the C<snd> function, which is used like this in
the example above:

   snd $port, test => 123;

This will send the message C<'test', 123> to the I<port> with the I<port
ID> stored in C<$port>. Since in this case the receiver has a I<tag> match
on C<test>, it will call the callback with the first argument being the
number C<123>.

The callback is a typical AnyEvent idiom: it just passes
that number on to the I<condition variable> C<$end_cv>, which in turn
passes the value to the C<print>. Condition variables are out of the scope
of this tutorial and not often used with ports, so please consult
L<AnyEvent::Intro> about them.

Passing messages inside just one process is boring. Before we can move on
and do interprocess message passing, we first have to make sure some things
have been set up correctly for our nodes to talk to each other.

=head2 System Requirements and System Setup

Before we can start with real IPC we have to make sure some things work on
your system.

First we have to set up a I<shared secret>: for two L<AnyEvent::MP>
I<nodes> to be able to communicate with each other over the network it is
necessary to set up the same I<shared secret> for both of them, so they can
prove their trustworthiness to each other.

The easiest way to set this up is to use the F<aemp> utility:

   aemp gensecret

This creates a F<$HOME/.perl-anyevent-mp> config file and generates a
random shared secret. You can copy this file to any other system and
then communicate over the network (via TCP) with it. You can also select
your own shared secret (F<aemp setsecret>), and for increased security
requirements you can even create (or configure) a TLS certificate (F<aemp
gencert>), causing connections to not just be securely authenticated, but
also to be encrypted and protected against tinkering.

Connections will only be successfully established when the I<nodes>
that want to connect to each other have the same I<shared secret> (or
successfully verify the TLS certificate of the other side, in which case
no shared secret is required).

B<If something does not work as expected, and for example tcpdump shows
that the connections are closed almost immediately, you should make sure
that F<~/.perl-anyevent-mp> is the same on all hosts/user accounts that
you try to connect with each other!>

That is all for now; you will find some more advanced fiddling with the
C<aemp> utility later.

=head2 Shooting the Trouble

Sometimes things go wrong, and AnyEvent::MP, being a professional module,
does not gratuitously spill out messages to your screen.

To help troubleshooting any issues, there are two environment variables
that you can set. The first, C<PERL_ANYEVENT_MP_WARNLEVEL>, sets the
logging level. The default is C<5>, which means nothing much is
printed. You can increase it to C<8> or C<9> to get more verbose
output. This is example output when starting a node:

   2012-03-04 19:41:10 <8> node cerebro starting up.
   2012-03-04 19:41:10 <8> node listens on [10.0.0.1:4040].
   2012-03-04 19:41:10 <9> trying connect to seed node 10.0.0.19:4040.
   2012-03-04 19:41:10 <9> 10.0.0.19:4040 connected as rain
   2012-03-04 19:41:10 <7> rain is up ()

A lot of info, but at least you can see that it does something.

The other environment variable that can be useful is
C<PERL_ANYEVENT_MP_TRACE>, which, when set to a true value, will cause
most messages that are sent or received to be printed. For example, F<aemp
restart rijk> might output these message exchanges:

   SND rijk <- [null,"eval","AnyEvent::Watchdog::Util::restart; ()","aemp/cerebro/z4kUPp2JT4#b"]
   SND rain <- [null,"g_slave",{"'l":{"aemp/cerebro/z4kUPp2JT4":["10.0.0.1:48168"]}}]
   SND rain <- [null,"g_find","rijk"]
   RCV rain -> ["","g_found","rijk",["10.0.0.23:4040"]]
   RCV rijk -> ["b",""]

=head1 PART 1: Passing Messages Between Processes

=head2 The Receiver

Let's split the previous example up into two programs: one that contains
the sender and one for the receiver. First the receiver application, in
full:

   use AnyEvent;
   use AnyEvent::MP;

   configure nodeid => "eg_receiver/%u", binds => ["*:4040"];

   my $port = port;
   db_set eg_receivers => $port;

   rcv $port, test => sub {
      my ($data, $reply_port) = @_;

      print "Received data: " . $data . "\n";
   };

   AnyEvent->condvar->recv;

Now, that wasn't too bad, was it? OK, let's step through the new functions
that have been used.

=head3 C<configure> and Joining and Maintaining the Network

First let's have a look at C<configure>:

   configure nodeid => "eg_receiver/%u", binds => ["*:4040"];

Before we are able to send messages to other nodes we have to initialise
ourselves to become a "distributed node". Initialising a node means naming
the node and binding some TCP listeners so that other nodes can
contact it.

Additionally, to actually link all nodes in a network together, you can
specify a number of seed addresses, which will be used by the node to
connect itself into an existing network, as we will see shortly.

All of this (and more) can be passed to the C<configure> function - later
we will see how we can do all this without even passing anything to
C<configure>!

The first parameter, C<nodeid>, specifies the node ID (in this case
C<eg_receiver/%u>). The default is to use the node name of the current
host plus C</%u>, which gives the node a name with a random suffix to
make it unique, but for this example we want the node to have a bit more
personality, so we name it C<eg_receiver> with a random suffix.

Why the random suffix? Node IDs need to be unique within the network and
appending a random suffix is the easiest way to do that.

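The effect of the C</%u> placeholder can be sketched in plain Perl. This
is only an illustration; the exact value AnyEvent::MP substitutes may
differ:

```perl
# pick a random 32-bit integer as the unique suffix,
# roughly what a %u-style placeholder expands to
my $nodeid = sprintf "eg_receiver/%u", int rand 2**32;

print "$nodeid\n";   # e.g. "eg_receiver/2882400001"
```
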
The second parameter, C<binds>, specifies a list of C<address:port> pairs
to bind TCP listeners on. The special "address" of C<*> means to bind on
every local IP address (this might not work on every OS, so explicit IP
addresses are best).

The reason to bind on a TCP port is not just so that other nodes can connect
to us: if no binds are specified, the node will still bind on a dynamic
port on all local addresses - but in this case we won't know the port, and
cannot tell other nodes to connect to it as a seed node.

Now, a I<seed> is simply the TCP address of some other node in the
network, often the same string as used for the C<binds> parameter of the
other node. The need for seeds is easy to explain: I<somehow> the nodes
of an aemp network have to find each other, and often this means over the
internet. So broadcasts are out.

Instead, a node usually specifies the addresses of a few (for redundancy)
other nodes, some of which should be up. Two nodes can set each other as
seeds without any issues. You could even specify all nodes as seeds for
all nodes, for total redundancy. But the common case is to have some more
or less central, stable servers running seed services for other nodes.

All you need to do to ensure that an AnyEvent::MP network connects
together is to make sure that all connections from nodes to their seed
nodes I<somehow> span the whole network. The simplest way to do that would
be for all nodes to specify a single node as seed node, and you would get
a star topology. If you specify all nodes as seed nodes, you get a fully
meshed network (that's what previous releases of AnyEvent::MP actually
did).

A node tries to keep connections open to all of its seed nodes at all
times, while other connections are made on demand only.

All of this ensures that the network stays one network - even if all the
nodes in one half of the net are separated from the nodes in the other
half by some network problem, once that is over, they will eventually
become a single network again.

In addition to creating the network, a node also expects the seed nodes to
run the shared database service - if need be, by automatically starting it,
so you don't normally need to configure this explicitly.

The process of joining a network takes time, during which the node
is already running. This means it takes time until the node is
fully connected, and information about services in the network is
available. This is why most AnyEvent::MP programs start by waiting a while
until the information they need is available.

We will see how this is done later, in the sender program.

=head3 Registering the Receiver

Coming back to our example, after the node has been configured for network
access, it is time to publish some service, namely the receive service.

For that, let's look at the next lines:

   my $port = port;
   db_set eg_receivers => $port;

The C<port> function has already been discussed. It simply creates a new
I<port> and returns the I<port ID>. The C<db_set> function, however, is
new: the first argument is the name of a I<database family> and the second
argument is the name of a I<subkey> within that family. The third argument
would be the I<value> to be associated with the family and subkey, but,
since it is missing, it will simply be C<undef>.

OK, what's this weird talk about families, you wonder? AnyEvent::MP comes
with a distributed database. This database runs on so-called "global"
nodes, which usually are the seed nodes of your network. The database
structure is "simply" a hash of hashes of values.

In other words, if the database were stored in C<%DB>, then the C<db_set>
function more or less would do this:

   $DB{eg_receivers}{$port} = undef;

So the ominous "family" selects a hash in the database, and the "subkey"
is simply the key in this hash. And C<db_set> very much works like an
assignment.

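To make the hash-of-hashes picture concrete, here is a plain-Perl mock of
the C<db_set>/C<db_delete> semantics (purely illustrative - the real
database is distributed across the global nodes, and the C<mock_*> helper
names are made up):

```perl
my %DB;   # stand-in for the distributed database

# db_set: family, subkey, optional value (defaults to undef)
sub mock_db_set {
   my ($family, $subkey, $value) = @_;
   $DB{$family}{$subkey} = $value;
}

# db_delete: remove a subkey from a family again
sub mock_db_delete {
   my ($family, $subkey) = @_;
   delete $DB{$family}{$subkey};
}

mock_db_set eg_receivers => "anon/1234#a";
# now exists $DB{eg_receivers}{"anon/1234#a"} is true,
# and the stored value is undef, just like in the example
```
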
The family namespace is shared by all nodes in a network, so the names
should be reasonably unique, for example, they could start with the name
of your module, or the name of the program.

The purpose behind adding this key to the database is that the sender can
look it up and find our port. We will shortly see how.

The last step in the example is to set up a receiver callback for those
messages, just as was discussed in the first example. We again match
for the tag C<test>. The difference is that this time we don't exit the
application after receiving the first message. Instead we continue to wait
for new messages indefinitely.

=head2 The Sender

OK, now let's take a look at the sender code:

   use AnyEvent;
   use AnyEvent::MP;

   configure nodeid => "eg_sender/%u", seeds => ["*:4040"];

   my $guard = db_mon eg_receivers => sub {
      my ($family, $keys) = @_;
      return unless %$family;

      # now there are some receivers, send them a message
      snd $_ => test => time, keys %$family
         for keys %$family;
   };

   AnyEvent->condvar->recv;

It's even less code. The C<configure> serves the same purpose as in the
receiver, but instead of specifying binds we specify a list of seeds -
the seed happens to be the same as the bind used by the receiver, which
therefore becomes our seed node.

Remember the part about having to wait till things become available? After
C<configure> returns, nothing has been done yet - the node is not connected
to the network, knows nothing about the database contents, and it can take
ages (for a computer :) for this situation to change.

Therefore, the sender waits, in this case by using the C<db_mon>
function. This function registers an interest in a specific database
family (C<eg_receivers>). Each time something inside the family changes
(a key is added, changed or deleted), it will call our callback with the
family hash as first argument, and the list of keys as second argument.

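In plain-Perl terms, C<db_mon>'s behaviour can be sketched as callbacks
observing a hash of hashes (illustrative only - C<mock_db_mon> and
C<mock_db_set_notify> are made-up helpers, not AnyEvent::MP API, and the
port ID is invented):

```perl
my %DB;
my %watchers;   # family name => callbacks registered for it

# register interest in a family, like db_mon
sub mock_db_mon {
   my ($family, $cb) = @_;
   push @{ $watchers{$family} }, $cb;
}

# set a subkey, then notify watchers with (family hash, changed keys)
sub mock_db_set_notify {
   my ($family, $subkey, $value) = @_;
   $DB{$family}{$subkey} = $value;
   $_->($DB{$family}, [$subkey]) for @{ $watchers{$family} || [] };
}

mock_db_mon eg_receivers => sub {
   my ($family, $keys) = @_;
   return unless %$family;   # ignore notifications while empty
   print "receivers: @{[ keys %$family ]}\n";
};

mock_db_set_notify eg_receivers => "anon/1234#a";
# prints "receivers: anon/1234#a"
```
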
The callback only checks whether the C<%$family> hash is empty - in that
case it does nothing.

Eventually the family will, however, contain the port we set in the
receiver. Then the sender will send a message to it and any other receiver
in the group.

You can experiment by having multiple receivers - you have to change the
C<binds> parameter in the receiver to the seeds used in the sender to start
up additional receivers, but then you can start as many as you like. If
you specify proper IP addresses for the seeds, you can even run them on
different computers.

Each time you start the sender, it will send a message to all receivers it
finds (you have to interrupt it manually afterwards).

Things you could try include using C<PERL_ANYEVENT_MP_TRACE=1> to see
which messages are exchanged, or starting the sender first and seeing how
long it takes to find the receiver.

=head3 Splitting Network Configuration and Application Code

OK, so far, this works. In the real world, however, the person configuring
your application to run on a specific network (the end user or network
administrator) is often different to the person coding the application.

Or to put it differently: the arguments passed to C<configure> are usually
provided not by the programmer, but by whoever is deploying the program.

To make this easy, AnyEvent::MP supports a simple configuration database,
using profiles, which can be managed using the F<aemp> command-line
utility (yes, this section is about the advanced tinkering we mentioned
before).

When you change both programs above to simply call

   configure;

then AnyEvent::MP tries to look up a profile using the current node name
in its configuration database, falling back to some global default.

You can run "generic" nodes using the F<aemp> utility as well, and we will
exploit this in the following way: we configure a profile "seed" and run
a node using it, whose sole purpose is to be a seed node for our example
programs.

We bind the seed node to port 4040 on all interfaces:

   aemp profile seed binds "*:4040"

And we configure all nodes to use this as seed node (this only works when
running on the same host; for multiple machines you would provide the IP
address or hostname of the node running the seed), and use a random name
(because we want to start multiple nodes on the same host):

   aemp seeds "*:4040" nodeid anon/

Then we run the seed node:

   aemp run profile seed

After that, we can start as many other nodes as we want, and they will all
use our generic seed node to discover each other.

In fact, starting many receivers nicely illustrates that the time sender
can have multiple receivers.

That's all for now - next we will teach you about monitoring by writing a
simple chat client and server :)

=head1 PART 2: Monitoring, Supervising, Exception Handling and Recovery

That's a mouthful, so what does it mean? Our previous example is what one
could call "very loosely coupled" - the sender doesn't care about whether
there are any receivers, and the receivers do not care if there is any
sender.

This can work fine for simple services, but most real-world applications
want to ensure that the side they are expecting to be there is actually
there. Going one step further: most bigger real-world applications even
want to ensure that if some component is missing, or has crashed, it will
be brought back, by recovering and restarting the service.

AnyEvent::MP supports this by catching exceptions and network problems,
and notifying interested parties of this.

=head2 Exceptions, Port Context, Network Errors and Monitors

=head3 Exceptions

Exceptions are handled on a per-port basis: receive callbacks are executed
in a special context, the so-called I<port context>: code that throws an
otherwise uncaught exception will cause the port to be C<kil>led. Killed
ports are destroyed automatically (killing ports is the only way to free
ports, incidentally).

Ports can be monitored, even from a different host, and when a port is
killed any entity monitoring it will be notified.

Here is a simple example:

   use AnyEvent::MP;

   # create a port, it always dies
   my $port = port { die "oops" };

   # monitor it
   mon $port, sub {
      warn "$port was killed (with reason @_)";
   };

   # now send it some message, causing it to die:
   snd $port;

It first creates a port whose only action is to throw an exception,
and then monitors it with the C<mon> function. Afterwards it sends it a
message, causing it to die and call the monitoring callback:

   anon/6WmIpj.a was killed (with reason die oops at xxx line 5.) at xxx line 9.

The callback was actually passed two arguments: C<die> (to indicate it did
throw an exception as opposed to, say, a network error) and the exception
message itself.

What happens when a port is killed before we have a chance to monitor
it? Granted, this is highly unlikely in our example, but when you program
in a network this can easily happen due to races between nodes.

   use AnyEvent::MP;

   my $port = port { die "oops" };

   snd $port;

   mon $port, sub {
      warn "$port was killed (with reason @_)";
   };

This time we will get something like:

   anon/zpX.a was killed (with reason no_such_port cannot monitor nonexistent port)

Since the port was already gone, the kill reason is now C<no_such_port>
with some descriptive (we hope) error message.

In fact, the kill reason is usually some identifier as first argument
and a human-readable error message as second argument, but can be about
anything (it's a list) or even nothing - which is called a "normal" kill.

You can kill ports manually using the C<kil> function, which will be
treated like an error when any reason is specified:

   kil $port, custom_error => "don't like your steenking face";

And a clean kill without any reason arguments:

   kil $port;

By now you probably wonder what this "normal" kill business is: a common
idiom is to not specify a callback to C<mon>, but another port, such as
C<$SELF>:

   mon $port, $SELF;

This basically means "monitor $port and kill me when it crashes". And a
"normal" kill does not count as a crash. This way you can easily link
ports together and make them crash together on errors (while still
allowing you to remove a port silently).

=head3 Port Context

When code runs in an environment where C<$SELF> contains its own port ID
and exceptions will be caught, it is said to run in a port context.

Since AnyEvent::MP is event-based, it is not uncommon to register
callbacks from C<rcv> handlers. As an example, assume that the port receive
handler wants to C<die> a second later, using C<after>:

   my $port = port {
      after 1, sub { die "oops" };
   };

Then you will find it does not work - when the after callback is executed,
it does not run in port context anymore, so exceptions will not be caught.

For these cases, AnyEvent::MP exports a special "closure constructor"
called C<psub>, which works just like Perl's built-in C<sub>:

   my $port = port {
      after 1, psub { die "oops" };
   };

C<psub> stores C<$SELF> and returns a code reference. When the code
reference is invoked, it will run the code block within the context of
that port, so exception handling once more works as expected.

There is also a way to temporarily execute code in the context of some
port, namely C<peval>:

   peval $port, sub {
      # die'ing here will kil $port
   };

The C<peval> function temporarily replaces C<$SELF> by the given C<$port>
and then executes the given sub in a port context.

=head3 Network Errors and the AEMP Guarantee

I mentioned another important source of monitoring failures: network
problems. When a node loses connection to another node, it will invoke all
monitoring actions as if the port was killed, even if it is possible that
the port still lives happily on another node (not being able to talk to a
node means we have no clue what's going on with it: it could be crashed,
but also still running without knowing we lost the connection).

So another way to view monitors is "notify me when some of my messages
couldn't be delivered". AEMP has a guarantee about message delivery to a
port: after starting a monitor, any message sent to a port will either
be delivered, or, when it is lost, any further messages will also be lost
until the monitoring action is invoked. After that, further messages
I<might> get delivered again.

This doesn't sound like a very big guarantee, but it is kind of the best
you can get while staying sane: specifically, it means that there will
be no "holes" in the message sequence: all messages sent are delivered
in order, without any missing in between, and when some were lost, you
I<will> be notified of that, so you can take recovery action.

=head3 Supervising

OK, so how is this crashing-everything stuff going to make applications
I<more> stable? Well, in fact, the goal is not really to make them more
stable, but to make them more resilient against actual errors and
crashes. And this is not done by crashing I<everything>, but by crashing
everything except a supervisor.

A supervisor is simply some code that ensures that an application (or a
part of it) is running, and if it crashes, is restarted properly.

To show how to do all this we will create a simple chat server that can
handle many chat clients. Both server and clients can be killed and
restarted, and even crash, to some extent.

=head2 Chatting, the Resilient Way

Without further ado, here is the chat server (to run it, we assume the
set-up explained earlier, with a separate F<aemp run> seed node):

   use common::sense;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   configure;

   my %clients;

   sub msg {
      print "relaying: $_[0]\n";
      snd $_, $_[0]
         for values %clients;
   }

   our $server = port;

   rcv $server, join => sub {
      my ($client, $nick) = @_;

      $clients{$client} = $client;

      mon $client, sub {
         delete $clients{$client};
         msg "$nick (quits, @_)";
      };
      msg "$nick (joins)";
   };

   rcv $server, privmsg => sub {
      my ($nick, $msg) = @_;
      msg "$nick: $msg";
   };

   grp_reg eg_chat_server => $server;

   warn "server ready.\n";

   AnyEvent->condvar->recv;

Looks like a lot, but it is actually quite simple: after your usual
preamble (this time we use common sense), we define a helper function that
sends some message to every registered chat client:

   sub msg {
      print "relaying: $_[0]\n";
      snd $_, $_[0]
         for values %clients;
   }

The clients are stored in the hash C<%clients>. Then we define a server
port and install two receivers on it: C<join>, which is sent by clients
to join the chat, and C<privmsg>, which clients use to send actual chat
messages.

C<join> is most complicated. It expects the client port and the nickname
to be passed in the message, and registers the client in C<%clients>.

   rcv $server, join => sub {
      my ($client, $nick) = @_;

      $clients{$client} = $client;

The next step is to monitor the client. The monitoring action removes the
client and sends a quit message with the error to all remaining clients.

      mon $client, sub {
         delete $clients{$client};
         msg "$nick (quits, @_)";
      };

And finally, it creates a join message and sends it to all clients.

      msg "$nick (joins)";
   };

The C<privmsg> callback simply broadcasts the message to all clients:

   rcv $server, privmsg => sub {
      my ($nick, $msg) = @_;
      msg "$nick: $msg";
   };

And finally, the server registers itself in the server group, so that
clients can find it:

   grp_reg eg_chat_server => $server;

Well, well... and where is this supervisor stuff? Well... we cheated,
it's not there. To not overcomplicate the example, we only put it into
the..... CLIENT!

731 =head3 The Client, and a Supervisor!
732
733 Again, here is the client, including supervisor, which makes it a bit
734 longer:
735
736 use common::sense;
737 use AnyEvent::MP;
738 use AnyEvent::MP::Global;
739
740 my $nick = shift;
741
742 configure;
743
744 my ($client, $server);
745
746 sub server_connect {
747 my $servernodes = grp_get "eg_chat_server"
748 or return after 1, \&server_connect;
749
750 print "\rconnecting...\n";
751
752 $client = port { print "\r \r@_\n> " };
753 mon $client, sub {
754 print "\rdisconnected @_\n";
755 &server_connect;
756 };
757
758 $server = $servernodes->[0];
759 snd $server, join => $client, $nick;
760 mon $server, $client;
761 }
762
763 server_connect;
764
765 my $w = AnyEvent->io (fh => 0, poll => 'r', cb => sub {
766 chomp (my $line = <STDIN>);
767 print "> ";
768 snd $server, privmsg => $nick, $line
769 if $server;
770 });
771
772 $| = 1;
773 print "> ";
774 AnyEvent->condvar->recv;
775
776 The first thing the client does is to store the nick name (which is
777 expected as the only command line argument) in C<$nick>, for further
778 usage.
779
The next relevant thing is... finally... the supervisor:

   sub server_connect {
      my $servernodes = grp_get "eg_chat_server"
         or return after 1, \&server_connect;

This looks up the server in the C<eg_chat_server> global group. If it
cannot find it (which is likely when the node is just starting up),
it will wait a second and then retry. This "wait a bit and retry"
is an important pattern, as distributed programming means lots of
things are going on asynchronously. In practice, one should use a more
intelligent algorithm, which could for example warn after an excessive
number of retries. Hopefully future versions of AnyEvent::MP will offer
some predefined supervisors; for now you will have to code it on your own.

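To make the "wait a bit and retry" idea concrete, here is a pure-Perl
sketch of a smarter retry schedule: capped exponential backoff. The
C<retry_delay> function and its one-minute cap are illustrative
inventions, not part of AnyEvent::MP; a real supervisor would feed the
result to C<after $delay, \&server_connect> and warn once the attempt
count gets excessive.

```perl
use strict;
use warnings;

# Capped exponential backoff: 1, 2, 4, 8, ... seconds, never more than
# a minute. $attempt counts reconnect attempts starting at 1.
sub retry_delay {
   my ($attempt) = @_;
   my $delay = 2 ** ($attempt - 1);
   $delay = 60 if $delay > 60;
   $delay
}

# the first attempts retry quickly, later ones back off
my @schedule = map retry_delay ($_), 1 .. 8;
print "@schedule\n";   # 1 2 4 8 16 32 60 60
```

This keeps the fast recovery of the simple "retry every second" loop for
short outages while avoiding a reconnect storm during long ones.
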
Next it creates a local port for the server to send messages to, and
monitors it. When the port is killed, it will print "disconnected" and
call the supervisor function to try again:

   $client = port { print "\r \r@_\n> " };
   mon $client, sub {
      print "\rdisconnected @_\n";
      &server_connect;
   };

Then everything is ready: the client will send a C<join> message with its
local port to the server, and start monitoring it:

   $server = $servernodes->[0];
   snd $server, join => $client, $nick;
   mon $server, $client;

The monitor will ensure that if the server crashes or goes away, the
client will be killed as well. This tells the user that the client was
disconnected, after which the client will try to connect to the server
again.

The rest of the program deals with the boring details of actually invoking
the supervisor function to start the whole client process, and of handling
the terminal input, sending it to the server.

You should now try to start the server and one or more clients in different
terminal windows (and the seed node):

   perl eg/chat_client nick1
   perl eg/chat_client nick2
   perl eg/chat_server
   aemp run profile seed

And then you can experiment with chatting, killing one or more clients, or
stopping and restarting the server, to see the monitoring in action.

The crucial point you should understand from this example is that
monitoring is usually symmetric: when you monitor some other port,
potentially on another node, that other port usually should monitor you,
too, so when the connection dies, both ports get killed, or at least both
sides can take corrective action. Exceptions are "servers" that serve
multiple clients at once and might only wish to clean up, and supervisors,
which of course should not normally get killed (unless they, too, have a
supervisor).

If you often think in object-oriented terms, then treat a port as an
object: C<port> is the constructor, the receive callbacks set by C<rcv>
act as methods, the C<kil> function becomes the explicit destructor and
C<mon> installs a destructor hook. Unlike conventional object-oriented
programming, it can make sense to exchange ports more freely (for example,
to monitor one port from another).

There is ample room for improvement: the server should probably remember
the nickname in the C<join> handler instead of expecting it in every chat
message, it should probably monitor itself, and the client should not try
to send any messages unless a server is actually connected.

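The first of these improvements - remembering the nickname in the C<join>
handler - amounts to storing the nick alongside each client port when it
joins. The following pure-Perl sketch shows only that bookkeeping: the
ports are stand-in strings, C<join_client>/C<privmsg> are hypothetical
helpers, and relayed messages are collected in an array instead of being
C<snd> to real ports.

```perl
use strict;
use warnings;

my %clients;   # client port id => nickname, filled at join time
my @relayed;   # stands in for messages snd'd to the clients

sub join_client {
   my ($client, $nick) = @_;
   $clients{$client} = $nick;
}

# chat messages now carry only the text - the nick is looked up
sub privmsg {
   my ($client, $line) = @_;
   my $nick = $clients{$client};
   push @relayed, "$nick: $line" for keys %clients;
}

join_client "port-1", "elmex";
join_client "port-2", "root";
privmsg "port-1", "hello";
print "$relayed[0]\n";   # elmex: hello
```

With this change a misbehaving client can no longer impersonate another
nick simply by putting a different name into its chat messages.
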
=head1 PART 3: TIMTOWTDI: Virtual Connections

The chat system developed in the previous sections is very "traditional"
in a way: you start some server(s) and some clients statically and they
start talking to each other.

Sometimes applications work more like "services": they can run on almost
any node and talk to instances of themselves on other nodes. The
L<AnyEvent::MP::Global> service for example monitors nodes joining the
network and starts itself automatically on other nodes (if it isn't
running already).

A good way to design such applications is to put them into a module and
create "virtual connections" to other nodes - we call this the "bridge
head" method, because you start by creating a remote port (the bridge
head) and from that you start to bootstrap your application.

Since that sounds rather theoretical, let's redesign the chat server and
client using this design method.

Here is the server:

   use common::sense;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   configure;

   grp_reg eg_chat_server2 => $NODE;

   my %clients;

   sub msg {
      print "relaying: $_[0]\n";
      snd $_, $_[0]
         for values %clients;
   }

   sub client_connect {
      my ($client, $nick) = @_;

      mon $client;
      mon $client, sub {
         delete $clients{$client};
         msg "$nick (quits, @_)";
      };

      $clients{$client} = $client;

      msg "$nick (joins)";

      rcv $SELF, sub { msg "$nick: $_[0]" };
   }

   warn "server ready.\n";

   AnyEvent->condvar->recv;

It starts out not much different than the previous example, except that
this time, we register the node port in the global group and not any port
we created - the clients only want to know which node the server should be
running on. In fact, they could also use some kind of election mechanism,
to find the node with the lowest load or something like that.

The more interesting change is that indeed no server port is created -
the server consists only of code, and "does" nothing by itself. All it
does is define a function C<client_connect>, which expects a client port
and a nickname as arguments. It then monitors the client port and binds
a receive callback on C<$SELF>, which expects messages that in turn are
broadcast to all clients.

The two C<mon> calls are a bit tricky - the first C<mon> is a shorthand
for C<mon $client, $SELF>. The second does the normal "client has gone
away" clean-up action. Both could actually be rolled into one C<mon>
action.

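Rolling the two C<mon> calls into one means having a single callback that
both does the clean-up and propagates the kil. The port/monitor machinery
below is a deliberately tiny pure-Perl fake (C<fake_mon> and C<fake_kil>
are made up for this sketch, not real AnyEvent::MP functions) that only
demonstrates this "one callback, both jobs" shape:

```perl
use strict;
use warnings;

my %monitors;                 # port id => list of monitor callbacks
my %clients = (c1 => "c1");   # the server's client table
my @log;

sub fake_mon { push @{ $monitors{$_[0]} }, $_[1] }
sub fake_kil { $_->(@_[1 .. $#_]) for @{ delete $monitors{$_[0]} || [] } }

# one monitor action doing the job of both mon calls in the server:
# the clean-up plus the "kil $SELF" propagation (here only logged)
fake_mon c1 => sub {
   delete $clients{c1};
   push @log, "nick (quits, @_)";
   push @log, "SELF kil'ed with @_";   # stands in for the mon $client shorthand
};

fake_kil c1 => "connection reset";
print "$log[0]\n";   # nick (quits, connection reset)
```

Whether you prefer one combined callback or two separate C<mon> calls is
a matter of taste; two calls keep the "propagate" and "clean up" concerns
visibly separate.
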
C<$SELF> is a good hint that something interesting is going on. And
indeed, when looking at the client code, there is a new function,
C<spawn>:

   use common::sense;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   my $nick = shift;

   configure;

   $| = 1;

   my $port = port;

   my ($client, $server);

   sub server_connect {
      my $servernodes = grp_get "eg_chat_server2"
         or return after 1, \&server_connect;

      print "\rconnecting...\n";

      $client = port { print "\r \r@_\n> " };
      mon $client, sub {
         print "\rdisconnected @_\n";
         &server_connect;
      };

      $server = spawn $servernodes->[0], "::client_connect", $client, $nick;
      mon $server, $client;
   }

   server_connect;

   my $w = AnyEvent->io (fh => 0, poll => 'r', cb => sub {
      chomp (my $line = <STDIN>);
      print "> ";
      snd $server, $line
         if $server;
   });

   print "> ";
   AnyEvent->condvar->recv;

The client is quite similar to the previous one, but instead of contacting
the server I<port> (which no longer exists), it C<spawn>s (creates) a new
server port on the server I<node>:

   $server = spawn $servernodes->[0], "::client_connect", $client, $nick;
   mon $server, $client;

And of course the first thing after creating it is monitoring it.

The C<spawn> function creates a new port on a remote node and returns
its port ID. After creating the port it calls a function on the remote
node, passing any remaining arguments to it, and - most importantly -
executes the function within the context of the new port, so it can be
manipulated by referring to C<$SELF>. The init function can reside in a
module (actually it normally I<should> reside in a module) - AnyEvent::MP
will automatically load the module if the function isn't defined.

The C<spawn> function returns immediately, which means you can instantly
send messages to the port, long before the remote node has even heard
of our request to create a port on it. In fact, the remote node might
not even be running. Despite these troubling facts, everything should
work just fine: if the node isn't running (or the init function throws an
exception), then the monitor will trigger because the port doesn't exist.

If the spawn message gets delivered, but the monitoring message is not
because of network problems (extremely unlikely, but monitoring, after
all, is implemented by passing a message, and messages can get lost), then
this connection loss will eventually trigger the monitoring action. On the
remote node (which in turn monitors the client) the port will also be
cleaned up on connection loss. When the remote node comes up again and our
monitoring message can be delivered, it will instantly fail because the
port has been cleaned up in the meantime.

If your head is spinning by now, that's fine - just keep in mind: after
creating a port, monitor it on the local node, and monitor "the other
side" from the remote node, and all will be cleaned up just fine.

=head2 Services

Above it was mentioned that C<spawn> automatically loads modules, and this
can be exploited in various ways.

Assume for a moment you put the server into a file called
F<mymod/chatserver.pm> reachable from the current directory. Then you
could run a node there with:

   aemp run

The other nodes could C<spawn> the server by using
C<mymod::chatserver::client_connect> as init function.

Likewise, when you have some service that starts automatically (similar to
AnyEvent::MP::Global), then you can configure this service statically:

   aemp profile mysrvnode services mymod::service::
   aemp run profile mysrvnode

And the module will automatically be loaded in the node, as specifying a
module name (with C<::>-suffix) will simply load the module, which is then
free to do whatever it wants.

Of course, you can also do it in the much more standard way by writing
a module (e.g. C<BK::Backend::IRC>), installing it as part of a module
distribution and then configuring your nodes. For example, to run the
Bummskraut IRC backend on a machine named "ruth", one could do this:

   aemp profile ruth addservice BK::Backend::IRC::

And any F<aemp run> on that host will automatically have the Bummskraut
IRC backend running.

There are plenty of possibilities you can use - it's all up to you how you
structure your application.

=head1 PART 4: Coro::MP - selective receive

Not all problems lend themselves naturally to an event-based solution:
sometimes things are easier if you can decide in what order you want to
receive messages, regardless of the order in which they were sent.

In these cases, L<Coro::MP> can provide a nice solution: instead of
registering callbacks for each message type, C<Coro::MP> attaches a
(coro-) thread to a port. The thread can then opt to selectively receive
messages it is interested in. Other messages are not lost, but queued, and
can be received at a later time.

The C<Coro::MP> module is not part of L<AnyEvent::MP>, but a separate
module. It is, however, tightly integrated into C<AnyEvent::MP> - the
ports it creates are fully compatible with C<AnyEvent::MP> ports.

In fact, C<Coro::MP> is more of an extension than a separate module: all
functions exported by C<AnyEvent::MP> are exported by it as well.

To illustrate what programming with C<Coro::MP> looks like, consider the
following (slightly contrived) example: let's implement a server that
accepts a C<< (write_file => $port, $path) >> message with a (source)
port and a filename, followed by as many C<< (data => $port, $data) >>
messages as required to fill the file, followed by an empty C<< (data =>
$port) >> message.

The server only writes a single file at a time, other requests will stay
in the queue until the current file has been finished.

Here is an example implementation that uses L<Coro::AIO> and largely
ignores error handling:

   my $ioserver = port_async {
      while () {
         my ($tag, $port, $path) = get_cond;

         $tag eq "write_file"
            or die "only write_file messages expected";

         my $fh = aio_open $path, O_WRONLY|O_CREAT, 0666
            or die "$path: $!";

         while () {
            my (undef, undef, $data) = get_cond {
               $_[0] eq "data" && $_[1] eq $port
            } 5
               or die "timeout waiting for data message from $port\n";

            length $data or last;

            aio_write $fh, undef, undef, $data, 0;
         }
      }
   };

   mon $ioserver, sub {
      warn "ioserver was killed: @_\n";
   };

Let's go through it part by part.

   my $ioserver = port_async {

Ports can be created by attaching a thread to an existing port via
C<rcv_async>, or, as done here, by calling C<port_async> with the code to
execute as a thread. The C<async> component comes from the fact that
threads are created using the C<Coro::async> function.

The thread runs in a normal port context (so C<$SELF> is set). In
addition, when the thread returns, it will be C<kil>ed I<normally>, i.e.
without a reason argument.

   while () {
      my ($tag, $port, $path) = get_cond;

      $tag eq "write_file"
         or die "only write_file messages expected";

The thread is supposed to serve many file writes, which is why it executes
in a loop. The first thing it does is fetch the next message, using
C<get_cond>, the "conditional message get". Without a condition, it simply
fetches the next message from the queue, which I<must> be a C<write_file>
message.

The message contains the C<$path> to the file, which is then created:

   my $fh = aio_open $path, O_WRONLY|O_CREAT, 0666
      or die "$path: $!";

Then we enter a loop again, to serve as many C<data> messages as
necessary:

   while () {
      my (undef, undef, $data) = get_cond {
         $_[0] eq "data" && $_[1] eq $port
      } 5
         or die "timeout waiting for data message from $port\n";

This time, the condition is not empty, but instead a code block: similarly
to C<grep>, the code block will be called with C<@_> set to each message
in the queue, and it has to return whether it wants to receive the message
or not.

In this case we are interested in C<data> messages (C<< $_[0] eq "data"
>>) whose first element after the tag is the source port (C<< $_[1] eq
$port >>).

The condition must be this strict, as it is possible to receive both
C<write_file> messages and C<data> messages from other ports while we
handle the file writing.

The lone C<5> at the end is a timeout - when no matching message is
received within C<5> seconds, we assume an error and C<die>.

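Because the condition works like C<grep>, its behaviour can be illustrated
with plain Perl: the same predicate, run over a simulated queue, selects
exactly the message from the right port. This is only a simulation - the
queue contents are made up, and the real C<get_cond> also removes the
matched message from the port's queue:

```perl
use strict;
use warnings;

# the port we are serving, and the predicate from the example above
my $port = "client-port-1";
my $want = sub { $_[0] eq "data" && $_[1] eq $port };

# a simulated message queue - other traffic may be interleaved
my @queue = (
   [write_file => "client-port-2", "/tmp/other"],   # wrong tag
   [data       => "client-port-2", "abc\n"],        # wrong source port
   [data       => $port,           "def\n"],        # the one we want
);

# first match wins, just as get_cond picks the oldest matching message
my ($msg) = grep { $want->(@$_) } @queue;
print $msg->[2];   # def
```

The two skipped messages stay in the queue and can be received later,
which is the whole point of selective receive.
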
When an empty C<data> message is received we are done and can close the
file (which is done automatically as C<$fh> goes out of scope):

   length $data or last;

Otherwise we need to write the data:

   aio_write $fh, undef, undef, $data, 0;

That's basically it. Note that every process should have some kind of
supervisor. In our case, the supervisor simply prints any error message:

   mon $ioserver, sub {
      warn "ioserver was killed: @_\n";
   };

Here is a usage example:

   port_async {
      snd $ioserver, write_file => $SELF, "/tmp/unsafe";
      snd $ioserver, data => $SELF, "abc\n";
      snd $ioserver, data => $SELF, "def\n";
      snd $ioserver, data => $SELF;
   };

The messages are sent without any flow control or acknowledgement (feel
free to improve that). Also, the source port does not actually need to be
a port - any unique ID will do - but port identifiers happen to be a
simple source of network-wide unique IDs.

Apart from C<get_cond> as seen above, there are other ways to receive
messages. The C<write_file> message above could also be received
selectively, using a C<get> call:

   my ($port, $path) = get "write_file";

This is simpler, but when some other code part sends an unexpected message
to the C<$ioserver> it will stay in the queue forever. As a rule of thumb,
every threaded port should have a "fetch next message unconditionally"
somewhere, to avoid filling up the queue.

It is also possible to use C<get_cond> in a switch-like fashion:

   get_cond {
      $_[0] eq "msg1" and return sub {
         my (undef, @msg1_data) = @_;
         ...;
      };

      $_[0] eq "msg2" and return sub {
         my (undef, @msg2_data) = @_;
         ...;
      };

      die "unexpected message $_[0] received";
   };

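The trick here is that the condition block does not return a boolean but a
code reference, which then handles the received message. The dispatch
structure itself is plain Perl and can be sketched without any ports (the
tags, handlers and C<$dispatch> variable are made up for illustration):

```perl
use strict;
use warnings;

my @results;

# condition-style dispatcher: examine the message, return its handler
my $dispatch = sub {
   $_[0] eq "msg1" and return sub {
      my (undef, @msg1_data) = @_;
      push @results, "msg1: @msg1_data";
   };

   $_[0] eq "msg2" and return sub {
      my (undef, @msg2_data) = @_;
      push @results, "msg2: @msg2_data";
   };

   die "unexpected message $_[0] received";
};

for my $msg ([msg1 => "a"], [msg2 => "b", "c"]) {
   $dispatch->(@$msg)->(@$msg);   # pick the handler, then run it
}
print "$results[1]\n";   # msg2: b c
```

This keeps each message type's handling code next to its match condition,
much like a receive statement in other message-passing languages.
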
=head1 THE END

This is the end of this introduction, but hopefully not the end of
your career as an AEMP user. I hope the tutorial was enough to make the
basic concepts clear. Keep in mind that distributed programming is not
completely trivial and that AnyEvent::MP is still in its infancy; despite
that, I hope it will prove useful in creating exciting new applications.

=head1 SEE ALSO

L<AnyEvent::MP>

L<AnyEvent::MP::Global>

L<Coro::MP>

L<AnyEvent>

=head1 AUTHOR

   Robin Redeker <elmex@ta-sa.org>
   Marc Lehmann <schmorp@schmorp.de>
