/cvs/AnyEvent-MP/MP.pm
Revision: 1.62
Committed: Thu Aug 27 07:12:48 2009 UTC (14 years, 9 months ago) by root
Branch: MAIN
Changes since 1.61: +14 -26 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     AnyEvent::MP - multi-processing/message-passing framework
4    
5     =head1 SYNOPSIS
6    
7     use AnyEvent::MP;
8    
9 root 1.22 $NODE # contains this node's noderef
10     NODE # returns this node's noderef
11     NODE $port # returns the noderef of the port
12 root 1.2
13 root 1.38 $SELF # receiving/own port id in rcv callbacks
14    
15 root 1.48 # initialise the node so it can send/receive messages
16     initialise_node; # -OR-
17     initialise_node "localhost:4040"; # -OR-
18     initialise_node "slave/", "localhost:4040";
19    
20 root 1.38 # ports are message endpoints
21    
22     # sending messages
23 root 1.2 snd $port, type => data...;
24 root 1.38 snd $port, @msg;
25     snd @msg_with_first_element_being_a_port;
26 root 1.2
27 root 1.50 # creating/using ports, the simple way
28 root 1.53 my $simple_port = port { my @msg = @_; 0 };
29 root 1.22
30 root 1.52 # creating/using ports, tagged message matching
31 root 1.38 my $port = port;
32     rcv $port, ping => sub { snd $_[0], "pong"; 0 };
33     rcv $port, pong => sub { warn "pong received\n"; 0 };
34 root 1.2
35 root 1.48 # create a port on another node
36     my $port = spawn $node, $initfunc, @initdata;
37    
38 root 1.35 # monitoring
39     mon $port, $cb->(@msg) # callback is invoked on death
40     mon $port, $otherport # kill otherport on abnormal death
41     mon $port, $otherport, @msg # send message on death
42    
43 root 1.45 =head1 CURRENT STATUS
44    
45     AnyEvent::MP - stable API, should work
46     AnyEvent::MP::Intro - outdated
47     AnyEvent::MP::Kernel - WIP
48     AnyEvent::MP::Transport - mostly stable
49    
50     stay tuned.
51    
52 root 1.1 =head1 DESCRIPTION
53    
54 root 1.2 This module (-family) implements a simple message passing framework.
55    
56     Despite its simplicity, you can securely message other processes running
57     on the same or other hosts.
58    
59 root 1.23 For an introduction to this module family, see the L<AnyEvent::MP::Intro>
60     manual page.
61    
62     At the moment, this module family is severely broken and underdocumented,
63 root 1.21 so do not use. This was uploaded mainly to reserve the CPAN namespace -
64 root 1.45 stay tuned!
65 root 1.6
66 root 1.2 =head1 CONCEPTS
67    
68     =over 4
69    
70     =item port
71    
72 root 1.29 A port is something you can send messages to (with the C<snd> function).
73    
74 root 1.53 Ports allow you to register C<rcv> handlers that can match all or just
75     some messages. Messages will not be queued.
76 root 1.2
77 root 1.3 =item port id - C<noderef#portname>
78 root 1.2
79 root 1.53 A port ID is the concatenation of a noderef, a hash-mark (C<#>) as
80 root 1.29 separator, and a port name (a printable string of unspecified format). An
81 root 1.30 exception is the node port, whose ID is identical to its node
82 root 1.29 reference.
83 root 1.2
84     =item node
85    
86 root 1.53 A node is a single process containing at least one port - the node port,
87     which allows nodes to manage each other remotely, and to create new
88     ports.
89 root 1.2
90 root 1.62 Nodes are either private (single-process only), slaves (can only talk to
91     public nodes, but do not need an open port) or public nodes (connectable
92     from any other node).
93 root 1.2
94 root 1.5 =item noderef - C<host:port,host:port...>, C<id@noderef>, C<id>
95 root 1.2
96 root 1.29 A node reference is a string that either simply identifies the node (for
97     private and slave nodes), or contains a recipe on how to reach a given
98 root 1.2 node (for public nodes).
99    
100 root 1.29 This recipe is simply a comma-separated list of C<address:port> pairs (for
101     TCP/IP, other protocols might look different).
102    
103     Node references come in two flavours: resolved (containing only numerical
104     addresses) or unresolved (where hostnames are used instead of addresses).
105    
106     Before using an unresolved node reference in a message you first have to
107     resolve it.
108    
109 root 1.2 =back
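
The port ID layout described above can be taken apart with plain string operations. What follows is only a minimal sketch, not part of the module: C<split_port_id> is a hypothetical helper name and the sample IDs are made up. It relies on the same C<split /#/, $port, 2> idiom that C<rcv> and C<mon> use internally.

```perl
use strict;
use warnings;

# Hypothetical helper (not exported by AnyEvent::MP): split a port ID
# into its noderef and port name. A node port has no "#name" part, so
# its port name comes back undefined.
sub split_port_id {
    my ($port_id) = @_;
    my ($noderef, $portname) = split /#/, $port_id, 2;
    return ($noderef, $portname);
}

my ($noderef, $name) = split_port_id "10.0.0.1:4040#worker.1";
print "noderef=$noderef name=$name\n";

my ($node_only, $none) = split_port_id "10.0.0.1:4040";
print "node port of $node_only\n" unless defined $none;
```

Limiting the split to two fields means that a port name which itself contains a hash-mark stays in one piece.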
110    
111 root 1.3 =head1 VARIABLES/FUNCTIONS
112 root 1.2
113     =over 4
114    
115 root 1.1 =cut
116    
117     package AnyEvent::MP;
118    
119 root 1.44 use AnyEvent::MP::Kernel;
120 root 1.2
121 root 1.1 use common::sense;
122    
123 root 1.2 use Carp ();
124    
125 root 1.1 use AE ();
126    
127 root 1.2 use base "Exporter";
128    
129 root 1.44 our $VERSION = $AnyEvent::MP::Kernel::VERSION;
130 root 1.43
131 root 1.8 our @EXPORT = qw(
132 root 1.59 NODE $NODE *SELF node_of after
133 root 1.31 resolve_node initialise_node
134 root 1.61 snd rcv mon mon_guard kil reg psub spawn
135 root 1.22 port
136 root 1.8 );
137 root 1.2
138 root 1.22 our $SELF;
139    
140     sub _self_die() {
141     my $msg = $@;
142     $msg =~ s/\n+$// unless ref $msg;
143     kil $SELF, die => $msg;
144     }
145    
146     =item $thisnode = NODE / $NODE
147    
148 root 1.52 The C<NODE> function returns, and the C<$NODE> variable contains the
149     noderef of the local node. The value is initialised by a call to
150     C<initialise_node>.
151 root 1.22
152 root 1.33 =item $noderef = node_of $port
153 root 1.22
154 root 1.52 Extracts and returns the noderef from a port ID or a noderef.
155 root 1.22
156 root 1.34 =item initialise_node $noderef, $seednode, $seednode...
157    
158     =item initialise_node "slave/", $master, $master...
159    
160     Before a node can talk to other nodes on the network it has to initialise
161     itself - the minimum a node needs to know is its own name, and optionally
162     it should know the noderefs of some other nodes in the network.
163    
164     This function initialises a node - it must be called exactly once (or
165     never) before calling other AnyEvent::MP functions.
166    
167 root 1.49 All arguments (optionally except for the first) are noderefs, which can be
168     either resolved or unresolved.
169    
170     The first argument will be looked up in the configuration database first
171     (if it is C<undef> then the current nodename will be used instead) to find
172     the relevant configuration profile (see L<aemp>). If none is found then
173     the default configuration is used. The configuration supplies additional
174     seed/master nodes and can override the actual noderef.
175 root 1.34
176     There are two types of networked nodes, public nodes and slave nodes:
177    
178     =over 4
179    
180     =item public nodes
181    
182 root 1.49 For public nodes, C<$noderef> (supplied either directly to
183     C<initialise_node> or indirectly via a profile or the nodename) must be a
184     noderef (possibly unresolved, in which case it will be resolved).
185    
186 root 1.62 After resolving, the node will bind itself on all endpoints.
187 root 1.34
188     =item slave nodes
189    
190 root 1.49 When the C<$noderef> (either as given or overridden by the config file)
191     is the special string C<slave/>, then the node will become a slave
192 root 1.62 node. Slave nodes cannot be contacted from outside, and cannot talk to
193     each other (at least in this version of AnyEvent::MP).
194 root 1.49
195 root 1.62 Slave nodes work by creating connections to all public nodes, using the
196     L<AnyEvent::MP::Global> service.
197 root 1.56
198 root 1.34 =back
199    
200 root 1.62 After initialising itself, the node will connect to all additional
201     C<$seednodes> that are specified directly or via a profile. Seednodes are
202     optional and can be used to quickly bootstrap the node into an existing
203     network.
204 root 1.34
205 root 1.56 All the seednodes will also be specially marked to automatically retry
206 root 1.62 connecting to them indefinitely, so make sure that seednodes are really
207     reliable and up (this might also change in the future).
208 root 1.56
209 root 1.49 Example: become a public node listening on the guessed noderef, or the one
210     specified via C<aemp> for the current node. This should be the most common
211     form of invocation for "daemon"-type nodes.
212 root 1.34
213     initialise_node;
214    
215 root 1.49 Example: become a slave node to any of the seednodes specified via
216     C<aemp>. This form is often used for commandline clients.
217    
218     initialise_node "slave/";
219    
220 root 1.34 Example: become a public node, and try to contact some well-known master
221     servers to become part of the network.
222    
223     initialise_node undef, "master1", "master2";
224    
225     Example: become a public node listening on port C<4041>.
226    
227     initialise_node 4041;
228    
229     Example: become a public node, only visible on localhost port 4044.
230    
231 root 1.49 initialise_node "localhost:4044";
232 root 1.34
233 root 1.29 =item $cv = resolve_node $noderef
234    
235     Takes an unresolved node reference that may contain hostnames and
236     abbreviated IDs, resolves all of them and returns a resolved node
237     reference.
238    
239     In addition to C<address:port> pairs allowed in resolved noderefs, the
240     following forms are supported:
241    
242     =over 4
243    
244     =item the empty string
245    
246     An empty-string component gets resolved as if the default port (4040) was
247     specified.
248    
249     =item naked port numbers (e.g. C<1234>)
250    
251     These are resolved by prepending the local nodename and a colon, to be
252     further resolved.
253    
254     =item hostnames (e.g. C<localhost:1234>, C<localhost>)
255    
256     These are resolved by using AnyEvent::DNS to resolve them, optionally
257     looking up SRV records for the C<aemp=4040> port, if no port was
258     specified.
259    
260     =back
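
The purely textual part of the rules above (empty component, naked port number) can be sketched as follows. This is an illustration under assumptions, not the module's code: C<expand_component> is a hypothetical name, the nodename C<myhost> and the sample noderef are made up, and the DNS and SRV lookups that the real, asynchronous C<resolve_node> performs are left out entirely.

```perl
use strict;
use warnings;

# Hypothetical sketch of the purely textual rewriting steps; the real
# resolve_node is asynchronous and additionally performs DNS lookups.
sub expand_component {
    my ($component, $nodename) = @_;

    $component = 4040 if $component eq "";      # empty string: default port
    return "$nodename:$component"               # naked port: prepend nodename
        if $component =~ /^\d+$/;
    return $component;                          # host or host:port: unchanged here
}

my $unresolved = "4041,,example.net:4042";      # made-up noderef
my @expanded = map { expand_component ($_, "myhost") } split /,/, $unresolved, -1;
print join (",", @expanded), "\n";
```

The C<-1> limit passed to C<split> keeps empty components at the end of the list, so a trailing comma would also expand to the default port.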
261    
262 root 1.22 =item $SELF
263    
264     Contains the current port id while executing C<rcv> callbacks or C<psub>
265     blocks.
266 root 1.3
267 root 1.22 =item SELF, %SELF, @SELF...
268    
269     Due to some quirks in how perl exports variables, it is impossible to
270     just export C<$SELF>, so all the symbols called C<SELF> are exported by this
271     module, but only C<$SELF> is currently used.
272 root 1.3
273 root 1.33 =item snd $port, type => @data
274 root 1.3
275 root 1.33 =item snd $port, @msg
276 root 1.3
277 root 1.8 Send the given message to the given port, which can be either a local or
278 root 1.52 a remote port, and must be specified by its port ID.
279 root 1.8
280     While the message can be about anything, it is highly recommended to use a
281 root 1.52 string as first element (a port ID, or some word that indicates a request
282 root 1.8 type etc.).
283 root 1.3
284     The message data effectively becomes read-only after a call to this
285     function: modifying any argument is not allowed and can cause many
286     problems.
287    
288     The type of data you can transfer depends on the transport protocol: when
289     JSON is used, then only strings, numbers and arrays and hashes consisting
290     of those are allowed (no objects). When Storable is used, then anything
291     that Storable can serialise and deserialise is allowed, and for the local
292     node, anything can be passed.
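
To check up front whether a message stays within the JSON restrictions described above, one could walk the data structure before sending it. The following is a sketch only: C<is_plain_data> is a hypothetical helper that is not provided by AnyEvent::MP, it accepts C<undef> as a plain value (an assumption), and it does not guard against cyclic structures.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Hypothetical helper: true if the value consists only of plain scalars,
# array references and hash references, i.e. contains no objects, code
# references or other reference types.
sub is_plain_data {
    my ($value) = @_;

    return 1 unless ref $value;          # plain string or number (or undef)
    return 0 if blessed $value;          # objects are not allowed
    my $type = reftype $value;
    return !grep { !is_plain_data ($_) } @$value        if $type eq "ARRAY";
    return !grep { !is_plain_data ($_) } values %$value if $type eq "HASH";
    return 0;                            # code refs, scalar refs, globs ...
}

my @msg = (log => "disk full", { host => "a", free => [1, 2] });
print "plain\n" if is_plain_data \@msg;
print "not plain\n" unless is_plain_data [status => bless ({}, "My::Object")];
```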
293    
294 root 1.22 =item $local_port = port
295 root 1.2
296 root 1.50 Creates a new local port object and returns its port ID. Initially it has
297     no callbacks set and will throw an error when it receives messages.
298 root 1.10
299 root 1.50 =item $local_port = port { my @msg = @_ }
300 root 1.15
301 root 1.50 Creates a new local port, and returns its ID. Semantically the same as
302     creating a port and calling C<rcv $port, $callback> on it.
303 root 1.15
304 root 1.50 The block will be called for every message received on the port, with the
305     global variable C<$SELF> set to the port ID. Runtime errors will cause the
306     port to be C<kil>ed. The message will be passed as-is, no extra argument
307     (i.e. no port ID) will be passed to the callback.
308 root 1.15
309 root 1.50 If you want to stop/destroy the port, simply C<kil> it:
310 root 1.15
311 root 1.50 my $port = port {
312     my @msg = @_;
313     ...
314     kil $SELF;
315 root 1.15 };
316 root 1.10
317     =cut
318    
319 root 1.33 sub rcv($@);
320    
321 root 1.50 sub _kilme {
322     die "received message on port without callback";
323     }
324    
325 root 1.22 sub port(;&) {
326     my $id = "$UNIQ." . $ID++;
327     my $port = "$NODE#$id";
328    
329 root 1.50 rcv $port, shift || \&_kilme;
330 root 1.10
331 root 1.22 $port
332 root 1.10 }
333    
334 root 1.50 =item rcv $local_port, $callback->(@msg)
335 root 1.31
336 root 1.50 Replaces the default callback on the specified port. There is no way to
337     remove the default callback: use C<sub { }> to disable it, or better
338     C<kil> the port when it is no longer needed.
339 root 1.3
340 root 1.33 The global C<$SELF> (exported by this module) contains C<$port> while
341 root 1.50 executing the callback. Runtime errors during callback execution will
342     result in the port being C<kil>ed.
343 root 1.22
344 root 1.50 The default callback receives all messages not matched by a more specific
345     C<tag> match.
346 root 1.22
347 root 1.50 =item rcv $local_port, tag => $callback->(@msg_without_tag), ...
348 root 1.3
349 root 1.54 Register (or replace) callbacks to be called on messages starting with the
350     given tag on the given port (and return the port), or unregister it (when
351     C<$callback> is C<undef> or missing). There can only be one callback
352     registered for each tag.
353 root 1.3
354 root 1.50 The original message will be passed to the callback, after the first
355     element (the tag) has been removed. The callback will use the same
356     environment as the default callback (see above).
357 root 1.3
358 root 1.36 Example: create a port and bind receivers on it in one go.
359    
360     my $port = rcv port,
361 root 1.50 msg1 => sub { ... },
362     msg2 => sub { ... },
363 root 1.36 ;
364    
365     Example: create a port, bind receivers and send it in a message elsewhere
366     in one go:
367    
368     snd $otherport, reply =>
369     rcv port,
370 root 1.50 msg1 => sub { ... },
371 root 1.36 ...
372     ;
373    
374 root 1.54 Example: temporarily register a rcv callback for a tag matching some port
375     (e.g. for a rpc reply) and unregister it after a message was received.
376    
377     rcv $port, $otherport => sub {
378     my @reply = @_;
379    
380     rcv $SELF, $otherport;
381     };
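
The matching rules described above (one callback per tag, a default callback for everything else, the tag removed before the tag callback runs) can be modelled without any networking. This is a simplified stand-alone sketch, not the module's implementation: C<make_dispatcher> and the port ID C<node#port1> are made up, and runtime errors are not converted into kills here.

```perl
use strict;
use warnings;

our $SELF;

# Hypothetical stand-in for a tag-matching port: a table of tag
# callbacks plus a default callback for everything else.
sub make_dispatcher {
    my ($port_id, $default, %by_tag) = @_;

    return sub {
        local $SELF = $port_id;              # visible to callbacks only

        if (defined $_[0] && !ref $_[0] && (my $cb = $by_tag{$_[0]})) {
            shift;                           # remove the tag ...
            return $cb->(@_);                # ... and pass on the rest
        }

        return $default->(@_);               # unmatched: message passed as-is
    };
}

my @log;
my $deliver = make_dispatcher "node#port1",
    sub { push @log, "default(@_) on $SELF" },
    ping => sub { push @log, "ping(@_) on $SELF" };

$deliver->(ping => 1, 2);
$deliver->(other => 3);
print "$_\n" for @log;
```

Because C<$SELF> is set with C<local>, it is only defined while a callback runs and is restored automatically afterwards.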
382    
383 root 1.3 =cut
384    
385     sub rcv($@) {
386 root 1.33 my $port = shift;
387     my ($noderef, $portid) = split /#/, $port, 2;
388 root 1.3
389 root 1.58 $NODE{$noderef} == $NODE{""}
390 root 1.33 or Carp::croak "$port: rcv can only be called on local ports, caught";
391 root 1.22
392 root 1.50 while (@_) {
393     if (ref $_[0]) {
394     if (my $self = $PORT_DATA{$portid}) {
395     "AnyEvent::MP::Port" eq ref $self
396     or Carp::croak "$port: rcv can only be called on message matching ports, caught";
397 root 1.33
398 root 1.50 $self->[2] = shift;
399     } else {
400     my $cb = shift;
401     $PORT{$portid} = sub {
402     local $SELF = $port;
403     eval { &$cb }; _self_die if $@;
404     };
405     }
406     } elsif (defined $_[0]) {
407     my $self = $PORT_DATA{$portid} ||= do {
408     my $self = bless [$PORT{$portid} || sub { }, { }, $port], "AnyEvent::MP::Port";
409    
410     $PORT{$portid} = sub {
411     local $SELF = $port;
412    
413     if (my $cb = $self->[1]{$_[0]}) {
414     shift;
415     eval { &$cb }; _self_die if $@;
416     } else {
417     &{ $self->[0] };
418 root 1.33 }
419     };
420 root 1.50
421     $self
422 root 1.33 };
423    
424 root 1.50 "AnyEvent::MP::Port" eq ref $self
425     or Carp::croak "$port: rcv can only be called on message matching ports, caught";
426 root 1.22
427 root 1.50 my ($tag, $cb) = splice @_, 0, 2;
428 root 1.33
429 root 1.50 if (defined $cb) {
430     $self->[1]{$tag} = $cb;
431 root 1.33 } else {
432 root 1.50 delete $self->[1]{$tag};
433 root 1.33 }
434 root 1.22 }
435 root 1.3 }
436 root 1.31
437 root 1.33 $port
438 root 1.2 }
439    
440 root 1.22 =item $closure = psub { BLOCK }
441 root 1.2
442 root 1.22 Remembers C<$SELF> and creates a closure out of the BLOCK. When the
443     closure is executed, sets up the environment in the same way as in C<rcv>
444     callbacks, i.e. runtime errors will cause the port to get C<kil>ed.
445    
446     This is useful when you register callbacks from C<rcv> callbacks:
447    
448     rcv $port, delayed_reply => sub {
449     my ($delay, @reply) = @_;
450     my $timer; $timer = AE::timer $delay, 0, psub {
451     undef $timer; snd @reply, $SELF; # $timer keeps the watcher alive until it fires
452     };
453     };
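
The mechanism behind C<psub> is an ordinary closure that remembers the port and re-establishes C<$SELF> with C<local> when called. The sketch below illustrates that idea in isolation and is an assumption-laden model rather than the real function: C<my_psub> is a hypothetical name, the kill is replaced by an entry in C<@killed>, and only list context is handled, whereas the original distinguishes scalar from list context.

```perl
use strict;
use warnings;

our $SELF;
my @killed;                                   # records simulated kills

# Hypothetical re-implementation of the idea behind psub, with the kill
# replaced by a log entry.
sub my_psub(&) {
    my ($cb) = @_;
    my $port = $SELF
        or die "my_psub must be called while \$SELF is set";

    return sub {
        local $SELF = $port;                  # restore the port context
        my @res = eval { $cb->(@_) };
        push @killed, [$port, die => $@] if $@;
        @res
    };
}

my $later;
{
    local $SELF = "node#port1";               # as if inside a rcv callback
    $later = my_psub { "running as $SELF" };
}

print $later->(), "\n";                       # $SELF is unset out here
```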
454 root 1.3
455 root 1.8 =cut
456 root 1.3
457 root 1.22 sub psub(&) {
458     my $cb = shift;
459 root 1.3
460 root 1.22 my $port = $SELF
461     or Carp::croak "psub can only be called from within rcv or psub callbacks, not";
462 root 1.1
463 root 1.22 sub {
464     local $SELF = $port;
465 root 1.2
466 root 1.22 if (wantarray) {
467     my @res = eval { &$cb };
468     _self_die if $@;
469     @res
470     } else {
471     my $res = eval { &$cb };
472     _self_die if $@;
473     $res
474     }
475     }
476 root 1.2 }
477    
478 root 1.33 =item $guard = mon $port, $cb->(@reason)
479 root 1.32
480 root 1.36 =item $guard = mon $port, $rcvport
481    
482     =item $guard = mon $port
483 root 1.32
484 root 1.36 =item $guard = mon $port, $rcvport, @msg
485 root 1.32
486 root 1.42 Monitor the given port and do something when the port is killed or
487     messages to it were lost, and optionally return a guard that can be used
488     to stop monitoring again.
489    
490     C<mon> effectively guarantees that, in the absence of hardware failures,
491     after starting the monitor, either all messages sent to the port
492     will arrive, or the monitoring action will be invoked after possible
493     message loss has been detected. No messages will be lost "in between"
494     (after the first lost message no further messages will be received by the
495     port). After the monitoring action was invoked, further messages might get
496     delivered again.
497 root 1.32
498 root 1.58 Note that monitoring-actions are one-shot: once released, they are removed
499     and will not trigger again.
500    
501 root 1.36 In the first form (callback), the callback is simply called with any
502     number of C<@reason> elements (no @reason means that the port was deleted
503 root 1.32 "normally"). Note also that I<< the callback B<must> never die >>, so use
504     C<eval> if unsure.
505    
506 root 1.43 In the second form (another port given), the other port (C<$rcvport>)
507 root 1.36 will be C<kil>'ed with C<@reason>, iff a @reason was specified, i.e. on
508     "normal" kils nothing happens, while under all other conditions, the other
509     port is killed with the same reason.
510 root 1.32
511 root 1.36 The third form (kill self) is the same as the second form, except that
512     C<$rcvport> defaults to C<$SELF>.
513    
514     In the last form (message), a message of the form C<@msg, @reason> will be
515     C<snd>.
516 root 1.32
517 root 1.37 As a rule of thumb, monitoring requests should always monitor a port from
518     a local port (or callback). The reason is that kill messages might get
519     lost, just like any other message. Another less obvious reason is that
520     even monitoring requests can get lost (for example, when the connection
521     to the other node goes down permanently). When monitoring a port locally
522     these problems do not exist.
523    
524 root 1.32 Example: call a given callback when C<$port> is killed.
525    
526     mon $port, sub { warn "port died because of <@_>\n" };
527    
528     Example: kill ourselves when C<$port> is killed abnormally.
529    
530 root 1.36 mon $port;
531 root 1.32
532 root 1.36 Example: send us a restart message when another C<$port> is killed.
533 root 1.32
534     mon $port, $self => "restart";
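
All forms of C<mon> end up as a single callback that receives the kill reason. The following sketch shows that reduction for the callback, port and message forms, under stated assumptions: C<monitor_action>, C<fake_snd> and C<fake_kil> are hypothetical stand-ins that only record what would happen, and the form that defaults to C<$SELF> is not covered.

```perl
use strict;
use warnings;

my @actions;                                   # records what would happen
sub fake_snd { push @actions, [snd => @_] }    # stand-ins for snd and kil
sub fake_kil { push @actions, [kil => @_] }

# Hypothetical helper mirroring the argument handling of mon: turn
# "callback", "port" or "port plus message" into one callback.
sub monitor_action {
    my ($cb, @msg) = @_;

    return $cb if ref $cb;                                 # first form
    my $port = $cb;
    return sub { fake_snd $port, @msg, @_ } if @msg;       # last form
    return sub { fake_kil $port, @_ if @_ };               # second form
}

monitor_action ("n#watcher")->();                          # normal kill: nothing
monitor_action ("n#watcher")->(die => "oops");             # abnormal: kill
monitor_action ("n#watcher", "restart")->(die => "oops");  # send message
print join (" ", @$_), "\n" for @actions;
```

Note how the port form does nothing when the reason is empty, which is what makes "normal" kills invisible to linked ports.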
535    
536     =cut
537    
538     sub mon {
539     my ($noderef, $port) = split /#/, shift, 2;
540    
541     my $node = $NODE{$noderef} || add_node $noderef;
542    
543 root 1.41 my $cb = @_ ? shift : $SELF || Carp::croak 'mon: called with one argument only, but $SELF not set,';
544 root 1.32
545     unless (ref $cb) {
546     if (@_) {
547     # send a kill info message
548 root 1.41 my (@msg) = ($cb, @_);
549 root 1.32 $cb = sub { snd @msg, @_ };
550     } else {
551     # simply kill other port
552     my $port = $cb;
553     $cb = sub { kil $port, @_ if @_ };
554     }
555     }
556    
557     $node->monitor ($port, $cb);
558    
559     defined wantarray
560     and AnyEvent::Util::guard { $node->unmonitor ($port, $cb) }
561     }
562    
563     =item $guard = mon_guard $port, $ref, $ref...
564    
565     Monitors the given C<$port> and keeps the passed references. When the port
566     is killed, the references will be freed.
567    
568     Optionally returns a guard that will stop the monitoring.
569    
570     This function is useful when you create e.g. timers or other watchers and
571     want to free them when the port gets killed:
572    
573     rcv $port, start => sub {
574     my $timer; $timer = mon_guard $port, AE::timer 1, 1, sub {
575     undef $timer if 0.9 < rand;
576     };
577     };
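
C<mon_guard> keeps the references alive simply by capturing them in the monitoring callback, so they are freed together with that callback. The sketch below demonstrates this lifetime behaviour on its own. It is only an illustration: C<keep_alive> and C<My::Resource> are hypothetical names, and no port or monitor is involved.

```perl
use strict;
use warnings;

my @freed;

package My::Resource {                       # hypothetical watcher stand-in
    sub new     { bless { name => $_[1] }, $_[0] }
    sub DESTROY { push @freed, $_[0]{name} }
}

# Hypothetical helper: the returned callback does nothing when called,
# but holds on to @refs for as long as it exists.
sub keep_alive {
    my (@refs) = @_;
    return sub { 0 && @refs };
}

my $holder = keep_alive (My::Resource->new ("timer"));
print "alive\n" unless @freed;               # still referenced by $holder
undef $holder;                               # e.g. the monitor was released
print "freed: @freed\n";
```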
578    
579     =cut
580    
581     sub mon_guard {
582     my ($port, @refs) = @_;
583    
584 root 1.36 #TODO: mon-less form?
585    
586 root 1.32 mon $port, sub { 0 && @refs }
587     }
588    
589 root 1.33 =item kil $port[, @reason]
590 root 1.32
591     Kill the specified port with the given C<@reason>.
592    
593     If no C<@reason> is specified, then the port is killed "normally" (linked
594     ports will not be killed, or even notified).
595    
596     Otherwise, linked ports get killed with the same reason (second form of
597     C<mon>, see above).
598    
599     Runtime errors while evaluating C<rcv> callbacks or inside C<psub> blocks
600     will be reported as reason C<< die => $@ >>.
601    
602     Transport/communication errors are reported as C<< transport_error =>
603     $message >>.
604    
605 root 1.38 =cut
606    
607     =item $port = spawn $node, $initfunc[, @initdata]
608    
609     Creates a port on the node C<$node> (which can also be a port ID, in which
610     case it's the node where that port resides).
611    
612     The port ID of the newly created port is returned immediately, and it is
613     permissible to immediately start sending messages or monitor the port.
614    
615     After the port has been created, the init function is
616 root 1.39 called. This function must be a fully-qualified function name
617 root 1.40 (e.g. C<MyApp::Chat::Server::init>). To specify a function in the main
618     program, use C<::name>.
619 root 1.38
620     If the function doesn't exist, then the node tries to C<require>
621     the package, then the package above the package and so on (e.g.
622     C<MyApp::Chat::Server>, C<MyApp::Chat>, C<MyApp>) until the function
623     exists or it runs out of package names.
624    
625     The init function is then called with the newly-created port as context
626     object (C<$SELF>) and the C<@initdata> values as arguments.
627    
628     A common idiom is to pass your own port, monitor the spawned port, and
629     in the init function, monitor the original port. This two-way monitoring
630     ensures that both ports get cleaned up when there is a problem.
631    
632     Example: spawn a chat server port on C<$othernode>.
633    
634     # this node, executed from within a port context:
635     my $server = spawn $othernode, "MyApp::Chat::Server::connect", $SELF;
636     mon $server;
637    
638     # init function on C<$othernode>
639     sub connect {
640     my ($srcport) = @_;
641    
642     mon $srcport;
643    
644     rcv $SELF, sub {
645     ...
646     };
647     }
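
The package lookup order described for the init function (try the full package, then each enclosing package) can be computed from the function name alone. This is a minimal sketch of that ordering, assuming a hypothetical helper named C<candidate_packages>. It only lists the names and does not C<require> anything.

```perl
use strict;
use warnings;

# Hypothetical helper: list the packages that would be tried, in order,
# for a fully-qualified function name.
sub candidate_packages {
    my ($func) = @_;

    my @parts = split /::/, $func;
    pop @parts;                                 # drop the function name itself

    my @candidates;
    while (@parts) {
        push @candidates, join "::", @parts;    # most specific package first
        pop @parts;
    }

    return grep length, @candidates;            # "::name" has no package to load
}

print "$_\n" for candidate_packages "MyApp::Chat::Server::init";
```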
648    
649     =cut
650    
651     sub _spawn {
652     my $port = shift;
653     my $init = shift;
654    
655     local $SELF = "$NODE#$port";
656     eval {
657     &{ load_func $init }
658     };
659     _self_die if $@;
660     }
661    
662     sub spawn(@) {
663     my ($noderef, undef) = split /#/, shift, 2;
664    
665     my $id = "$RUNIQ." . $ID++;
666    
667 root 1.39 $_[0] =~ /::/
668     or Carp::croak "spawn init function must be a fully-qualified name, caught";
669    
670 root 1.55 snd_to_func $noderef, "AnyEvent::MP::_spawn" => $id, @_;
671 root 1.38
672     "$noderef#$id"
673     }
674    
675 root 1.59 =item after $timeout, @msg
676    
677     =item after $timeout, $callback
678    
679     Either sends the given message, or calls the given callback, after the
680     specified number of seconds.
681    
682     This is simply a utility function that comes in handy at times.
683    
684     =cut
685    
686     sub after($@) {
687     my ($timeout, @action) = @_;
688    
689     my $t; $t = AE::timer $timeout, 0, sub {
690     undef $t;
691     ref $action[0]
692     ? $action[0]()
693     : snd @action;
694     };
695     }
696    
697 root 1.8 =back
698    
699 root 1.26 =head1 AnyEvent::MP vs. Distributed Erlang
700    
701 root 1.35 AnyEvent::MP got lots of its ideas from distributed Erlang (Erlang node
702     == aemp node, Erlang process == aemp port), so many of the documents and
703     programming techniques employed by Erlang apply to AnyEvent::MP. Here is a
704 root 1.27 sample:
705    
706 root 1.35 http://www.Erlang.se/doc/programming_rules.shtml
707     http://Erlang.org/doc/getting_started/part_frame.html # chapters 3 and 4
708     http://Erlang.org/download/Erlang-book-part1.pdf # chapters 5 and 6
709     http://Erlang.org/download/armstrong_thesis_2003.pdf # chapters 4 and 5
710 root 1.27
711     Despite the similarities, there are also some important differences:
712 root 1.26
713     =over 4
714    
715     =item * Node references contain the recipe on how to contact them.
716    
717     Erlang relies on special naming and DNS to work everywhere in the
718     same way. AEMP relies on each node knowing its own address(es), with
719     convenience functionality.
720    
721 root 1.27 This means that AEMP requires a less tightly controlled environment at the
722     cost of longer node references and a slightly higher management overhead.
723    
724 root 1.54 =item * Erlang has a "remote ports are like local ports" philosophy, AEMP
725 root 1.51 uses "local ports are like remote ports".
726    
727     The failure modes for local ports are quite different (runtime errors
728     only) than for remote ports - when a local port dies, you I<know> it dies,
729     when a connection to another node dies, you know nothing about the other
730     port.
731    
732     Erlang pretends remote ports are as reliable as local ports, even when
733     they are not.
734    
735     AEMP encourages a "treat remote ports differently" philosophy, with local
736     ports being the special case/exception, where transport errors cannot
737     occur.
738    
739 root 1.26 =item * Erlang uses processes and a mailbox, AEMP does not queue.
740    
741 root 1.51 Erlang uses processes that selectively receive messages, and therefore
742     needs a queue. AEMP is event based, queuing messages would serve no
743     useful purpose. For the same reason the pattern-matching abilities of
744     AnyEvent::MP are more limited, as there is little need to be able to
745     filter messages without dequeuing them.
746 root 1.26
747 root 1.35 (But see L<Coro::MP> for a more Erlang-like process model on top of AEMP).
748 root 1.26
749     =item * Erlang sends are synchronous, AEMP sends are asynchronous.
750    
751 root 1.51 Sending messages in Erlang is synchronous and blocks the process (and
752     so does not need a queue that can overflow). AEMP sends are immediate,
753     connection establishment is handled in the background.
754 root 1.26
755 root 1.51 =item * Erlang suffers from silent message loss, AEMP does not.
756 root 1.26
757     Erlang makes few guarantees on message delivery - messages can get lost
758     without any of the processes realising it (i.e. you send messages a, b,
759     and c, and the other side only receives messages a and c).
760    
761     AEMP guarantees correct ordering, and that there are no
762     holes in the message sequence.
763    
764 root 1.35 =item * In Erlang, processes can be declared dead and later be found to be
765 root 1.26 alive.
766    
767 root 1.35 In Erlang it can happen that a monitored process is declared dead and
768 root 1.26 linked processes get killed, but later it turns out that the process is
769     still alive - and can receive messages.
770    
771     In AEMP, when port monitoring detects a port as dead, then that port will
772     eventually be killed - it cannot happen that a node detects a port as dead
773     and then later sends messages to it, finding it is still alive.
774    
775     =item * Erlang can send messages to the wrong port, AEMP does not.
776    
777 root 1.51 In Erlang it is quite likely that a node that restarts reuses a process ID
778     known to other nodes for a completely different process, causing messages
779     destined for that process to end up in an unrelated process.
780 root 1.26
781     AEMP never reuses port IDs, so old messages or old port IDs floating
782     around in the network will not be sent to an unrelated port.
783    
784     =item * Erlang uses unprotected connections, AEMP uses secure
785     authentication and can use TLS.
786    
787     AEMP can use a proven protocol - SSL/TLS - to protect connections and
788     securely authenticate nodes.
789    
790 root 1.28 =item * The AEMP protocol is optimised for both text-based and binary
791     communications.
792    
793 root 1.35 The AEMP protocol, unlike the Erlang protocol, supports both
794 root 1.28 language-independent text-only protocols (good for debugging) and binary,
795     language-specific serialisers (e.g. Storable).
796    
797     It has also been carefully designed to be implementable in other languages
798     with a minimum of work while gracefully degrading functionality to make the
799     protocol simple.
800    
801 root 1.35 =item * AEMP has more flexible monitoring options than Erlang.
802    
803     In Erlang, you can choose to receive I<all> exit signals as messages
804     or I<none>, there is no in-between, so monitoring single processes is
805     difficult to implement. Monitoring in AEMP is more flexible than in
806     Erlang, as one can choose between automatic kill, exit message or callback
807     on a per-process basis.
808    
809 root 1.37 =item * Erlang tries to hide remote/local connections, AEMP does not.
810 root 1.35
811     Monitoring in Erlang is not an indicator of process death/crashes,
812 root 1.37 as linking is (except linking is unreliable in Erlang).
813    
814     In AEMP, you don't "look up" registered port names or send to named ports
815     that might or might not be persistent. Instead, you normally spawn a port
816     on the remote node. The init function monitors you, and you monitor
817     the remote port. Since both monitors are local to the node, they are much
818     more reliable.
819    
820     This also saves round-trips and avoids sending messages to the wrong port
821     (hard to do in Erlang).
822 root 1.35
823 root 1.26 =back
824    
825 root 1.46 =head1 RATIONALE
826    
827     =over 4
828    
829     =item Why strings for ports and noderefs, why not objects?
830    
831     We considered "objects", but found that the actual number of methods
832     that can be called is very low. Since port IDs and noderefs travel over
833     the network frequently, the serialising/deserialising would add lots of
834     overhead, as well as having to keep a proxy object.
835    
836     Strings can easily be printed, easily serialised etc. and need no special
837     procedures to be "valid".
838    
839 root 1.47 And a miniport consists of a single closure stored in a global hash - it
840     can't become much cheaper.
841    
842 root 1.46 =item Why favour JSON, why not a real serialising format such as Storable?
843    
844     In fact, any AnyEvent::MP node will happily accept Storable as framing
845     format, but currently there is no way to make a node use Storable by
846     default.
847    
848     The default framing protocol is JSON because a) JSON::XS is many times
849     faster for small messages and b) most importantly, after years of
850     experience we found that object serialisation is causing more problems
851     than it solves: Just like function calls, objects simply do not travel
852     easily over the network, mostly because they will always be a copy, so you
853     always have to re-think your design.
854    
855     Keeping your messages simple, concentrating on data structures rather than
856     objects, will keep your messages clean, tidy and efficient.
857    
858     =back
859    
860 root 1.1 =head1 SEE ALSO
861    
862     L<AnyEvent>.
863    
864     =head1 AUTHOR
865    
866     Marc Lehmann <schmorp@schmorp.de>
867     http://home.schmorp.de/
868    
869     =cut
870    
871     1
872