ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/AnyEvent-MP/MP.pm
Revision: 1.80
Committed: Fri Sep 4 22:30:29 2009 UTC (14 years, 8 months ago) by root
Branch: MAIN
Changes since 1.79: +4 -0 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     AnyEvent::MP - multi-processing/message-passing framework
4    
5     =head1 SYNOPSIS
6    
7     use AnyEvent::MP;
8    
9 root 1.75 $NODE # contains this node's node ID
10     NODE # returns this node's node ID
11 root 1.2
12 root 1.38 $SELF # receiving/own port id in rcv callbacks
13    
14 root 1.48 # initialise the node so it can send/receive messages
15 root 1.72 configure;
16 root 1.48
17 root 1.75 # ports are message destinations
18 root 1.38
19     # sending messages
20 root 1.2 snd $port, type => data...;
21 root 1.38 snd $port, @msg;
22     snd @msg_with_first_element_being_a_port;
23 root 1.2
24 root 1.50 # creating/using ports, the simple way
25 root 1.73 my $simple_port = port { my @msg = @_ };
26 root 1.22
27 root 1.52 # creating/using ports, tagged message matching
28 root 1.38 my $port = port;
29 root 1.73 rcv $port, ping => sub { snd $_[0], "pong" };
30     rcv $port, pong => sub { warn "pong received\n" };
31 root 1.2
32 root 1.48 # create a port on another node
33     my $port = spawn $node, $initfunc, @initdata;
34    
35 root 1.35 # monitoring
36     mon $port, $cb->(@msg) # callback is invoked on death
37     mon $port, $otherport # kill otherport on abnormal death
38     mon $port, $otherport, @msg # send message on death
39    
40 root 1.45 =head1 CURRENT STATUS
41    
42 root 1.71 bin/aemp - stable.
43     AnyEvent::MP - stable API, should work.
44 elmex 1.77 AnyEvent::MP::Intro - explains most concepts.
45 root 1.71 AnyEvent::MP::Kernel - mostly stable.
46 root 1.79 AnyEvent::MP::Global - stable but incomplete, protocol not yet final.
47 root 1.45
48 root 1.79 stay tuned.
49 root 1.45
50 root 1.1 =head1 DESCRIPTION
51    
52 root 1.2 This module (-family) implements a simple message passing framework.
53    
54     Despite its simplicity, you can securely message other processes running
55 root 1.67 on the same or other hosts, and you can supervise entities remotely.
56 root 1.2
57 root 1.23 For an introduction to this module family, see the L<AnyEvent::MP::Intro>
58 root 1.67 manual page and the examples under F<eg/>.
59 root 1.23
60 root 1.2 =head1 CONCEPTS
61    
62     =over 4
63    
64     =item port
65    
66 root 1.79 Not to be confused with a TCP port, a "port" is something you can send
67     messages to (with the C<snd> function).
68 root 1.29
69 root 1.53 Ports allow you to register C<rcv> handlers that can match all or just
70 root 1.64 some messages. Messages send to ports will not be queued, regardless of
71     anything was listening for them or not.
72 root 1.2
73 root 1.67 =item port ID - C<nodeid#portname>
74 root 1.2
75 root 1.67 A port ID is the concatenation of a node ID, a hash-mark (C<#>) as
76     separator, and a port name (a printable string of unspecified format).
77 root 1.2
78     =item node
79    
80 root 1.53 A node is a single process containing at least one port - the node port,
81 root 1.67 which enables nodes to manage each other remotely, and to create new
82 root 1.53 ports.
83 root 1.2
84 root 1.67 Nodes are either public (have one or more listening ports) or private
85     (no listening ports). Private nodes cannot talk to other private nodes
86     currently.
87 root 1.2
88 root 1.63 =item node ID - C<[a-za-Z0-9_\-.:]+>
89 root 1.2
90 root 1.64 A node ID is a string that uniquely identifies the node within a
91     network. Depending on the configuration used, node IDs can look like a
92     hostname, a hostname and a port, or a random string. AnyEvent::MP itself
93     doesn't interpret node IDs in any way.
94    
95     =item binds - C<ip:port>
96    
97     Nodes can only talk to each other by creating some kind of connection to
98     each other. To do this, nodes should listen on one or more local transport
99     endpoints - binds. Currently, only standard C<ip:port> specifications can
100     be used, which specify TCP ports to listen on.
101    
102     =item seeds - C<host:port>
103    
104     When a node starts, it knows nothing about the network. To teach the node
105     about the network it first has to contact some other node within the
106     network. This node is called a seed.
107    
108     Seeds are transport endpoint(s) of as many nodes as one wants. Those nodes
109     are expected to be long-running, and at least one of those should always
110     be available. When nodes run out of connections (e.g. due to a network
111     error), they try to re-establish connections to some seednodes again to
112     join the network.
113 root 1.29
114 root 1.67 Apart from being sued for seeding, seednodes are not special in any way -
115     every public node can be a seednode.
116    
117 root 1.2 =back
118    
119 root 1.3 =head1 VARIABLES/FUNCTIONS
120 root 1.2
121     =over 4
122    
123 root 1.1 =cut
124    
125     package AnyEvent::MP;
126    
127 root 1.44 use AnyEvent::MP::Kernel;
128 root 1.2
129 root 1.1 use common::sense;
130    
131 root 1.2 use Carp ();
132    
133 root 1.1 use AE ();
134    
135 root 1.2 use base "Exporter";
136    
137 root 1.44 our $VERSION = $AnyEvent::MP::Kernel::VERSION;
138 root 1.43
139 root 1.8 our @EXPORT = qw(
140 root 1.59 NODE $NODE *SELF node_of after
141 root 1.72 configure
142 root 1.61 snd rcv mon mon_guard kil reg psub spawn
143 root 1.22 port
144 root 1.8 );
145 root 1.2
146 root 1.22 our $SELF;
147    
148     sub _self_die() {
149     my $msg = $@;
150     $msg =~ s/\n+$// unless ref $msg;
151     kil $SELF, die => $msg;
152     }
153    
154     =item $thisnode = NODE / $NODE
155    
156 root 1.67 The C<NODE> function returns, and the C<$NODE> variable contains, the node
157 root 1.64 ID of the node running in the current process. This value is initialised by
158 root 1.72 a call to C<configure>.
159 root 1.22
160 root 1.63 =item $nodeid = node_of $port
161 root 1.22
162 root 1.67 Extracts and returns the node ID from a port ID or a node ID.
163 root 1.34
164 root 1.78 =item configure $profile, key => value...
165    
166 root 1.72 =item configure key => value...
167 root 1.34
168 root 1.64 Before a node can talk to other nodes on the network (i.e. enter
169 root 1.72 "distributed mode") it has to configure itself - the minimum a node needs
170 root 1.64 to know is its own name, and optionally it should know the addresses of
171     some other nodes in the network to discover other nodes.
172 root 1.34
173 root 1.72 This function configures a node - it must be called exactly once (or
174 root 1.34 never) before calling other AnyEvent::MP functions.
175    
176 root 1.72 =over 4
177    
178     =item step 1, gathering configuration from profiles
179    
180     The function first looks up a profile in the aemp configuration (see the
181     L<aemp> commandline utility). The profile name can be specified via the
182 root 1.78 named C<profile> parameter or can simply be the first parameter). If it is
183     missing, then the nodename (F<uname -n>) will be used as profile name.
184 root 1.34
185 root 1.72 The profile data is then gathered as follows:
186 root 1.69
187 elmex 1.77 First, all remaining key => value pairs (all of which are conveniently
188 root 1.72 undocumented at the moment) will be interpreted as configuration
189     data. Then they will be overwritten by any values specified in the global
190     default configuration (see the F<aemp> utility), then the chain of
191     profiles chosen by the profile name (and any C<parent> attributes).
192    
193     That means that the values specified in the profile have highest priority
194     and the values specified directly via C<configure> have lowest priority,
195     and can only be used to specify defaults.
196 root 1.49
197 root 1.64 If the profile specifies a node ID, then this will become the node ID of
198     this process. If not, then the profile name will be used as node ID. The
199     special node ID of C<anon/> will be replaced by a random node ID.
200    
201 root 1.72 =item step 2, bind listener sockets
202    
203 root 1.64 The next step is to look up the binds in the profile, followed by binding
204     aemp protocol listeners on all binds specified (it is possible and valid
205     to have no binds, meaning that the node cannot be contacted form the
206     outside. This means the node cannot talk to other nodes that also have no
207     binds, but it can still talk to all "normal" nodes).
208    
209 root 1.70 If the profile does not specify a binds list, then a default of C<*> is
210 root 1.72 used, meaning the node will bind on a dynamically-assigned port on every
211     local IP address it finds.
212    
213     =item step 3, connect to seed nodes
214 root 1.64
215 root 1.72 As the last step, the seeds list from the profile is passed to the
216 root 1.64 L<AnyEvent::MP::Global> module, which will then use it to keep
217 root 1.72 connectivity with at least one node at any point in time.
218 root 1.64
219 root 1.72 =back
220    
221     Example: become a distributed node using the locla node name as profile.
222     This should be the most common form of invocation for "daemon"-type nodes.
223 root 1.34
224 root 1.72 configure
225 root 1.34
226 root 1.64 Example: become an anonymous node. This form is often used for commandline
227     clients.
228 root 1.34
229 root 1.72 configure nodeid => "anon/";
230    
231     Example: configure a node using a profile called seed, which si suitable
232     for a seed node as it binds on all local addresses on a fixed port (4040,
233     customary for aemp).
234    
235     # use the aemp commandline utility
236 root 1.74 # aemp profile seed nodeid anon/ binds '*:4040'
237 root 1.72
238     # then use it
239     configure profile => "seed";
240 root 1.34
241 root 1.72 # or simply use aemp from the shell again:
242     # aemp run profile seed
243 root 1.34
244 root 1.72 # or provide a nicer-to-remember nodeid
245     # aemp run profile seed nodeid "$(hostname)"
246 root 1.34
247 root 1.22 =item $SELF
248    
249     Contains the current port id while executing C<rcv> callbacks or C<psub>
250     blocks.
251 root 1.3
252 root 1.67 =item *SELF, SELF, %SELF, @SELF...
253 root 1.22
254     Due to some quirks in how perl exports variables, it is impossible to
255 root 1.67 just export C<$SELF>, all the symbols named C<SELF> are exported by this
256 root 1.22 module, but only C<$SELF> is currently used.
257 root 1.3
258 root 1.33 =item snd $port, type => @data
259 root 1.3
260 root 1.33 =item snd $port, @msg
261 root 1.3
262 root 1.67 Send the given message to the given port, which can identify either a
263     local or a remote port, and must be a port ID.
264 root 1.8
265 root 1.67 While the message can be almost anything, it is highly recommended to
266     use a string as first element (a port ID, or some word that indicates a
267     request type etc.) and to consist if only simple perl values (scalars,
268     arrays, hashes) - if you think you need to pass an object, think again.
269    
270     The message data logically becomes read-only after a call to this
271     function: modifying any argument (or values referenced by them) is
272     forbidden, as there can be considerable time between the call to C<snd>
273     and the time the message is actually being serialised - in fact, it might
274     never be copied as within the same process it is simply handed to the
275     receiving port.
276 root 1.3
277     The type of data you can transfer depends on the transport protocol: when
278     JSON is used, then only strings, numbers and arrays and hashes consisting
279     of those are allowed (no objects). When Storable is used, then anything
280     that Storable can serialise and deserialise is allowed, and for the local
281 root 1.67 node, anything can be passed. Best rely only on the common denominator of
282     these.
283 root 1.3
284 root 1.22 =item $local_port = port
285 root 1.2
286 root 1.50 Create a new local port object and returns its port ID. Initially it has
287     no callbacks set and will throw an error when it receives messages.
288 root 1.10
289 root 1.50 =item $local_port = port { my @msg = @_ }
290 root 1.15
291 root 1.50 Creates a new local port, and returns its ID. Semantically the same as
292     creating a port and calling C<rcv $port, $callback> on it.
293 root 1.15
294 root 1.50 The block will be called for every message received on the port, with the
295     global variable C<$SELF> set to the port ID. Runtime errors will cause the
296     port to be C<kil>ed. The message will be passed as-is, no extra argument
297     (i.e. no port ID) will be passed to the callback.
298 root 1.15
299 root 1.50 If you want to stop/destroy the port, simply C<kil> it:
300 root 1.15
301 root 1.50 my $port = port {
302     my @msg = @_;
303     ...
304     kil $SELF;
305 root 1.15 };
306 root 1.10
307     =cut
308    
309 root 1.33 sub rcv($@);
310    
311 root 1.50 sub _kilme {
312     die "received message on port without callback";
313     }
314    
315 root 1.22 sub port(;&) {
316     my $id = "$UNIQ." . $ID++;
317     my $port = "$NODE#$id";
318    
319 root 1.50 rcv $port, shift || \&_kilme;
320 root 1.10
321 root 1.22 $port
322 root 1.10 }
323    
324 root 1.50 =item rcv $local_port, $callback->(@msg)
325 root 1.31
326 root 1.50 Replaces the default callback on the specified port. There is no way to
327     remove the default callback: use C<sub { }> to disable it, or better
328     C<kil> the port when it is no longer needed.
329 root 1.3
330 root 1.33 The global C<$SELF> (exported by this module) contains C<$port> while
331 root 1.50 executing the callback. Runtime errors during callback execution will
332     result in the port being C<kil>ed.
333 root 1.22
334 root 1.50 The default callback received all messages not matched by a more specific
335     C<tag> match.
336 root 1.22
337 root 1.50 =item rcv $local_port, tag => $callback->(@msg_without_tag), ...
338 root 1.3
339 root 1.54 Register (or replace) callbacks to be called on messages starting with the
340     given tag on the given port (and return the port), or unregister it (when
341     C<$callback> is C<$undef> or missing). There can only be one callback
342     registered for each tag.
343 root 1.3
344 root 1.50 The original message will be passed to the callback, after the first
345     element (the tag) has been removed. The callback will use the same
346     environment as the default callback (see above).
347 root 1.3
348 root 1.36 Example: create a port and bind receivers on it in one go.
349    
350     my $port = rcv port,
351 root 1.50 msg1 => sub { ... },
352     msg2 => sub { ... },
353 root 1.36 ;
354    
355     Example: create a port, bind receivers and send it in a message elsewhere
356     in one go:
357    
358     snd $otherport, reply =>
359     rcv port,
360 root 1.50 msg1 => sub { ... },
361 root 1.36 ...
362     ;
363    
364 root 1.54 Example: temporarily register a rcv callback for a tag matching some port
365     (e.g. for a rpc reply) and unregister it after a message was received.
366    
367     rcv $port, $otherport => sub {
368     my @reply = @_;
369    
370     rcv $SELF, $otherport;
371     };
372    
373 root 1.3 =cut
374    
375     sub rcv($@) {
376 root 1.33 my $port = shift;
377 root 1.75 my ($nodeid, $portid) = split /#/, $port, 2;
378 root 1.3
379 root 1.75 $NODE{$nodeid} == $NODE{""}
380 root 1.33 or Carp::croak "$port: rcv can only be called on local ports, caught";
381 root 1.22
382 root 1.50 while (@_) {
383     if (ref $_[0]) {
384     if (my $self = $PORT_DATA{$portid}) {
385     "AnyEvent::MP::Port" eq ref $self
386     or Carp::croak "$port: rcv can only be called on message matching ports, caught";
387 root 1.33
388 root 1.50 $self->[2] = shift;
389     } else {
390     my $cb = shift;
391     $PORT{$portid} = sub {
392     local $SELF = $port;
393     eval { &$cb }; _self_die if $@;
394     };
395     }
396     } elsif (defined $_[0]) {
397     my $self = $PORT_DATA{$portid} ||= do {
398     my $self = bless [$PORT{$port} || sub { }, { }, $port], "AnyEvent::MP::Port";
399    
400     $PORT{$portid} = sub {
401     local $SELF = $port;
402    
403     if (my $cb = $self->[1]{$_[0]}) {
404     shift;
405     eval { &$cb }; _self_die if $@;
406     } else {
407     &{ $self->[0] };
408 root 1.33 }
409     };
410 root 1.50
411     $self
412 root 1.33 };
413    
414 root 1.50 "AnyEvent::MP::Port" eq ref $self
415     or Carp::croak "$port: rcv can only be called on message matching ports, caught";
416 root 1.22
417 root 1.50 my ($tag, $cb) = splice @_, 0, 2;
418 root 1.33
419 root 1.50 if (defined $cb) {
420     $self->[1]{$tag} = $cb;
421 root 1.33 } else {
422 root 1.50 delete $self->[1]{$tag};
423 root 1.33 }
424 root 1.22 }
425 root 1.3 }
426 root 1.31
427 root 1.33 $port
428 root 1.2 }
429    
430 root 1.22 =item $closure = psub { BLOCK }
431 root 1.2
432 root 1.22 Remembers C<$SELF> and creates a closure out of the BLOCK. When the
433     closure is executed, sets up the environment in the same way as in C<rcv>
434     callbacks, i.e. runtime errors will cause the port to get C<kil>ed.
435    
436     This is useful when you register callbacks from C<rcv> callbacks:
437    
438     rcv delayed_reply => sub {
439     my ($delay, @reply) = @_;
440     my $timer = AE::timer $delay, 0, psub {
441     snd @reply, $SELF;
442     };
443     };
444 root 1.3
445 root 1.8 =cut
446 root 1.3
447 root 1.22 sub psub(&) {
448     my $cb = shift;
449 root 1.3
450 root 1.22 my $port = $SELF
451     or Carp::croak "psub can only be called from within rcv or psub callbacks, not";
452 root 1.1
453 root 1.22 sub {
454     local $SELF = $port;
455 root 1.2
456 root 1.22 if (wantarray) {
457     my @res = eval { &$cb };
458     _self_die if $@;
459     @res
460     } else {
461     my $res = eval { &$cb };
462     _self_die if $@;
463     $res
464     }
465     }
466 root 1.2 }
467    
468 root 1.67 =item $guard = mon $port, $cb->(@reason) # call $cb when $port dies
469 root 1.32
470 root 1.67 =item $guard = mon $port, $rcvport # kill $rcvport when $port dies
471 root 1.36
472 root 1.67 =item $guard = mon $port # kill $SELF when $port dies
473 root 1.32
474 root 1.67 =item $guard = mon $port, $rcvport, @msg # send a message when $port dies
475 root 1.32
476 root 1.42 Monitor the given port and do something when the port is killed or
477     messages to it were lost, and optionally return a guard that can be used
478     to stop monitoring again.
479    
480 root 1.36 In the first form (callback), the callback is simply called with any
481     number of C<@reason> elements (no @reason means that the port was deleted
482 root 1.32 "normally"). Note also that I<< the callback B<must> never die >>, so use
483     C<eval> if unsure.
484    
485 root 1.43 In the second form (another port given), the other port (C<$rcvport>)
486 elmex 1.77 will be C<kil>'ed with C<@reason>, if a @reason was specified, i.e. on
487 root 1.36 "normal" kils nothing happens, while under all other conditions, the other
488     port is killed with the same reason.
489 root 1.32
490 root 1.36 The third form (kill self) is the same as the second form, except that
491     C<$rvport> defaults to C<$SELF>.
492    
493     In the last form (message), a message of the form C<@msg, @reason> will be
494     C<snd>.
495 root 1.32
496 root 1.79 Monitoring-actions are one-shot: once messages are lost (and a monitoring
497     alert was raised), they are removed and will not trigger again.
498    
499 root 1.37 As a rule of thumb, monitoring requests should always monitor a port from
500     a local port (or callback). The reason is that kill messages might get
501     lost, just like any other message. Another less obvious reason is that
502 elmex 1.77 even monitoring requests can get lost (for example, when the connection
503 root 1.37 to the other node goes down permanently). When monitoring a port locally
504     these problems do not exist.
505    
506 root 1.79 C<mon> effectively guarantees that, in the absence of hardware failures,
507     after starting the monitor, either all messages sent to the port will
508     arrive, or the monitoring action will be invoked after possible message
509     loss has been detected. No messages will be lost "in between" (after
510     the first lost message no further messages will be received by the
511     port). After the monitoring action was invoked, further messages might get
512     delivered again.
513    
514     Inter-host-connection timeouts and monitoring depend on the transport
515     used. The only transport currently implemented is TCP, and AnyEvent::MP
516     relies on TCP to detect node-downs (this can take 10-15 minutes on a
517     non-idle connection, and usually around two hours for idle conenctions).
518    
519     This means that monitoring is good for program errors and cleaning up
520     stuff eventually, but they are no replacement for a timeout when you need
521     to ensure some maximum latency.
522    
523 root 1.32 Example: call a given callback when C<$port> is killed.
524    
525     mon $port, sub { warn "port died because of <@_>\n" };
526    
527     Example: kill ourselves when C<$port> is killed abnormally.
528    
529 root 1.36 mon $port;
530 root 1.32
531 root 1.36 Example: send us a restart message when another C<$port> is killed.
532 root 1.32
533     mon $port, $self => "restart";
534    
535     =cut
536    
537     sub mon {
538 root 1.75 my ($nodeid, $port) = split /#/, shift, 2;
539 root 1.32
540 root 1.75 my $node = $NODE{$nodeid} || add_node $nodeid;
541 root 1.32
542 root 1.41 my $cb = @_ ? shift : $SELF || Carp::croak 'mon: called with one argument only, but $SELF not set,';
543 root 1.32
544     unless (ref $cb) {
545     if (@_) {
546     # send a kill info message
547 root 1.41 my (@msg) = ($cb, @_);
548 root 1.32 $cb = sub { snd @msg, @_ };
549     } else {
550     # simply kill other port
551     my $port = $cb;
552     $cb = sub { kil $port, @_ if @_ };
553     }
554     }
555    
556     $node->monitor ($port, $cb);
557    
558     defined wantarray
559     and AnyEvent::Util::guard { $node->unmonitor ($port, $cb) }
560     }
561    
562     =item $guard = mon_guard $port, $ref, $ref...
563    
564     Monitors the given C<$port> and keeps the passed references. When the port
565     is killed, the references will be freed.
566    
567     Optionally returns a guard that will stop the monitoring.
568    
569     This function is useful when you create e.g. timers or other watchers and
570 root 1.67 want to free them when the port gets killed (note the use of C<psub>):
571 root 1.32
572     $port->rcv (start => sub {
573 root 1.67 my $timer; $timer = mon_guard $port, AE::timer 1, 1, psub {
574 root 1.32 undef $timer if 0.9 < rand;
575     });
576     });
577    
578     =cut
579    
580     sub mon_guard {
581     my ($port, @refs) = @_;
582    
583 root 1.36 #TODO: mon-less form?
584    
585 root 1.32 mon $port, sub { 0 && @refs }
586     }
587    
588 root 1.33 =item kil $port[, @reason]
589 root 1.32
590     Kill the specified port with the given C<@reason>.
591    
592 root 1.67 If no C<@reason> is specified, then the port is killed "normally" (ports
593     monitoring other ports will not necessarily die because a port dies
594     "normally").
595 root 1.32
596     Otherwise, linked ports get killed with the same reason (second form of
597 root 1.67 C<mon>, see above).
598 root 1.32
599     Runtime errors while evaluating C<rcv> callbacks or inside C<psub> blocks
600     will be reported as reason C<< die => $@ >>.
601    
602     Transport/communication errors are reported as C<< transport_error =>
603     $message >>.
604    
605 root 1.38 =cut
606    
607     =item $port = spawn $node, $initfunc[, @initdata]
608    
609     Creates a port on the node C<$node> (which can also be a port ID, in which
610     case it's the node where that port resides).
611    
612 root 1.67 The port ID of the newly created port is returned immediately, and it is
613     possible to immediately start sending messages or to monitor the port.
614 root 1.38
615 root 1.67 After the port has been created, the init function is called on the remote
616     node, in the same context as a C<rcv> callback. This function must be a
617     fully-qualified function name (e.g. C<MyApp::Chat::Server::init>). To
618     specify a function in the main program, use C<::name>.
619 root 1.38
620     If the function doesn't exist, then the node tries to C<require>
621     the package, then the package above the package and so on (e.g.
622     C<MyApp::Chat::Server>, C<MyApp::Chat>, C<MyApp>) until the function
623     exists or it runs out of package names.
624    
625     The init function is then called with the newly-created port as context
626     object (C<$SELF>) and the C<@initdata> values as arguments.
627    
628 root 1.67 A common idiom is to pass a local port, immediately monitor the spawned
629     port, and in the remote init function, immediately monitor the passed
630     local port. This two-way monitoring ensures that both ports get cleaned up
631     when there is a problem.
632 root 1.38
633 root 1.80 C<spawn> guarantees that the C<$initfunc> has no visible effects on the
634     caller before C<spawn> returns (by delaying invocation when spawn is
635     called for the local node).
636    
637 root 1.38 Example: spawn a chat server port on C<$othernode>.
638    
639     # this node, executed from within a port context:
640     my $server = spawn $othernode, "MyApp::Chat::Server::connect", $SELF;
641     mon $server;
642    
643     # init function on C<$othernode>
644     sub connect {
645     my ($srcport) = @_;
646    
647     mon $srcport;
648    
649     rcv $SELF, sub {
650     ...
651     };
652     }
653    
654     =cut
655    
656     sub _spawn {
657     my $port = shift;
658     my $init = shift;
659    
660     local $SELF = "$NODE#$port";
661     eval {
662     &{ load_func $init }
663     };
664     _self_die if $@;
665     }
666    
667     sub spawn(@) {
668 root 1.75 my ($nodeid, undef) = split /#/, shift, 2;
669 root 1.38
670     my $id = "$RUNIQ." . $ID++;
671    
672 root 1.39 $_[0] =~ /::/
673     or Carp::croak "spawn init function must be a fully-qualified name, caught";
674    
675 root 1.75 snd_to_func $nodeid, "AnyEvent::MP::_spawn" => $id, @_;
676 root 1.38
677 root 1.75 "$nodeid#$id"
678 root 1.38 }
679    
680 root 1.59 =item after $timeout, @msg
681    
682     =item after $timeout, $callback
683    
684     Either sends the given message, or call the given callback, after the
685     specified number of seconds.
686    
687 root 1.67 This is simply a utility function that comes in handy at times - the
688     AnyEvent::MP author is not convinced of the wisdom of having it, though,
689     so it may go away in the future.
690 root 1.59
691     =cut
692    
693     sub after($@) {
694     my ($timeout, @action) = @_;
695    
696     my $t; $t = AE::timer $timeout, 0, sub {
697     undef $t;
698     ref $action[0]
699     ? $action[0]()
700     : snd @action;
701     };
702     }
703    
704 root 1.8 =back
705    
706 root 1.26 =head1 AnyEvent::MP vs. Distributed Erlang
707    
708 root 1.35 AnyEvent::MP got lots of its ideas from distributed Erlang (Erlang node
709     == aemp node, Erlang process == aemp port), so many of the documents and
710     programming techniques employed by Erlang apply to AnyEvent::MP. Here is a
711 root 1.27 sample:
712    
713 root 1.35 http://www.Erlang.se/doc/programming_rules.shtml
714     http://Erlang.org/doc/getting_started/part_frame.html # chapters 3 and 4
715     http://Erlang.org/download/Erlang-book-part1.pdf # chapters 5 and 6
716     http://Erlang.org/download/armstrong_thesis_2003.pdf # chapters 4 and 5
717 root 1.27
718     Despite the similarities, there are also some important differences:
719 root 1.26
720     =over 4
721    
722 root 1.65 =item * Node IDs are arbitrary strings in AEMP.
723 root 1.26
724 root 1.65 Erlang relies on special naming and DNS to work everywhere in the same
725     way. AEMP relies on each node somehow knowing its own address(es) (e.g. by
726 elmex 1.77 configuration or DNS), but will otherwise discover other odes itself.
727 root 1.27
728 root 1.54 =item * Erlang has a "remote ports are like local ports" philosophy, AEMP
729 root 1.51 uses "local ports are like remote ports".
730    
731     The failure modes for local ports are quite different (runtime errors
732     only) then for remote ports - when a local port dies, you I<know> it dies,
733     when a connection to another node dies, you know nothing about the other
734     port.
735    
736     Erlang pretends remote ports are as reliable as local ports, even when
737     they are not.
738    
739     AEMP encourages a "treat remote ports differently" philosophy, with local
740     ports being the special case/exception, where transport errors cannot
741     occur.
742    
743 root 1.26 =item * Erlang uses processes and a mailbox, AEMP does not queue.
744    
745 root 1.51 Erlang uses processes that selectively receive messages, and therefore
746     needs a queue. AEMP is event based, queuing messages would serve no
747     useful purpose. For the same reason the pattern-matching abilities of
748     AnyEvent::MP are more limited, as there is little need to be able to
749 elmex 1.77 filter messages without dequeuing them.
750 root 1.26
751 root 1.35 (But see L<Coro::MP> for a more Erlang-like process model on top of AEMP).
752 root 1.26
753     =item * Erlang sends are synchronous, AEMP sends are asynchronous.
754    
755 root 1.51 Sending messages in Erlang is synchronous and blocks the process (and
756     so does not need a queue that can overflow). AEMP sends are immediate,
757     connection establishment is handled in the background.
758 root 1.26
759 root 1.51 =item * Erlang suffers from silent message loss, AEMP does not.
760 root 1.26
761     Erlang makes few guarantees on messages delivery - messages can get lost
762     without any of the processes realising it (i.e. you send messages a, b,
763     and c, and the other side only receives messages a and c).
764    
765 root 1.66 AEMP guarantees correct ordering, and the guarantee that after one message
766     is lost, all following ones sent to the same port are lost as well, until
767     monitoring raises an error, so there are no silent "holes" in the message
768     sequence.
769 root 1.26
770     =item * Erlang can send messages to the wrong port, AEMP does not.
771    
772 root 1.51 In Erlang it is quite likely that a node that restarts reuses a process ID
773     known to other nodes for a completely different process, causing messages
774     destined for that process to end up in an unrelated process.
775 root 1.26
776     AEMP never reuses port IDs, so old messages or old port IDs floating
777     around in the network will not be sent to an unrelated port.
778    
779     =item * Erlang uses unprotected connections, AEMP uses secure
780     authentication and can use TLS.
781    
782 root 1.66 AEMP can use a proven protocol - TLS - to protect connections and
783 root 1.26 securely authenticate nodes.
784    
785 root 1.28 =item * The AEMP protocol is optimised for both text-based and binary
786     communications.
787    
788 root 1.66 The AEMP protocol, unlike the Erlang protocol, supports both programming
789     language independent text-only protocols (good for debugging) and binary,
790 root 1.67 language-specific serialisers (e.g. Storable). By default, unless TLS is
791     used, the protocol is actually completely text-based.
792 root 1.28
793     It has also been carefully designed to be implementable in other languages
794 root 1.66 with a minimum of work while gracefully degrading functionality to make the
795 root 1.28 protocol simple.
796    
797 root 1.35 =item * AEMP has more flexible monitoring options than Erlang.
798    
799     In Erlang, you can chose to receive I<all> exit signals as messages
800     or I<none>, there is no in-between, so monitoring single processes is
801     difficult to implement. Monitoring in AEMP is more flexible than in
802     Erlang, as one can choose between automatic kill, exit message or callback
803     on a per-process basis.
804    
805 root 1.37 =item * Erlang tries to hide remote/local connections, AEMP does not.
806 root 1.35
807 root 1.67 Monitoring in Erlang is not an indicator of process death/crashes, in the
808     same way as linking is (except linking is unreliable in Erlang).
809 root 1.37
810     In AEMP, you don't "look up" registered port names or send to named ports
811     that might or might not be persistent. Instead, you normally spawn a port
812 root 1.67 on the remote node. The init function monitors you, and you monitor the
813     remote port. Since both monitors are local to the node, they are much more
814     reliable (no need for C<spawn_link>).
815 root 1.37
816     This also saves round-trips and avoids sending messages to the wrong port
817     (hard to do in Erlang).
818 root 1.35
819 root 1.26 =back
820    
821 root 1.46 =head1 RATIONALE
822    
823     =over 4
824    
825 root 1.67 =item Why strings for port and node IDs, why not objects?
826 root 1.46
827     We considered "objects", but found that the actual number of methods
828 root 1.67 that can be called are quite low. Since port and node IDs travel over
829 root 1.46 the network frequently, the serialising/deserialising would add lots of
830 root 1.67 overhead, as well as having to keep a proxy object everywhere.
831 root 1.46
832     Strings can easily be printed, easily serialised etc. and need no special
833     procedures to be "valid".
834    
835 root 1.67 And as a result, a miniport consists of a single closure stored in a
836     global hash - it can't become much cheaper.
837 root 1.47
838 root 1.67 =item Why favour JSON, why not a real serialising format such as Storable?
839 root 1.46
840     In fact, any AnyEvent::MP node will happily accept Storable as framing
841     format, but currently there is no way to make a node use Storable by
842 root 1.67 default (although all nodes will accept it).
843 root 1.46
844     The default framing protocol is JSON because a) JSON::XS is many times
845     faster for small messages and b) most importantly, after years of
846     experience we found that object serialisation is causing more problems
847 root 1.67 than it solves: Just like function calls, objects simply do not travel
848 root 1.46 easily over the network, mostly because they will always be a copy, so you
849     always have to re-think your design.
850    
851     Keeping your messages simple, concentrating on data structures rather than
852     objects, will keep your messages clean, tidy and efficient.
853    
854     =back
855    
856 root 1.1 =head1 SEE ALSO
857    
858 root 1.68 L<AnyEvent::MP::Intro> - a gentle introduction.
859    
860     L<AnyEvent::MP::Kernel> - more, lower-level, stuff.
861    
862     L<AnyEvent::MP::Global> - network maintainance and port groups, to find
863     your applications.
864    
865 root 1.1 L<AnyEvent>.
866    
867     =head1 AUTHOR
868    
869     Marc Lehmann <schmorp@schmorp.de>
870     http://home.schmorp.de/
871    
872     =cut
873    
874     1
875