--- AnyEvent-MP/MP.pm	2009/09/07 18:42:09	1.82
+++ AnyEvent-MP/MP.pm	2012/03/12 10:34:06	1.133
@@ -1,6 +1,6 @@
 =head1 NAME
 
-AnyEvent::MP - multi-processing/message-passing framework
+AnyEvent::MP - erlang-style multi-processing/message-passing framework
 
 =head1 SYNOPSIS
 
@@ -32,20 +32,30 @@
    # create a port on another node
    my $port = spawn $node, $initfunc, @initdata;
 
+   # destroy a port again
+   kil $port;  # "normal" kill
+   kil $port, my_error => "everything is broken"; # error kill
+
    # monitoring
    mon $port, $cb->(@msg)      # callback is invoked on death
-   mon $port, $otherport       # kill otherport on abnormal death
-   mon $port, $otherport, @msg # send message on death
+   mon $port, $localport       # kill localport on abnormal death
+   mon $port, $localport, @msg # send message on death
+
+   # temporarily execute code in port context
+   peval $port, sub { die "kill the port!" };
+
+   # execute callbacks in $SELF port context
+   my $timer = AE::timer 1, 0, psub {
+      die "kill the port, delayed";
+   };
 
 =head1 CURRENT STATUS
 
    bin/aemp                - stable.
    AnyEvent::MP            - stable API, should work.
    AnyEvent::MP::Intro     - explains most concepts.
-   AnyEvent::MP::Kernel    - mostly stable.
-   AnyEvent::MP::Global    - stable but incomplete, protocol not yet final.
-
-stay tuned.
+   AnyEvent::MP::Kernel    - mostly stable API.
+   AnyEvent::MP::Global    - stable API.
 
 =head1 DESCRIPTION
 
@@ -70,10 +80,13 @@
 some messages. Messages send to ports will not be queued, regardless of
 anything was listening for them or not.
 
+Ports are represented by (printable) strings called "port IDs".
+
 =item port ID - C<nodeid#portname>
 
-A port ID is the concatenation of a node ID, a hash-mark (C<#>) as
-separator, and a port name (a printable string of unspecified format).
+A port ID is the concatenation of a node ID, a hash-mark (C<#>)
+as separator, and a port name (a printable string of unspecified
+format created by AnyEvent::MP).
 
 =item node
 
@@ -83,36 +96,77 @@
 
 Nodes are either public (have one or more listening ports) or private
 (no listening ports). Private nodes cannot talk to other private nodes
-currently.
+currently, but all nodes can talk to public nodes.
 
-=item node ID - C<[a-za-Z0-9_\-.:]+>
+Nodes are represented by (printable) strings called "node IDs".
+
+=item node ID - C<[A-Za-z0-9_\-.:]*>
 
 A node ID is a string that uniquely identifies the node within a
 network. Depending on the configuration used, node IDs can look like a
 hostname, a hostname and a port, or a random string. AnyEvent::MP itself
-doesn't interpret node IDs in any way.
+doesn't interpret node IDs in any way except to uniquely identify a node.
 
 =item binds - C<ip:port>
 
 Nodes can only talk to each other by creating some kind of connection to
 each other. To do this, nodes should listen on one or more local transport
-endpoints - binds. Currently, only standard C<ip:port> specifications can
-be used, which specify TCP ports to listen on.
-
-=item seeds - C<host:port>
+endpoints - binds.
 
-When a node starts, it knows nothing about the network. To teach the node
-about the network it first has to contact some other node within the
-network. This node is called a seed.
-
-Seeds are transport endpoint(s) of as many nodes as one wants. Those nodes
-are expected to be long-running, and at least one of those should always
-be available. When nodes run out of connections (e.g. due to a network
-error), they try to re-establish connections to some seednodes again to
-join the network.
-
-Apart from being sued for seeding, seednodes are not special in any way -
-every public node can be a seednode.
+Currently, only standard C<ip:port> specifications can be used, which
+specify TCP ports to listen on. So a bind is basically just a TCP socket
+in listening mode that accepts connections from other nodes.
+
+=item seed nodes
+
+When a node starts, it knows nothing about the network it is in - it
+needs to connect to at least one other node that is already in the
+network. These other nodes are called "seed nodes".
+
+Seed nodes themselves are not special - they are seed nodes only because
+some other node I<uses> them as such, but any node can be used as seed
+node for other nodes, and each node can use a different set of seed nodes.
+
+In addition to discovering the network, seed nodes are also used to
+maintain the network - all nodes using the same seed node are part of
+the same network. If a network is split into multiple subnets because e.g.
+the network link between the parts goes down, then using the same seed
+nodes for all nodes ensures that eventually the subnets get merged again.
+
+Seed nodes are expected to be long-running, and at least one seed node
+should always be available. They should also be relatively responsive - a
+seed node that blocks for long periods will slow down everybody else.
+
+For small networks, it's best if every node uses the same set of seed
+nodes. For large networks, it can be useful to specify "regional" seed
+nodes for most nodes in an area, and use all seed nodes as seed nodes for
+each other. What's important is that all seed node connections form a
+complete graph, so that the network cannot split into separate subnets
+forever.
+
+Seed nodes are represented by seed IDs.
+
+=item seed IDs - C<host:port>
+
+Seed IDs are transport endpoint(s) (usually a hostname/IP address and a
+TCP port) of nodes that should be used as seed nodes.
+
+=item global nodes
+
+An AEMP network needs a discovery service - nodes need to know how to
+connect to other nodes they only know by name. In addition, AEMP offers a
+distributed "group database", which maps group names to a list of strings
+- for example, to register worker ports.
+
+A network needs at least one global node to work, and allows every node to
+be a global node.
+
+Any node that loads the L<AnyEvent::MP::Global> module becomes a global
+node and tries to keep connections to all other nodes. So while it can
+make sense to make every node "global" in small networks, it usually makes
+sense to only make seed nodes into global nodes in large networks (nodes
+keep connections to seed nodes and global nodes, so making them the same
+reduces overhead).
 
 =back
 
@@ -124,23 +178,28 @@
 
 package AnyEvent::MP;
 
+use AnyEvent::MP::Config ();
 use AnyEvent::MP::Kernel;
+use AnyEvent::MP::Kernel qw(%NODE %PORT %PORT_DATA $UNIQ $RUNIQ $ID);
 
 use common::sense;
 
 use Carp ();
 
 use AE ();
+use Guard ();
 
 use base "Exporter";
 
-our $VERSION = $AnyEvent::MP::Kernel::VERSION;
+our $VERSION = $AnyEvent::MP::Config::VERSION;
 
 our @EXPORT = qw(
    NODE $NODE *SELF node_of after
    configure
-   snd rcv mon mon_guard kil reg psub spawn
+   snd rcv mon mon_guard kil psub peval spawn cal
    port
+   db_set db_del db_reg
+   db_mon db_family db_keys db_values
 );
 
 our $SELF;
@@ -173,6 +232,33 @@
 This function configures a node - it must be called exactly once (or
 never) before calling other AnyEvent::MP functions.
 
+The key/value pairs are basically the same ones as documented for the
+F<aemp> command line utility (sans the set/del prefix), with these additions:
+
+=over 4
+
+=item norc => $boolean (default false)
+
+If true, then the rc file (e.g. F<~/.perl-anyevent-mp>) will I<not>
+be consulted - all configuration options must be specified in the
+C<configure> call.
+
+=item force => $boolean (default false)
+
+If true, then the values specified in the C<configure> call will take
+precedence over any values configured via the rc file. The default is for
+the rc file to override any options specified in the program.
+
+=item secure => $pass->($nodeid)
+
+In addition to specifying a boolean, you can specify a code reference that
+is called for every remote execution attempt - the execution request is
+granted iff the callback returns a true value.
+
+See F<aemp setsecure> for more info.
+
+=back
+
 =over 4
 
 =item step 1, gathering configuration from profiles
@@ -195,8 +281,14 @@
 and can only be used to specify defaults.
 
 If the profile specifies a node ID, then this will become the node ID of
-this process. If not, then the profile name will be used as node ID. The
-special node ID of C<anon/> will be replaced by a random node ID.
+this process. If not, then the profile name will be used as node ID, with
+a unique random string (C</%u>) appended.
+
+The node ID can contain some C<%> sequences that are expanded: C<%n>
+is expanded to the local nodename, C<%u> is replaced by a random
+string to make the node unique. For example, the F<aemp> commandline
+utility uses C<aemp/%n/%u> as nodename, which might expand to
+C<aemp/cerebro/ZQDGSIkRhEZQDGSIkRhE>.
 
 =item step 2, bind listener sockets
 
@@ -212,28 +304,28 @@
 
 =item step 3, connect to seed nodes
 
-As the last step, the seeds list from the profile is passed to the
+As the last step, the seed ID list from the profile is passed to the
 L<AnyEvent::MP::Global> module, which will then use it to keep
 connectivity with at least one node at any point in time.
 
 =back
 
-Example: become a distributed node using the locla node name as profile.
+Example: become a distributed node using the local node name as profile.
 This should be the most common form of invocation for "daemon"-type nodes.
 
    configure
 
-Example: become an anonymous node. This form is often used for commandline
-clients.
+Example: become a semi-anonymous node. This form is often used for
+commandline clients.
 
-   configure nodeid => "anon/";
+   configure nodeid => "myscript/%n/%u";
 
-Example: configure a node using a profile called seed, which si suitable
+Example: configure a node using a profile called seed, which is suitable
 for a seed node as it binds on all local addresses on a fixed port (4040,
 customary for aemp).
 
    # use the aemp commandline utility
-   # aemp profile seed nodeid anon/ binds '*:4040'
+   # aemp profile seed binds '*:4040'
 
    # then use it
    configure profile => "seed";
@@ -308,15 +400,16 @@
 
 sub rcv($@);
 
-sub _kilme {
-   die "received message on port without callback";
-}
+my $KILME = sub {
+   (my $tag = substr $_[0], 0, 30) =~ s/([^\x20-\x7e])/./g;
+   kil $SELF, unhandled_message => "no callback set for message (first element $tag)";
+};
 
 sub port(;&) {
-   my $id = "$UNIQ." . $ID++;
+   my $id = $UNIQ . ++$ID;
    my $port = "$NODE#$id";
 
-   rcv $port, shift || \&_kilme;
+   rcv $port, shift || $KILME;
 
    $port
 }
@@ -331,7 +424,7 @@
 executing the callback. Runtime errors during callback execution will
 result in the port being C<kil>ed.
 
-The default callback received all messages not matched by a more specific
+The default callback receives all messages not matched by a more specific
 C<tag> match.
 
 =item rcv $local_port, tag => $callback->(@msg_without_tag), ...
@@ -362,7 +455,7 @@
    ;
 
 Example: temporarily register a rcv callback for a tag matching some port
-(e.g. for a rpc reply) and unregister it after a message was received.
+(e.g. for an rpc reply) and unregister it after a message was received.
 
    rcv $port, $otherport => sub {
       my @reply = @_;
@@ -385,7 +478,7 @@
             "AnyEvent::MP::Port" eq ref $self
                or Carp::croak "$port: rcv can only be called on message matching ports, caught";
 
-            $self->[2] = shift;
+            $self->[0] = shift;
          } else {
             my $cb = shift;
             $PORT{$portid} = sub {
@@ -395,7 +488,7 @@
          }
       } elsif (defined $_[0]) {
          my $self = $PORT_DATA{$portid} ||= do {
-            my $self = bless [$PORT{$port} || sub { }, { }, $port], "AnyEvent::MP::Port";
+            my $self = bless [$PORT{$portid} || sub { }, { }, $port], "AnyEvent::MP::Port";
 
             $PORT{$portid} = sub {
                local $SELF = $port;
@@ -427,12 +520,52 @@
    $port
 }
 
+=item peval $port, $coderef[, @args]
+
+Evaluates the given C<$coderef> within the context of C<$port>, that is,
+when the code throws an exception the C<$port> will be killed.
+
+Any remaining args will be passed to the callback. Any return values will
+be returned to the caller.
+
+This is useful when you temporarily want to execute code in the context of
+a port.
+
+Example: create a port and run some initialisation code in its context.
+
+   my $port = port { ... };
+
+   peval $port, sub {
+      init
+         or die "unable to init";
+   };
+
+=cut
+
+sub peval($$) {
+   local $SELF = shift;
+   my $cb = shift;
+
+   if (wantarray) {
+      my @res = eval { &$cb };
+      _self_die if $@;
+      @res
+   } else {
+      my $res = eval { &$cb };
+      _self_die if $@;
+      $res
+   }
+}
+
 =item $closure = psub { BLOCK }
 
 Remembers C<$SELF> and creates a closure out of the BLOCK. When the
 closure is executed, sets up the environment in the same way as in C<rcv>
 callbacks, i.e. runtime errors will cause the port to get C<kil>ed.
 
+The effect is basically as if it returned C<< sub { peval $SELF, sub {
+BLOCK }, @_ } >>.
+
 This is useful when you register callbacks from C<rcv> callbacks:
 
    rcv delayed_reply => sub {
@@ -514,7 +647,7 @@
 Inter-host-connection timeouts and monitoring depend on the transport
 used. The only transport currently implemented is TCP, and AnyEvent::MP
 relies on TCP to detect node-downs (this can take 10-15 minutes on a
-non-idle connection, and usually around two hours for idle conenctions).
+non-idle connection, and usually around two hours for idle connections).
 
 This means that monitoring is good for program errors and cleaning up
 stuff eventually, but they are no replacement for a timeout when you need
@@ -556,7 +689,7 @@
    $node->monitor ($port, $cb);
 
    defined wantarray
-      and AnyEvent::Util::guard { $node->unmonitor ($port, $cb) }
+      and ($cb += 0, Guard::guard { $node->unmonitor ($port, $cb) })
 }
 
 =item $guard = mon_guard $port, $ref, $ref...
@@ -589,12 +722,12 @@
 
 Kill the specified port with the given C<@reason>.
 
-If no C<@reason> is specified, then the port is killed "normally" (ports
-monitoring other ports will not necessarily die because a port dies
-"normally").
+If no C<@reason> is specified, then the port is killed "normally" -
+monitor callbacks will be invoked, but the kil will not cause linked ports
+(C<mon $mport, $lport> form) to get killed.
 
-Otherwise, linked ports get killed with the same reason (second form of
-C<mon>, see above).
+If a C<@reason> is specified, then linked ports (C<mon $mport, $lport>
+form) get killed with the same reason.
 
 Runtime errors while evaluating C<rcv> callbacks or inside C<psub> blocks
 will be reported as reason C<< die => $@ >>.
@@ -602,7 +735,17 @@
 Transport/communication errors are reported as C<< transport_error =>
 $message >>.
 
-=cut
+Common idioms:
+
+   # silently remove yourself, do not kill linked ports
+   kil $SELF;
+
+   # report a failure in some detail
+   kil $SELF, failure_mode_1 => "it failed with too high temperature";
+
+   # do not waste much time with killing, just die when something goes wrong
+   open my $fh, "<file"
+      or die "file: $!";
 
 =item $port = spawn $node, $initfunc[, @initdata]
 
@@ -670,7 +813,7 @@
 sub spawn(@) {
    my ($nodeid, undef) = split /#/, shift, 2;
 
-   my $id = "$RUNIQ." . $ID++;
+   my $id = $RUNIQ . ++$ID;
 
    $_[0] =~ /::/
       or Carp::croak "spawn init function must be a fully-qualified name, caught";
@@ -680,6 +823,7 @@
    "$nodeid#$id"
 }
 
+
 =item after $timeout, @msg
 
 =item after $timeout, $callback
@@ -704,6 +848,207 @@
    };
 }
 
+#=item $cb2 = timeout $seconds, $cb[, @args]
+
+=item cal $port, @msg, $callback[, $timeout]
+
+A simple form of RPC - sends a message to the given C<$port> with the
+given contents (C<@msg>), but adds a reply port to the message.
+
+The reply port is created temporarily just for the purpose of receiving
+the reply, and will be C<kil>ed when no longer needed.
+
+A reply message sent to the port is passed to the C<$callback> as-is.
+
+If an optional time-out (in seconds) is given and it is not C<undef>,
+then the callback will be called without any arguments after the time-out
+has elapsed and the port is C<kil>ed.
+
+If no time-out is given (or it is C<undef>), then the local port will
+monitor the remote port instead, so it eventually gets cleaned-up.
+
+Currently this function returns the temporary port, but this "feature"
+might go away in future versions unless you can make a convincing case that
+this is indeed useful for something.
+
+=cut
+
+sub cal(@) {
+   my $timeout = ref $_[-1] ? undef : pop;
+   my $cb = pop;
+
+   my $port = port {
+      undef $timeout;
+      kil $SELF;
+      &$cb;
+   };
+
+   if (defined $timeout) {
+      $timeout = AE::timer $timeout, 0, sub {
+         undef $timeout;
+         kil $port;
+         $cb->();
+      };
+   } else {
+      mon $_[0], sub {
+         kil $port;
+         $cb->();
+      };
+   }
+
+   push @_, $port;
+   &snd;
+
+   $port
+}
+
+=back
+
+=head1 DISTRIBUTED DATABASE
+
+AnyEvent::MP comes with a simple distributed database. The database will
+be mirrored asynchronously on all global nodes. Other nodes bind to one
+of the global nodes for their needs. Every node has a "local database"
+which contains all the values that are set locally. All local databases
+are merged together to form the global database, which can be queried.
+
+The database structure is that of a two-level hash - the database hash
+contains hashes which contain values, similarly to a perl hash of hashes,
+i.e.:
+
+  $DATABASE{$family}{$subkey} = $value
+
+The top level hash key is called "family", and the second-level hash key
+is called "subkey" or simply "key".
+
+The family must be alphanumeric, i.e. start with a letter and consist
+of letters, digits, underscores and colons (C<[A-Za-z][A-Za-z0-9_:]*>),
+pretty much like Perl module names.
+
+As the family namespace is global, it is recommended to prefix family names
+with the name of the application or module using it.
+
+The subkeys must be non-empty strings, with no further restrictions.
+
+The values should preferably be strings, but other perl scalars should
+work as well (such as C<undef>, arrays and hashes).
+
+Every database entry is owned by one node - adding the same family/subkey
+combination on multiple nodes will not cause discomfort for AnyEvent::MP,
+but the result might be nondeterministic, i.e. the key might have
+different values on different nodes.
+
+Different subkeys in the same family can be owned by different nodes
+without problems, and in fact, this is the common method to create worker
+pools. For example, a worker port for image scaling might do this:
+
+   db_set my_image_scalers => $port;
+
+And clients looking for an image scaler will want to get the
+C<my_image_scalers> keys from time to time:
+
+   db_keys my_image_scalers => sub {
+      @ports = @{ $_[0] };
+   };
+
+Or better yet, they want to monitor the database family, so they always
+have a reasonably up-to-date copy:
+
+   db_mon my_image_scalers => sub {
+      @ports = keys %{ $_[0] };
+   };
+
+In general, you can set or delete single subkeys, but query and monitor
+whole families only.
+
+If you feel the need to monitor or query a single subkey, try giving it
+its own family.
+
+=over
+
+=item db_set $family => $subkey [=> $value]
+
+Sets (or replaces) a key in the database - if C<$value> is omitted,
+C<undef> is used instead.
+
+=item db_del $family => $subkey...
+
+Deletes one or more subkeys from the database family.
+
+=item $guard = db_reg $family => $subkey [=> $value]
+
+Sets the key in the database and returns a guard. When the guard is
+destroyed, the key is deleted from the database. If C<$value> is missing,
+then C<undef> is used.
+
+=item db_family $family => $cb->(\%familyhash)
+
+Queries the named database C<$family> and calls the callback with the
+family represented as a hash. You can keep and freely modify the hash.
+
+=item db_keys $family => $cb->(\@keys)
+
+Same as C<db_family>, except it only queries the family I<subkeys> and passes
+them as array reference to the callback.
+
+=item db_values $family => $cb->(\@values)
+
+Same as C<db_family>, except it only queries the family I<values> and passes them
+as array reference to the callback.
+
+=item $guard = db_mon $family => $cb->($familyhash, \@added, \@changed, \@deleted)
+
+Creates a monitor on the given database family. Each time a key is set
+or is deleted the callback is called with a hash containing the
+database family and three lists of added, changed and deleted subkeys,
+respectively. If no keys have changed then the array reference might be
+C<undef> or even missing.
+
+If not called in void context, a guard object is returned that, when
+destroyed, stops the monitor.
+
+The family hash reference and the key arrays belong to AnyEvent::MP and
+B<must not be modified or stored> by the callback. When in doubt, make a
+copy.
+
+As soon as possible after the monitoring starts, the callback will be
+called with the initial contents of the family, even if it is empty,
+i.e. there will always be a timely call to the callback with the current
+contents.
+
+It is possible that the callback is called with a change event even though
+the subkey is already present and the value has not changed.
+
+The monitoring stops when the guard object is destroyed.
+
+Example: on every change to the family "mygroup", print out all keys.
+
+   my $guard = db_mon mygroup => sub {
+      my ($family, $a, $c, $d) = @_;
+      print "mygroup members: ", (join " ", keys %$family), "\n";
+   };
+
+Example: wait until the family "My::Module::workers" is non-empty.
+
+   my $guard; $guard = db_mon My::Module::workers => sub {
+      my ($family, $a, $c, $d) = @_;
+      return unless %$family;
+      undef $guard;
+      print "My::Module::workers now nonempty\n";
+   };
+
+Example: print all changes to the family "AnyEvent::Fantasy::Module".
+
+   my $guard = db_mon AnyEvent::Fantasy::Module => sub {
+      my ($family, $a, $c, $d) = @_;
+
+      print "+$_=$family->{$_}\n" for @$a;
+      print "*$_=$family->{$_}\n" for @$c;
+      print "-$_=$family->{$_}\n" for @$d;
+   };
+
+=cut
+
 =back
 
 =head1 AnyEvent::MP vs. Distributed Erlang
@@ -713,10 +1058,10 @@
 programming techniques employed by Erlang apply to AnyEvent::MP. Here is a
 sample:
 
-   http://www.Erlang.se/doc/programming_rules.shtml
-   http://Erlang.org/doc/getting_started/part_frame.html # chapters 3 and 4
-   http://Erlang.org/download/Erlang-book-part1.pdf      # chapters 5 and 6
-   http://Erlang.org/download/armstrong_thesis_2003.pdf  # chapters 4 and 5
+   http://www.erlang.se/doc/programming_rules.shtml
+   http://erlang.org/doc/getting_started/part_frame.html # chapters 3 and 4
+   http://erlang.org/download/erlang-book-part1.pdf      # chapters 5 and 6
+   http://erlang.org/download/armstrong_thesis_2003.pdf  # chapters 4 and 5
 
 Despite the similarities, there are also some important differences:
 
@@ -726,7 +1071,8 @@
 
 Erlang relies on special naming and DNS to work everywhere in the same
 way. AEMP relies on each node somehow knowing its own address(es) (e.g. by
-configuration or DNS), but will otherwise discover other odes itself.
+configuration or DNS), and possibly the addresses of some seed nodes, but
+will otherwise discover other nodes (and their IDs) itself.
 
 =item * Erlang has a "remote ports are like local ports" philosophy, AEMP
 uses "local ports are like remote ports".
@@ -745,38 +1091,49 @@
 
 =item * Erlang uses processes and a mailbox, AEMP does not queue.
 
-Erlang uses processes that selectively receive messages, and therefore
-needs a queue. AEMP is event based, queuing messages would serve no
-useful purpose. For the same reason the pattern-matching abilities of
-AnyEvent::MP are more limited, as there is little need to be able to
+Erlang uses processes that selectively receive messages out of order, and
+therefore needs a queue. AEMP is event based, queuing messages would serve
+no useful purpose. For the same reason the pattern-matching abilities
+of AnyEvent::MP are more limited, as there is little need to be able to
 filter messages without dequeuing them.
 
-(But see L<Coro::MP> for a more Erlang-like process model on top of AEMP).
+This is not a philosophical difference, but simply stems from AnyEvent::MP
+being event-based, while Erlang is process-based.
+
+You can have a look at L<Coro::MP> for a more Erlang-like process model on
+top of AEMP and Coro threads.
 
 =item * Erlang sends are synchronous, AEMP sends are asynchronous.
 
-Sending messages in Erlang is synchronous and blocks the process (and
-so does not need a queue that can overflow). AEMP sends are immediate,
-connection establishment is handled in the background.
+Sending messages in Erlang is synchronous and blocks the process until
+a connection has been established and the message sent (and so does not
+need a queue that can overflow). AEMP sends return immediately, connection
+establishment is handled in the background.
 
 =item * Erlang suffers from silent message loss, AEMP does not.
 
-Erlang makes few guarantees on messages delivery - messages can get lost
-without any of the processes realising it (i.e. you send messages a, b,
-and c, and the other side only receives messages a and c).
-
-AEMP guarantees correct ordering, and the guarantee that after one message
-is lost, all following ones sent to the same port are lost as well, until
-monitoring raises an error, so there are no silent "holes" in the message
-sequence.
+Erlang implements few guarantees on message delivery - messages can get
+lost without any of the processes realising it (i.e. you send messages a,
+b, and c, and the other side only receives messages a and c).
+
+AEMP guarantees (modulo hardware errors) correct ordering, and also
+guarantees that after one message is lost, all following ones sent to the
+same port are lost as well, until monitoring raises an error, so there are
+no silent "holes" in the message sequence.
+
+If you want your software to be very reliable, you have to cope with
+corrupted and even out-of-order messages in both Erlang and AEMP. AEMP
+simply tries to work better in common error cases, such as when a network
+link goes down.
 
 =item * Erlang can send messages to the wrong port, AEMP does not.
 
-In Erlang it is quite likely that a node that restarts reuses a process ID
-known to other nodes for a completely different process, causing messages
-destined for that process to end up in an unrelated process.
+In Erlang it is quite likely that a node that restarts reuses an Erlang
+process ID known to other nodes for a completely different process,
+causing messages destined for that process to end up in an unrelated
+process.
 
-AEMP never reuses port IDs, so old messages or old port IDs floating
+AEMP does not reuse port IDs, so old messages or old port IDs floating
 around in the network will not be sent to an unrelated port.
 
 =item * Erlang uses unprotected connections, AEMP uses secure
@@ -789,7 +1146,7 @@
 communications.
 
 The AEMP protocol, unlike the Erlang protocol, supports both programming
-language independent text-only protocols (good for debugging) and binary,
+language independent text-only protocols (good for debugging), and binary,
 language-specific serialisers (e.g. Storable). By default, unless TLS is
 used, the protocol is actually completely text-based.
 
@@ -799,11 +1156,12 @@
 
 =item * AEMP has more flexible monitoring options than Erlang.
 
-In Erlang, you can chose to receive I<all> exit signals as messages
-or I<none>, there is no in-between, so monitoring single processes is
-difficult to implement. Monitoring in AEMP is more flexible than in
-Erlang, as one can choose between automatic kill, exit message or callback
-on a per-process basis.
+In Erlang, you can choose to receive I<all> exit signals as messages or
+I<none>, there is no in-between, so monitoring single Erlang processes is
+difficult to implement.
+
+Monitoring in AEMP is more flexible than in Erlang, as one can choose
+between automatic kill, exit message or callback on a per-port basis.
 
 =item * Erlang tries to hide remote/local connections, AEMP does not.
 
@@ -835,8 +1193,8 @@
 Strings can easily be printed, easily serialised etc. and need no special
 procedures to be "valid".
 
-And as a result, a miniport consists of a single closure stored in a
-global hash - it can't become much cheaper.
+And as a result, a port with just a default receiver consists of a single
+code reference stored in a global hash - it can't become much cheaper.
 
 =item Why favour JSON, why not a real serialising format such as Storable?
 
@@ -862,9 +1220,11 @@
 
 L<AnyEvent::MP::Kernel> - more, lower-level, stuff.
 
-L<AnyEvent::MP::Global> - network maintainance and port groups, to find
+L<AnyEvent::MP::Global> - network maintenance and port groups, to find
 your applications.
 
+L<AnyEvent::MP::DataConn> - establish data connections between nodes.
+
 L<AnyEvent::MP::LogCatcher> - simple service to display log messages from
 all nodes.