[ViewVC] Diff of: cvs/AnyEvent-MP/MP.pm

Comparing AnyEvent-MP/MP.pm (file contents):
Revision 1.115 by root, Fri May 7 18:14:21 2010 UTC vs.
Revision 1.140 by root, Thu Mar 22 23:47:01 2012 UTC

…		…
30	rcv $port, pong => sub { warn "pong received\n" };	30	rcv $port, pong => sub { warn "pong received\n" };
31		31
32	# create a port on another node	32	# create a port on another node
33	my $port = spawn $node, $initfunc, @initdata;	33	my $port = spawn $node, $initfunc, @initdata;
34		34
35	# destroy a prot again	35	# destroy a port again
36	kil $port; # "normal" kill	36	kil $port; # "normal" kill
37	kil $port, my_error => "everything is broken"; # error kill	37	kil $port, my_error => "everything is broken"; # error kill
38		38
39	# monitoring	39	# monitoring
40	mon $localport, $cb->(@msg) # callback is invoked on death	40	mon $port, $cb->(@msg) # callback is invoked on death
41	mon $localport, $otherport # kill otherport on abnormal death	41	mon $port, $localport # kill localport on abnormal death
42	mon $localport, $otherport, @msg # send message on death	42	mon $port, $localport, @msg # send message on death
43		43
44	# temporarily execute code in port context	44	# temporarily execute code in port context
45	peval $port, sub { die "kill the port!" };	45	peval $port, sub { die "kill the port!" };
46		46
47	# execute callbacks in $SELF port context	47	# execute callbacks in $SELF port context
48	my $timer = AE::timer 1, 0, psub {	48	my $timer = AE::timer 1, 0, psub {
49	die "kill the port, delayed";	49	die "kill the port, delayed";
50	};	50	};
51		51
52	=head1 CURRENT STATUS
53
54	bin/aemp - stable.
55	AnyEvent::MP - stable API, should work.
56	AnyEvent::MP::Intro - explains most concepts.
57	AnyEvent::MP::Kernel - mostly stable API.
58	AnyEvent::MP::Global - stable API.
59
60	=head1 DESCRIPTION	52	=head1 DESCRIPTION
61		53
62	This module (-family) implements a simple message passing framework.	54	This module (-family) implements a simple message passing framework.
63		55
64	Despite its simplicity, you can securely message other processes running	56	Despite its simplicity, you can securely message other processes running
…		…
78		70
79	Ports allow you to register C<rcv> handlers that can match all or just	71	Ports allow you to register C<rcv> handlers that can match all or just
80	some messages. Messages sent to ports will not be queued, regardless of	72	some messages. Messages sent to ports will not be queued, regardless of
81	whether anything was listening for them or not.	73	whether anything was listening for them or not.
82		74
		75	Ports are represented by (printable) strings called "port IDs".
		76
83	=item port ID - C<nodeid#portname>	77	=item port ID - C<nodeid#portname>
84		78
85	A port ID is the concatenation of a node ID, a hash-mark (C<#>) as	79	A port ID is the concatenation of a node ID, a hash-mark (C<#>)
86	separator, and a port name (a printable string of unspecified format).	80	as separator, and a port name (a printable string of unspecified
		81	format created by AnyEvent::MP).
87		82
88	=item node	83	=item node
89		84
90	A node is a single process containing at least one port - the node port,	85	A node is a single process containing at least one port - the node port,
91	which enables nodes to manage each other remotely, and to create new	86	which enables nodes to manage each other remotely, and to create new
92	ports.	87	ports.
93		88
94	Nodes are either public (have one or more listening ports) or private	89	Nodes are either public (have one or more listening ports) or private
95	(no listening ports). Private nodes cannot talk to other private nodes	90	(no listening ports). Private nodes cannot talk to other private nodes
96	currently.	91	currently, but all nodes can talk to public nodes.
97		92
		93	Nodes are represented by (printable) strings called "node IDs".
		94
98	=item node ID - C<[A-Z_][a-zA-Z0-9_\-.:]*>	95	=item node ID - C<[A-Za-z0-9_\-.:]*>
99		96
100	A node ID is a string that uniquely identifies the node within a	97	A node ID is a string that uniquely identifies the node within a
101	network. Depending on the configuration used, node IDs can look like a	98	network. Depending on the configuration used, node IDs can look like a
102	hostname, a hostname and a port, or a random string. AnyEvent::MP itself	99	hostname, a hostname and a port, or a random string. AnyEvent::MP itself
103	doesn't interpret node IDs in any way.	100	doesn't interpret node IDs in any way except to uniquely identify a node.
104		101
105	=item binds - C<ip:port>	102	=item binds - C<ip:port>
106		103
107	Nodes can only talk to each other by creating some kind of connection to	104	Nodes can only talk to each other by creating some kind of connection to
108	each other. To do this, nodes should listen on one or more local transport	105	each other. To do this, nodes should listen on one or more local transport
		106	endpoints - binds.
		107
109	endpoints - binds. Currently, only standard C<ip:port> specifications can	108	Currently, only standard C<ip:port> specifications can be used, which
110	be used, which specify TCP ports to listen on.	109	specify TCP ports to listen on. So a bind is basically just a tcp socket
		110	in listening mode that accepts connections from other nodes.
111		111
112	=item seed nodes	112	=item seed nodes
113		113
114	When a node starts, it knows nothing about the network. To teach the node	114	When a node starts, it knows nothing about the network it is in - it
115	about the network it first has to contact some other node within the	115	needs to connect to at least one other node that is already in the
116	network. This node is called a seed.	116	network. These other nodes are called "seed nodes".
117		117
118	Apart from the fact that other nodes know them as seed nodes and they have	118	Seed nodes themselves are not special - they are seed nodes only because
119	to have fixed listening addresses, seed nodes are perfectly normal nodes -	119	some other node I<uses> them as such, but any node can be used as seed
120	any node can function as a seed node for others.	120	node for other nodes, and each node can use a different set of seed nodes.
121		121
122	In addition to discovering the network, seed nodes are also used to	122	In addition to discovering the network, seed nodes are also used to
123	maintain the network and to connect nodes that otherwise would have	123	maintain the network - all nodes using the same seed node are part of
124	trouble connecting. They form the backbone of an AnyEvent::MP network.	124	the same network. If a network is split into multiple subnets because e.g.
		125	the network link between the parts goes down, then using the same seed
		126	nodes for all nodes ensures that eventually the subnets get merged again.
125		127
126	Seed nodes are expected to be long-running, and at least one seed node	128	Seed nodes are expected to be long-running, and at least one seed node
127	should always be available. They should also be relatively responsive - a	129	should always be available. They should also be relatively responsive - a
128	seed node that blocks for long periods will slow down everybody else.	130	seed node that blocks for long periods will slow down everybody else.
129		131
		132	For small networks, it's best if every node uses the same set of seed
		133	nodes. For large networks, it can be useful to specify "regional" seed
		134	nodes for most nodes in an area, and use all seed nodes as seed nodes for
		135	each other. What's important is that all seed node connections form a
		136	complete graph, so that the network cannot split into separate subnets
		137	forever.
		138
		139	Seed nodes are represented by seed IDs.
		140
130	=item seeds - C<host:port>	141	=item seed IDs - C<host:port>
131		142
132	Seeds are transport endpoint(s) (usually a hostname/IP address and a	143	Seed IDs are transport endpoint(s) (usually a hostname/IP address and a
133	TCP port) of nodes that should be used as seed nodes.	144	TCP port) of nodes that should be used as seed nodes.
134		145
135	The nodes listening on those endpoints are expected to be long-running,	146	=item global nodes
136	and at least one of those should always be available. When nodes run out	147
137	of connections (e.g. due to a network error), they try to re-establish	148	An AEMP network needs a discovery service - nodes need to know how to
138	connections to some seednodes again to join the network.	149	connect to other nodes they only know by name. In addition, AEMP offers a
		150	distributed "group database", which maps group names to a list of strings
		151	- for example, to register worker ports.
		152
		153	A network needs at least one global node to work, and allows every node to
		154	be a global node.
		155
		156	Any node that loads the L<AnyEvent::MP::Global> module becomes a global
		157	node and tries to keep connections to all other nodes. So while it can
		158	make sense to make every node "global" in small networks, it usually makes
		159	sense to only make seed nodes into global nodes in large networks (nodes
		160	keep connections to seed nodes and global nodes, so making them the same
		161	reduces overhead).
139		162
140	=back	163	=back
141		164
142	=head1 VARIABLES/FUNCTIONS	165	=head1 VARIABLES/FUNCTIONS
143		166
…		…
145		168
146	=cut	169	=cut
147		170
148	package AnyEvent::MP;	171	package AnyEvent::MP;
149		172
		173	use AnyEvent::MP::Config ();
150	use AnyEvent::MP::Kernel;	174	use AnyEvent::MP::Kernel;
		175	use AnyEvent::MP::Kernel qw(%NODE %PORT %PORT_DATA $UNIQ $RUNIQ $ID);
151		176
152	use common::sense;	177	use common::sense;
153		178
154	use Carp ();	179	use Carp ();
155		180
156	use AE ();	181	use AE ();
		182	use Guard ();
157		183
158	use base "Exporter";	184	use base "Exporter";
159		185
160	our $VERSION = 1.29;	186	our $VERSION = $AnyEvent::MP::Config::VERSION;
161		187
162	our @EXPORT = qw(	188	our @EXPORT = qw(
163	NODE $NODE *SELF node_of after	189	NODE $NODE *SELF node_of after
164	configure	190	configure
165	snd rcv mon mon_guard kil psub peval spawn cal	191	snd rcv mon mon_guard kil psub peval spawn cal
166	port	192	port
		193	db_set db_del db_reg
		194	db_mon db_family db_keys db_values
167	);	195	);
168		196
169	our $SELF;	197	our $SELF;
170		198
171	sub _self_die() {	199	sub _self_die() {
…		…
191	Before a node can talk to other nodes on the network (i.e. enter	219	Before a node can talk to other nodes on the network (i.e. enter
192	"distributed mode") it has to configure itself - the minimum a node needs	220	"distributed mode") it has to configure itself - the minimum a node needs
193	to know is its own name, and optionally it should know the addresses of	221	to know is its own name, and optionally it should know the addresses of
194	some other nodes in the network to discover other nodes.	222	some other nodes in the network to discover other nodes.
195		223
196	The key/value pairs are basically the same ones as documented for the
197	F<aemp> command line utility (sans the set/del prefix).
198
199	This function configures a node - it must be called exactly once (or	224	This function configures a node - it must be called exactly once (or
200	never) before calling other AnyEvent::MP functions.	225	never) before calling other AnyEvent::MP functions.
		226
		227	The key/value pairs are basically the same ones as documented for the
		228	F<aemp> command line utility (sans the set/del prefix), with these additions:
		229
		230	=over 4
		231
		232	=item norc => $boolean (default false)
		233
		234	If true, then the rc file (e.g. F<~/.perl-anyevent-mp>) will I<not>
		235	be consulted - all configuration options must be specified in the
		236	C<configure> call.
		237
		238	=item force => $boolean (default false)
		239
		240	If true, then the values specified in the C<configure> will take
		241	precedence over any values configured via the rc file. The default is for
		242	the rc file to override any options specified in the program.
		243
		244	=item secure => $pass->(@msg)
		245
		246	In addition to specifying a boolean, you can specify a code reference that
		247	is called for every code execution attempt - the execution request is
		248	granted iff the callback returns a true value.
		249
		250	Most of the time the callback should look only at
		251	C<$AnyEvent::MP::Kernel::SRCNODE> to make a decision, and not at the
		252	actual message (which can be about anything, and is mostly provided for
		253	diagnostic purposes).
		254
		255	See F<aemp setsecure> for more info.
		256
		257	=back
201		258
202	=over 4	259	=over 4
203		260
204	=item step 1, gathering configuration from profiles	261	=item step 1, gathering configuration from profiles
205		262
…		…
219	That means that the values specified in the profile have highest priority	276	That means that the values specified in the profile have highest priority
220	and the values specified directly via C<configure> have lowest priority,	277	and the values specified directly via C<configure> have lowest priority,
221	and can only be used to specify defaults.	278	and can only be used to specify defaults.
222		279
223	If the profile specifies a node ID, then this will become the node ID of	280	If the profile specifies a node ID, then this will become the node ID of
224	this process. If not, then the profile name will be used as node ID. The	281	this process. If not, then the profile name will be used as node ID, with
225	special node ID of C<anon/> will be replaced by a random node ID.	282	a unique random string (C</%u>) appended.
		283
		284	The node ID can contain some C<%> sequences that are expanded: C<%n>
		285	is expanded to the local nodename, C<%u> is replaced by a random
		286	string to make the node unique. For example, the F<aemp> commandline
		287	utility uses C<aemp/%n/%u> as nodename, which might expand to
		288	C<aemp/cerebro/ZQDGSIkRhEZQDGSIkRhE>.
226		289
227	=item step 2, bind listener sockets	290	=item step 2, bind listener sockets
228		291
229	The next step is to look up the binds in the profile, followed by binding	292	The next step is to look up the binds in the profile, followed by binding
230	aemp protocol listeners on all binds specified (it is possible and valid	293	aemp protocol listeners on all binds specified (it is possible and valid
…		…
236	used, meaning the node will bind on a dynamically-assigned port on every	299	used, meaning the node will bind on a dynamically-assigned port on every
237	local IP address it finds.	300	local IP address it finds.
238		301
239	=item step 3, connect to seed nodes	302	=item step 3, connect to seed nodes
240		303
241	As the last step, the seeds list from the profile is passed to the	304	As the last step, the seed ID list from the profile is passed to the
242	L<AnyEvent::MP::Global> module, which will then use it to keep	305	L<AnyEvent::MP::Global> module, which will then use it to keep
243	connectivity with at least one node at any point in time.	306	connectivity with at least one node at any point in time.
244		307
245	=back	308	=back
246		309
247	Example: become a distributed node using the local node name as profile.	310	Example: become a distributed node using the local node name as profile.
248	This should be the most common form of invocation for "daemon"-type nodes.	311	This should be the most common form of invocation for "daemon"-type nodes.
249		312
250	configure	313	configure
251		314
252	Example: become an anonymous node. This form is often used for commandline	315	Example: become a semi-anonymous node. This form is often used for
253	clients.	316	commandline clients.
254		317
255	configure nodeid => "anon/";	318	configure nodeid => "myscript/%n/%u";
256		319
257	Example: configure a node using a profile called seed, which si suitable	320	Example: configure a node using a profile called seed, which is suitable
258	for a seed node as it binds on all local addresses on a fixed port (4040,	321	for a seed node as it binds on all local addresses on a fixed port (4040,
259	customary for aemp).	322	customary for aemp).
260		323
261	# use the aemp commandline utility	324	# use the aemp commandline utility
262	# aemp profile seed nodeid anon/ binds '*:4040'	325	# aemp profile seed binds '*:4040'
263		326
264	# then use it	327	# then use it
265	configure profile => "seed";	328	configure profile => "seed";
266		329
267	# or simply use aemp from the shell again:	330	# or simply use aemp from the shell again:
…		…
332		395
333	=cut	396	=cut
334		397
335	sub rcv($@);	398	sub rcv($@);
336		399
337	sub _kilme {	400	my $KILME = sub {
338	die "received message on port without callback";	401	(my $tag = substr $_[0], 0, 30) =~ s/[^\x20-\x7e]/./g;
339	}	402	kil $SELF, unhandled_message => "no callback found for message '$tag'";
		403	};
340		404
341	sub port(;&) {	405	sub port(;&) {
342	my $id = "$UNIQ." . $ID++;	406	my $id = $UNIQ . ++$ID;
343	my $port = "$NODE#$id";	407	my $port = "$NODE#$id";
344		408
345	rcv $port, shift \|\| \&_kilme;	409	rcv $port, shift \|\| $KILME;
346		410
347	$port	411	$port
348	}	412	}
349		413
350	=item rcv $local_port, $callback->(@msg)	414	=item rcv $local_port, $callback->(@msg)
…		…
355		419
356	The global C<$SELF> (exported by this module) contains C<$port> while	420	The global C<$SELF> (exported by this module) contains C<$port> while
357	executing the callback. Runtime errors during callback execution will	421	executing the callback. Runtime errors during callback execution will
358	result in the port being C<kil>ed.	422	result in the port being C<kil>ed.
359		423
360	The default callback received all messages not matched by a more specific	424	The default callback receives all messages not matched by a more specific
361	C<tag> match.	425	C<tag> match.
362		426
363	=item rcv $local_port, tag => $callback->(@msg_without_tag), ...	427	=item rcv $local_port, tag => $callback->(@msg_without_tag), ...
364		428
365	Register (or replace) callbacks to be called on messages starting with the	429	Register (or replace) callbacks to be called on messages starting with the
…		…
529	$res	593	$res
530	}	594	}
531	}	595	}
532	}	596	}
533		597
		598	=item $guard = mon $port, $rcvport # kill $rcvport when $port dies
		599
		600	=item $guard = mon $port # kill $SELF when $port dies
		601
534	=item $guard = mon $port, $cb->(@reason) # call $cb when $port dies	602	=item $guard = mon $port, $cb->(@reason) # call $cb when $port dies
535
536	=item $guard = mon $port, $rcvport # kill $rcvport when $port dies
537
538	=item $guard = mon $port # kill $SELF when $port dies
539		603
540	=item $guard = mon $port, $rcvport, @msg # send a message when $port dies	604	=item $guard = mon $port, $rcvport, @msg # send a message when $port dies
541		605
542	Monitor the given port and do something when the port is killed or	606	Monitor the given port and do something when the port is killed or
543	messages to it were lost, and optionally return a guard that can be used	607	messages to it were lost, and optionally return a guard that can be used
544	to stop monitoring again.	608	to stop monitoring again.
545		609
		610	The first two forms distinguish between "normal" and "abnormal" kil's:
		611
		612	In the first form (another port given), if the C<$port> is C<kil>'ed with
		613	a non-empty reason, the other port (C<$rcvport>) will be kil'ed with the
		614	same reason. That is, on "normal" kil's nothing happens, while under all
		615	other conditions, the other port is killed with the same reason.
		616
		617	The second form (kill self) is the same as the first form, except that
		618	C<$rcvport> defaults to C<$SELF>.
		619
		620	The remaining forms don't distinguish between "normal" and "abnormal" kil's
		621	- it's up to the callback or receiver to check whether the C<@reason> is
		622	empty and act accordingly.
		623
546	In the first form (callback), the callback is simply called with any	624	In the third form (callback), the callback is simply called with any
547	number of C<@reason> elements (no @reason means that the port was deleted	625	number of C<@reason> elements (empty @reason means that the port was deleted
548	"normally"). Note also that I<< the callback B<must> never die >>, so use	626	"normally"). Note also that I<< the callback B<must> never die >>, so use
549	C<eval> if unsure.	627	C<eval> if unsure.
550		628
551	In the second form (another port given), the other port (C<$rcvport>)
552	will be C<kil>'ed with C<@reason>, if a @reason was specified, i.e. on
553	"normal" kils nothing happens, while under all other conditions, the other
554	port is killed with the same reason.
555
556	The third form (kill self) is the same as the second form, except that
557	C<$rvport> defaults to C<$SELF>.
558
559	In the last form (message), a message of the form C<@msg, @reason> will be	629	In the last form (message), a message of the form C<$rcvport, @msg,
560	C<snd>.	630	@reason> will be C<snd>.
561		631
562	Monitoring-actions are one-shot: once messages are lost (and a monitoring	632	Monitoring-actions are one-shot: once messages are lost (and a monitoring
563	alert was raised), they are removed and will not trigger again.	633	alert was raised), they are removed and will not trigger again, even if it
		634	turns out that the port is still alive.
564		635
565	As a rule of thumb, monitoring requests should always monitor a port from	636	As a rule of thumb, monitoring requests should always monitor a remote
566	a local port (or callback). The reason is that kill messages might get	637	port locally (using a local C<$rcvport> or a callback). The reason is that
567	lost, just like any other message. Another less obvious reason is that	638	kill messages might get lost, just like any other message. Another less
568	even monitoring requests can get lost (for example, when the connection	639	obvious reason is that even monitoring requests can get lost (for example,
569	to the other node goes down permanently). When monitoring a port locally	640	when the connection to the other node goes down permanently). When
570	these problems do not exist.	641	monitoring a port locally these problems do not exist.
571		642
572	C<mon> effectively guarantees that, in the absence of hardware failures,	643	C<mon> effectively guarantees that, in the absence of hardware failures,
573	after starting the monitor, either all messages sent to the port will	644	after starting the monitor, either all messages sent to the port will
574	arrive, or the monitoring action will be invoked after possible message	645	arrive, or the monitoring action will be invoked after possible message
575	loss has been detected. No messages will be lost "in between" (after	646	loss has been detected. No messages will be lost "in between" (after
…		…
620	}	691	}
621		692
622	$node->monitor ($port, $cb);	693	$node->monitor ($port, $cb);
623		694
624	defined wantarray	695	defined wantarray
625	and ($cb += 0, AnyEvent::Util::guard { $node->unmonitor ($port, $cb) })	696	and ($cb += 0, Guard::guard { $node->unmonitor ($port, $cb) })
626	}	697	}
627		698
628	=item $guard = mon_guard $port, $ref, $ref...	699	=item $guard = mon_guard $port, $ref, $ref...
629		700
630	Monitors the given C<$port> and keeps the passed references. When the port	701	Monitors the given C<$port> and keeps the passed references. When the port
…		…
666	will be reported as reason C<< die => $@ >>.	737	will be reported as reason C<< die => $@ >>.
667		738
668	Transport/communication errors are reported as C<< transport_error =>	739	Transport/communication errors are reported as C<< transport_error =>
669	$message >>.	740	$message >>.
670		741
671	=cut	742	Common idioms:
		743
		744	# silently remove yourself, do not kill linked ports
		745	kil $SELF;
		746
		747	# report a failure in some detail
		748	kil $SELF, failure_mode_1 => "it failed with too high temperature";
		749
		750	# do not waste much time with killing, just die when something goes wrong
		751	open my $fh, "<file"
		752	or die "file: $!";
672		753
673	=item $port = spawn $node, $initfunc[, @initdata]	754	=item $port = spawn $node, $initfunc[, @initdata]
674		755
675	Creates a port on the node C<$node> (which can also be a port ID, in which	756	Creates a port on the node C<$node> (which can also be a port ID, in which
676	case it's the node where that port resides).	757	case it's the node where that port resides).
…		…
734	}	815	}
735		816
736	sub spawn(@) {	817	sub spawn(@) {
737	my ($nodeid, undef) = split /#/, shift, 2;	818	my ($nodeid, undef) = split /#/, shift, 2;
738		819
739	my $id = "$RUNIQ." . $ID++;	820	my $id = $RUNIQ . ++$ID;
740		821
741	$_[0] =~ /::/	822	$_[0] =~ /::/
742	or Carp::croak "spawn init function must be a fully-qualified name, caught";	823	or Carp::croak "spawn init function must be a fully-qualified name, caught";
743		824
744	snd_to_func $nodeid, "AnyEvent::MP::_spawn" => $id, @_;	825	snd_to_func $nodeid, "AnyEvent::MP::_spawn" => $id, @_;
745		826
746	"$nodeid#$id"	827	"$nodeid#$id"
747	}	828	}
		829
748		830
749	=item after $timeout, @msg	831	=item after $timeout, @msg
750		832
751	=item after $timeout, $callback	833	=item after $timeout, $callback
752		834
…		…
767	ref $action[0]	849	ref $action[0]
768	? $action[0]()	850	? $action[0]()
769	: snd @action;	851	: snd @action;
770	};	852	};
771	}	853	}
		854
		855	#=item $cb2 = timeout $seconds, $cb[, @args]
772		856
773	=item cal $port, @msg, $callback[, $timeout]	857	=item cal $port, @msg, $callback[, $timeout]
774		858
775	A simple form of RPC - sends a message to the given C<$port> with the	859	A simple form of RPC - sends a message to the given C<$port> with the
776	given contents (C<@msg>), but adds a reply port to the message.	860	given contents (C<@msg>), but adds a reply port to the message.
…		…
822	$port	906	$port
823	}	907	}
824		908
825	=back	909	=back
826		910
		911	=head1 DISTRIBUTED DATABASE
		912
		913	AnyEvent::MP comes with a simple distributed database. The database will
		914	be mirrored asynchronously on all global nodes. Other nodes bind to one
		915	of the global nodes for their needs. Every node has a "local database"
		916	which contains all the values that are set locally. All local databases
		917	are merged together to form the global database, which can be queried.
		918
		919	The database structure is that of a two-level hash - the database hash
		920	contains hashes which contain values, similarly to a perl hash of hashes,
		921	i.e.:
		922
		923	$DATABASE{$family}{$subkey} = $value
		924
		925	The top level hash key is called "family", and the second-level hash key
		926	is called "subkey" or simply "key".
		927
		928	The family must be alphanumeric, i.e. start with a letter and consist
		929	of letters, digits, underscores and colons (C<[A-Za-z][A-Za-z0-9_:]*>),
		930	pretty much like Perl module names.
		931
		932	As the family namespace is global, it is recommended to prefix family names
		933	with the name of the application or module using it.
		934
		935	The subkeys must be non-empty strings, with no further restrictions.
		936
		937	The values should preferably be strings, but other perl scalars should
		938	work as well (such as C<undef>, arrays and hashes).
		939
		940	Every database entry is owned by one node - adding the same family/subkey
		941	combination on multiple nodes will not cause discomfort for AnyEvent::MP,
		942	but the result might be nondeterministic, i.e. the key might have
		943	different values on different nodes.
		944
		945	Different subkeys in the same family can be owned by different nodes
		946	without problems, and in fact, this is the common method to create worker
		947	pools. For example, a worker port for image scaling might do this:
		948
		949	db_set my_image_scalers => $port;
		950
		951	And clients looking for an image scaler will want to get the
		952	C<my_image_scalers> keys from time to time:
		953
		954	db_keys my_image_scalers => sub {
		955	@ports = @{ $_[0] };
		956	};
		957
		958	Or better yet, they want to monitor the database family, so they always
		959	have a reasonable up-to-date copy:
		960
		961	db_mon my_image_scalers => sub {
		962	@ports = keys %{ $_[0] };
		963	};
		964
		965	In general, you can set or delete single subkeys, but query and monitor
		966	whole families only.
		967
		968	If you feel the need to monitor or query a single subkey, try giving it
		969	its own family.
		970
		971	=over
		972
		973	=item $guard = db_set $family => $subkey [=> $value]
		974
		975	Sets (or replaces) a key in the database - if C<$value> is omitted,
		976	C<undef> is used instead.
		977
		978	When called in non-void context, C<db_set> returns a guard that
		979	automatically calls C<db_del> when it is destroyed.
		980
		981	=item db_del $family => $subkey...
		982
		983	Deletes one or more subkeys from the database family.
		984
		985	=item $guard = db_reg $family => $port => $value
		986
		987	=item $guard = db_reg $family => $port
		988
		989	=item $guard = db_reg $family
		990
		991	Registers a port in the given family and optionally returns a guard to
		992	remove it.
		993
		994	This function basically does the same as:
		995
		996	db_set $family => $port => $value
		997
		998	Except that the port is monitored and automatically removed from the
		999	database family when it is kil'ed.
		1000
		1001	If C<$value> is missing, C<undef> is used. If C<$port> is missing, then
		1002	C<$SELF> is used.
		1003
		1004	This function is most useful to register a port in some port group (which
		1005	is just another name for a database family), and have it removed when the
		1006	port is gone. This works best when the port is a local port.
		1007
		1008	=cut
		1009
		1010	sub db_reg($;$$) {
		1011	my $family = shift;
		1012	my $port = @_ ? shift : $SELF;
		1013
		1014	my $clr = sub { db_del $family => $port };
		1015	mon $port, $clr;
		1016
		1017	db_set $family => $port => $_[0];
		1018
		1019	defined wantarray
		1020	and &Guard::guard ($clr)
		1021	}
		1022
		1023	=item db_family $family => $cb->(\%familyhash)
		1024
		1025	Queries the named database C<$family> and calls the callback with the
		1026	family represented as a hash. You can keep and freely modify the hash.
		1027
		1028	=item db_keys $family => $cb->(\@keys)
		1029
		1030	Same as C<db_family>, except it only queries the family I<subkeys> and passes
		1031	them as array reference to the callback.
		1032
		1033	=item db_values $family => $cb->(\@values)
		1034
		1035	Same as C<db_family>, except it only queries the family I<values> and passes them
		1036	as array reference to the callback.
		1037
		1038	=item $guard = db_mon $family => $cb->($familyhash, \@added, \@changed, \@deleted)
		1039
		1040	Creates a monitor on the given database family. Each time a key is set
		1041	or is deleted the callback is called with a hash containing the
		1042	database family and three lists of added, changed and deleted subkeys,
		1043	respectively. If no keys have changed then the array reference might be
		1044	C<undef> or even missing.
		1045
		1046	If not called in void context, a guard object is returned that, when
		1047	destroyed, stops the monitor.
		1048
		1049	The family hash reference and the key arrays belong to AnyEvent::MP and
		1050	B<must not be modified or stored> by the callback. When in doubt, make a
		1051	copy.
		1052
		1053	As soon as possible after the monitoring starts, the callback will be
		1054	called with the initial contents of the family, even if it is empty,
		1055	i.e. there will always be a timely call to the callback with the current
		1056	contents.
		1057
		1058	It is possible that the callback is called with a change event even though
		1059	the subkey is already present and the value has not changed.
		1060
		1061	The monitoring stops when the guard object is destroyed.
		1062
		1063	Example: on every change to the family "mygroup", print out all keys.
		1064
		1065	my $guard = db_mon mygroup => sub {
		1066	my ($family, $a, $c, $d) = @_;
		1067	print "mygroup members: ", (join " ", keys %$family), "\n";
		1068	};
		1069
		1070	Example: wait until the family "My::Module::workers" is non-empty.
		1071
		1072	my $guard; $guard = db_mon My::Module::workers => sub {
		1073	my ($family, $a, $c, $d) = @_;
		1074	return unless %$family;
		1075	undef $guard;
		1076	print "My::Module::workers now nonempty\n";
		1077	};
		1078
		1079	Example: print all changes to the family "AnyRvent::Fantasy::Module".
		1080
		1081	my $guard = db_mon AnyRvent::Fantasy::Module => sub {
		1082	my ($family, $a, $c, $d) = @_;
		1083
		1084	print "+$_=$family->{$_}\n" for @$a;
		1085	print "*$_=$family->{$_}\n" for @$c;
		1086	print "-$_\n" for @$d; # deleted subkeys are no longer in %$family
		1087	};
		1088
		1089	=cut
		1090
		1091	=back
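
The C<db_mon> rule above (the family hash and key arrays must not be modified or stored, so copy them when in doubt) can be sketched in plain Perl. This is only an illustration: C<snapshot_db_mon_args> is a hypothetical helper, not part of AnyEvent::MP, and the callback arguments are simulated rather than coming from a real C<db_mon> call. It assumes that a shallow copy is enough, so values that are themselves references would still be shared.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::MP): copies the arguments a
# db_mon callback receives so they can be kept after the callback returns.
sub snapshot_db_mon_args {
    my ($family, $added, $changed, $deleted) = @_;

    return (
        +{ %$family },            # shallow copy of the family hash
        [ @{ $added   || [] } ],  # key lists may be undef or missing
        [ @{ $changed || [] } ],
        [ @{ $deleted || [] } ],
    );
}

# Simulated callback arguments, standing in for a real db_mon invocation.
my %family = (port1 => undef, port2 => undef);
my ($copy, $added) = snapshot_db_mon_args(\%family, ["port2"]);

$copy->{port3} = 1;                        # modifying the copy is fine
print join(",", sort keys %family), "\n";  # prints "port1,port2"
```

Inside a real callback one would call the helper on C<@_> first and only keep or modify the returned copies.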
		1092
827	=head1 AnyEvent::MP vs. Distributed Erlang	1093	=head1 AnyEvent::MP vs. Distributed Erlang
828		1094
829	AnyEvent::MP got lots of its ideas from distributed Erlang (Erlang node	1095	AnyEvent::MP got lots of its ideas from distributed Erlang (Erlang node
830	== aemp node, Erlang process == aemp port), so many of the documents and	1096	== aemp node, Erlang process == aemp port), so many of the documents and
831	programming techniques employed by Erlang apply to AnyEvent::MP. Here is a	1097	programming techniques employed by Erlang apply to AnyEvent::MP. Here is a
…		…
862	ports being the special case/exception, where transport errors cannot	1128	ports being the special case/exception, where transport errors cannot
863	occur.	1129	occur.
864		1130
865	=item * Erlang uses processes and a mailbox, AEMP does not queue.	1131	=item * Erlang uses processes and a mailbox, AEMP does not queue.
866		1132
867	Erlang uses processes that selectively receive messages, and therefore	1133	Erlang uses processes that selectively receive messages out of order, and
868	needs a queue. AEMP is event based, queuing messages would serve no	1134	therefore needs a queue. AEMP is event based, queuing messages would serve
869	useful purpose. For the same reason the pattern-matching abilities of	1135	no useful purpose. For the same reason the pattern-matching abilities
870	AnyEvent::MP are more limited, as there is little need to be able to	1136	of AnyEvent::MP are more limited, as there is little need to be able to
871	filter messages without dequeuing them.	1137	filter messages without dequeuing them.
872		1138
873	(But see L<Coro::MP> for a more Erlang-like process model on top of AEMP).	1139	This is not a philosophical difference, but simply stems from AnyEvent::MP
		1140	being event-based, while Erlang is process-based.
		1141
		1142	You can have a look at L<Coro::MP> for a more Erlang-like process model on
		1143	top of AEMP and Coro threads.
874		1144
875	=item * Erlang sends are synchronous, AEMP sends are asynchronous.	1145	=item * Erlang sends are synchronous, AEMP sends are asynchronous.
876		1146
877	Sending messages in Erlang is synchronous and blocks the process (and	1147	Sending messages in Erlang is synchronous and blocks the process until
		1148	a connection has been established and the message sent (and so does not
878	so does not need a queue that can overflow). AEMP sends are immediate,	1149	need a queue that can overflow). AEMP sends return immediately, connection
879	connection establishment is handled in the background.	1150	establishment is handled in the background.
880		1151
881	=item * Erlang suffers from silent message loss, AEMP does not.	1152	=item * Erlang suffers from silent message loss, AEMP does not.
882		1153
883	Erlang implements few guarantees on message delivery - messages can get	1154	Erlang implements few guarantees on message delivery - messages can get
884	lost without any of the processes realising it (i.e. you send messages a,	1155	lost without any of the processes realising it (i.e. you send messages a,
885	b, and c, and the other side only receives messages a and c).	1156	b, and c, and the other side only receives messages a and c).
886		1157
887	AEMP guarantees correct ordering, and the guarantee that after one message	1158	AEMP guarantees (modulo hardware errors) correct ordering, and the
888	is lost, all following ones sent to the same port are lost as well, until	1159	guarantee that after one message is lost, all following ones sent to the
889	monitoring raises an error, so there are no silent "holes" in the message	1160	same port are lost as well, until monitoring raises an error, so there are
890	sequence.	1161	no silent "holes" in the message sequence.
		1162
		1163	If you want your software to be very reliable, you have to cope with
		1164	corrupted and even out-of-order messages in both Erlang and AEMP. AEMP
		1165	simply tries to work better in common error cases, such as when a network
		1166	link goes down.
891		1167
892	=item * Erlang can send messages to the wrong port, AEMP does not.	1168	=item * Erlang can send messages to the wrong port, AEMP does not.
893		1169
894	In Erlang it is quite likely that a node that restarts reuses a process ID	1170	In Erlang it is quite likely that a node that restarts reuses an Erlang
895	known to other nodes for a completely different process, causing messages	1171	process ID known to other nodes for a completely different process,
896	destined for that process to end up in an unrelated process.	1172	causing messages destined for that process to end up in an unrelated
		1173	process.
897		1174
898	AEMP never reuses port IDs, so old messages or old port IDs floating	1175	AEMP does not reuse port IDs, so old messages or old port IDs floating
899	around in the network will not be sent to an unrelated port.	1176	around in the network will not be sent to an unrelated port.
900		1177
901	=item * Erlang uses unprotected connections, AEMP uses secure	1178	=item * Erlang uses unprotected connections, AEMP uses secure
902	authentication and can use TLS.	1179	authentication and can use TLS.
903		1180
…		…
906		1183
907	=item * The AEMP protocol is optimised for both text-based and binary	1184	=item * The AEMP protocol is optimised for both text-based and binary
908	communications.	1185	communications.
909		1186
910	The AEMP protocol, unlike the Erlang protocol, supports both programming	1187	The AEMP protocol, unlike the Erlang protocol, supports both programming
911	language independent text-only protocols (good for debugging) and binary,	1188	language independent text-only protocols (good for debugging), and binary,
912	language-specific serialisers (e.g. Storable). By default, unless TLS is	1189	language-specific serialisers (e.g. Storable). By default, unless TLS is
913	used, the protocol is actually completely text-based.	1190	used, the protocol is actually completely text-based.
914		1191
915	It has also been carefully designed to be implementable in other languages	1192	It has also been carefully designed to be implementable in other languages
916	with a minimum of work while gracefully degrading functionality to make the	1193	with a minimum of work while gracefully degrading functionality to make the
917	protocol simple.	1194	protocol simple.
918		1195
919	=item * AEMP has more flexible monitoring options than Erlang.	1196	=item * AEMP has more flexible monitoring options than Erlang.
920		1197
921	In Erlang, you can choose to receive I<all> exit signals as messages	1198	In Erlang, you can choose to receive I<all> exit signals as messages or
922	or I<none>, there is no in-between, so monitoring single processes is	1199	I<none>, there is no in-between, so monitoring single Erlang processes is
923	difficult to implement. Monitoring in AEMP is more flexible than in	1200	difficult to implement.
924	Erlang, as one can choose between automatic kill, exit message or callback	1201
925	on a per-process basis.	1202	Monitoring in AEMP is more flexible than in Erlang, as one can choose
		1203	between automatic kill, exit message or callback on a per-port basis.
926		1204
927	=item * Erlang tries to hide remote/local connections, AEMP does not.	1205	=item * Erlang tries to hide remote/local connections, AEMP does not.
928		1206
929	Monitoring in Erlang is not an indicator of process death/crashes, in the	1207	Monitoring in Erlang is not an indicator of process death/crashes, in the
930	same way as linking is (except linking is unreliable in Erlang).	1208	same way as linking is (except linking is unreliable in Erlang).
…		…
953		1231
954	Strings can easily be printed, easily serialised etc. and need no special	1232	Strings can easily be printed, easily serialised etc. and need no special
955	procedures to be "valid".	1233	procedures to be "valid".
956		1234
957	And as a result, a port with just a default receiver consists of a single	1235	And as a result, a port with just a default receiver consists of a single
958	closure stored in a global hash - it can't become much cheaper.	1236	code reference stored in a global hash - it can't become much cheaper.
959		1237
960	=item Why favour JSON, why not a real serialising format such as Storable?	1238	=item Why favour JSON, why not a real serialising format such as Storable?
961		1239
962	In fact, any AnyEvent::MP node will happily accept Storable as framing	1240	In fact, any AnyEvent::MP node will happily accept Storable as framing
963	format, but currently there is no way to make a node use Storable by	1241	format, but currently there is no way to make a node use Storable by
…		…
973	Keeping your messages simple, concentrating on data structures rather than	1251	Keeping your messages simple, concentrating on data structures rather than
974	objects, will keep your messages clean, tidy and efficient.	1252	objects, will keep your messages clean, tidy and efficient.
975		1253
976	=back	1254	=back
977		1255
		1256	=head1 PORTING FROM AnyEvent::MP VERSION 1.X
		1257
		1258	AEMP version 2 has a few major incompatible changes compared to version 1:
		1259
		1260	=over 4
		1261
		1262	=item AnyEvent::MP::Global no longer has group management functions.
		1263
		1264	At least not officially - the grp_* functions are still exported and might
		1265	work, but they will be removed in some later release.
		1266
		1267	AnyEvent::MP now comes with a distributed database that is more
		1268	powerful. Its database families map closely to port groups, but the API
		1269	has changed (the functions are also now exported by AnyEvent::MP). Here is
		1270	a rough porting guide:
		1271
		1272	grp_reg $group, $port # old
		1273	db_reg $group, $port # new
		1274
		1275	$list = grp_get $group # old
		1276	db_keys $group, sub { my $list = shift } # new
		1277
		1278	grp_mon $group, $cb->(\@ports, $add, $del) # old
		1279	db_mon $group, $cb->(\%ports, $add, $change, $del) # new
		1280
		1281	C<grp_reg> is a no-brainer (just replace by C<db_reg>), but C<grp_get> is
		1282	no longer instant, because the local node might not have a copy of the
		1283	group. You can either modify your code to allow for a callback, or use
		1284	C<db_mon> to keep an updated copy of the group:
		1285
		1286	my $local_group_copy;
		1287	db_mon $group => sub { $local_group_copy = $_[0] };
		1288
		1289	# now "keys %$local_group_copy" always returns the most up-to-date
		1290	# list of ports in the group.
		1291
		1292	C<grp_mon> can be replaced by C<db_mon> with minor changes - C<db_mon>
		1293	passes a hash as first argument, and an extra C<$chg> argument that can be
		1294	ignored:
		1295
		1296	db_mon $group => sub {
		1297	my ($ports, $add, $chg, $del) = @_;
		1298	$ports = [keys %$ports];
		1299
		1300	# now $ports, $add and $del are the same as
		1301	# were originally passed by grp_mon.
		1302	...
		1303	};
		1304
		1305	=item Nodes no longer connect to all other nodes.
		1306
		1307	In AEMP 1.x, every node automatically loads the L<AnyEvent::MP::Global>
		1308	module, which in turn would create connections to all other nodes in the
		1309	network (helped by the seed nodes).
		1310
		1311	In version 2.x, global nodes still connect to all other global nodes, but
		1312	other nodes don't - now every node either is a global node itself, or
		1313	attaches itself to another global node.
		1314
		1315	If a node isn't a global node itself, then it attaches itself to one
		1316	of its seed nodes. If that seed node isn't a global node yet, it will
		1317	automatically be upgraded to a global node.
		1318
		1319	So in many cases, nothing needs to be changed - one just has to make sure
		1320	that all seed nodes are meshed together with the other seed nodes (as with
		1321	AEMP 1.x), and other nodes specify them as seed nodes. This is most easily
		1322	achieved by specifying the same set of seed nodes for all nodes in the
		1323	network.
		1324
		1325	Not opening a connection to every other node is usually an advantage,
		1326	except when you need the lower latency of an already established
		1327	connection. To ensure a node establishes a connection to another node,
		1328	you can monitor the node port (C<mon $node, ...>), which will attempt to
		1329	create the connection (and notify you when the connection fails).
		1330
		1331	=item Listener-less nodes (nodes without binds) are gone.
		1332
		1333	And are not coming back, at least not in their old form. If no C<binds>
		1334	are specified for a node, AnyEvent::MP assumes a default of C<:>.
		1335
		1336	There are vague plans to implement some form of routing domains, which
		1337	might or might not bring back listener-less nodes, but don't count on it.
		1338
		1339	The fact that most connections are now optional somewhat mitigates this,
		1340	as a node can be effectively unreachable from the outside without any
		1341	problems, as long as it isn't a global node and only reaches out to other
		1342	nodes (as opposed to being contacted from other nodes).
		1343
		1344	=item $AnyEvent::MP::Kernel::WARN has gone.
		1345
		1346	AnyEvent has acquired a logging framework (L<AnyEvent::Log>), and AEMP now
		1347	uses this, and so should your programs.
		1348
		1349	Every module now documents what kinds of messages it generates, with
		1350	AnyEvent::MP acting as a catch all.
		1351
		1352	On the positive side, this means that instead of setting
		1353	C<PERL_ANYEVENT_MP_WARNLEVEL>, you can get away with setting C<AE_VERBOSE> -
		1354	much less to type.
		1355
		1356	=back
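
The C<grp_mon> to C<db_mon> conversion described in the porting guide above can also be wrapped once and reused. The following is a minimal sketch in plain Perl: C<grp_mon_compat> is a hypothetical name, not part of AnyEvent::MP. Turning C<undef> or missing key lists into empty arrays and sorting the port list are assumptions made here for convenience, not documented C<grp_mon> behaviour.

```perl
use strict;
use warnings;

# Hypothetical adapter (not part of AnyEvent::MP): wraps an old
# grp_mon-style callback so that it accepts db_mon-style arguments.
sub grp_mon_compat {
    my ($old_cb) = @_;

    return sub {
        my ($ports, $add, $chg, $del) = @_;

        # db_mon passes a hash; grp_mon passed an array of port IDs.
        # The $chg list has no grp_mon equivalent and is dropped.
        $old_cb->([ sort keys %$ports ], $add || [], $del || []);
    };
}

# Simulated db_mon invocation with made-up port IDs.
my $cb = grp_mon_compat(sub {
    my ($ports, $add, $del) = @_;
    print "ports: @$ports; added: @$add; deleted: @$del\n";
});

$cb->({ "node1#a" => undef, "node1#b" => undef }, ["node1#b"], undef, []);
```

With AnyEvent::MP loaded, the wrapped callback would be passed along the lines of C<< db_mon $group => grp_mon_compat (\&old_callback) >>.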
		1357
		1358	=head1 LOGGING
		1359
		1360	AnyEvent::MP does not normally log anything by itself, but since it is the
		1361	root of the context hierarchy for AnyEvent::MP modules, it will receive
		1362	all log messages from its submodules.
		1363
978	=head1 SEE ALSO	1364	=head1 SEE ALSO
979		1365
980	L<AnyEvent::MP::Intro> - a gentle introduction.	1366	L<AnyEvent::MP::Intro> - a gentle introduction.
981		1367
982	L<AnyEvent::MP::Kernel> - more, lower-level, stuff.	1368	L<AnyEvent::MP::Kernel> - more, lower-level, stuff.
