
That's all for now - you will find some more advanced fiddling with the
C<aemp> utility later.

=head1 PART 1: Passing Messages Between Processes

=head2 The Receiver

Let's split the previous example up into two programs: one that contains
the sender and one for the receiver. First the receiver application, in
…
Or to put it differently: the arguments passed to configure are usually
provided not by the programmer, but by whoever is deploying the program.

To make this easy, AnyEvent::MP supports a simple configuration database,
using profiles, which can be managed using the F<aemp> command-line
utility (yes, this section is about the advanced tinkering we mentioned
before).

When you change both programs above to simply call

   configure;

…

   aemp profile seed binds "*:4040"

And we configure all nodes to use this as seed node (this only works when
running on the same host; for multiple machines you would provide the IP
address or hostname of the node running the seed), and use a random name
(because we want to start multiple nodes on the same host):

   aemp seeds "*:4040" nodeid anon/

Then we run the seed node:

   aemp run profile seed

…
use our generic seed node to discover each other.

In fact, starting many receivers nicely illustrates that the time sender
can have multiple receivers.

That's all for now - next we will teach you about monitoring by writing a
simple chat client and server :)
|
|

=head1 PART 2: Monitoring, Supervising, Exception Handling and Recovery

That's a mouthful, so what does it mean? Our previous example is what one
could call "very loosely coupled" - the sender doesn't care about whether
there are any receivers, and the receivers do not care if there is any
sender.

This can work fine for simple services, but most real-world applications
want to ensure that the side they are expecting to be there is actually
there. Going one step further: most bigger real-world applications even
want to ensure that if some component is missing, or has crashed, it will
be made available again, by recovering and restarting the service.

AnyEvent::MP supports this by catching exceptions and network problems,
and notifying interested parties of this.

|
|
=head2 Exceptions, Network Errors and Monitors

=head3 Exceptions

Exceptions are handled on a per-port basis: receive callbacks are executed
in a special context, the port-context, and code that throws an uncaught
exception will cause the port to be C<kil>led. Killed ports are destroyed
automatically (killing ports is the only way to free ports, incidentally).

|
|
Ports can be monitored, even from a different host, and when a port is
killed any entity monitoring it will be notified.

Here is a simple example:

   use AnyEvent::MP;

   # create a port, it always dies
   my $port = port { die "oops" };

   # monitor it
   mon $port, sub {
      warn "$port was killed (with reason @_)";
   };

   # now send it some message, causing it to die:
   snd $port;

|
|
It first creates a port whose only action is to throw an exception,
and then monitors it with the C<mon> function. Afterwards it sends it a
message, causing it to die and call the monitoring callback:

   anon/6WmIpj.a was killed (with reason die oops at xxx line 5.) at xxx line 9.

The callback was actually passed two arguments: C<die> (to indicate it did
throw an exception as opposed to, say, a network error) and the exception
message itself.

|
|
What happens when a port is killed before we have a chance to monitor
it? Granted, this is highly unlikely in our example, but when you program
in a network this can easily happen due to races between nodes.

   use AnyEvent::MP;

   my $port = port { die "oops" };

   snd $port;

   mon $port, sub {
      warn "$port was killed (with reason @_)";
   };

|
|
This time we will get something like:

   anon/zpX.a was killed (with reason no_such_port cannot monitor nonexistent port)

Since the port was already gone, the kill reason is now C<no_such_port>
with some descriptive (we hope) error message.

In fact, the kill reason is usually some identifier as first argument
and a human-readable error message as second argument, but can be about
anything (it's a list) or even nothing - which is called a "normal" kill.
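
As an illustration (plain Perl, no AnyEvent::MP required), a monitoring
callback can dispatch on that reason list; the C<die> and C<no_such_port>
identifiers are the ones shown above, everything else here is made up:

```perl
# Sketch of a mon-style callback dispatching on the kill reason.
# An empty list is a "normal" kill; otherwise the first element is
# an identifier and the second a human-readable message.
sub describe_kill {
   my (@reason) = @_;

   return "normal kill" unless @reason;

   my ($id, $msg) = @reason;
   return "exception: $msg"  if $id eq "die";
   return "vanished: $msg"   if $id eq "no_such_port";
   "error: @reason"
}

print describe_kill (), "\n";
print describe_kill (die => "oops at x line 5."), "\n";
```

Passing such a sub to C<mon> would give readable diagnostics for the
cases shown in this section.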
|
|

You can kill ports manually using the C<kil> function, which will be
treated like an error when any reason is specified:

   kil $port, custom_error => "don't like your steenking face";

And a clean kill without any reason arguments:

   kil $port;

|
|
By now you probably wonder what this "normal" kill business is: a common
idiom is to not specify a callback to C<mon>, but another port, such as
C<$SELF>:

   mon $port, $SELF;

This basically means "monitor $port and kill me when it crashes". And a
"normal" kill does not count as a crash. This way you can easily link
ports together and make them crash together on errors (while still
allowing you to remove a port silently).
|
|

=head3 Network Errors and the AEMP Guarantee

I mentioned another important source of monitoring failures: network
problems. When a node loses connection to another node, it will invoke all
monitoring actions as if the port was killed, even if it is possible that
the port still lives happily on another node (not being able to talk to a
node means we have no clue what's going on with it: it could have crashed,
but it could also still be running without knowing we lost the connection).
|
|

So another way to view monitors is "notify me when some of my messages
couldn't be delivered". AEMP has a guarantee about message delivery to a
port: after starting a monitor, any message sent to a port will either
be delivered, or, when it is lost, any further messages will also be lost
until the monitoring action is invoked. After that, further messages
I<might> get delivered again.

This doesn't sound like a very big guarantee, but it is kind of the best
you can get while staying sane: specifically, it means that there will
be no "holes" in the message sequence: all messages sent are delivered
in order, without any missing in between, and when some were lost, you
I<will> be notified of that, so you can take recovery action.
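
To make the guarantee concrete, here is a toy model in plain Perl (purely
illustrative, not AEMP code): messages are delivered in order until the
link breaks, after which everything is dropped until the monitoring action
has fired - so the delivered sequence never has a hole in the middle:

```perl
# Toy model of the AEMP delivery guarantee.
my @delivered;       # what the receiver saw, in order
my $link_up = 1;     # becomes false on a (simulated) network failure

sub send_msg {
   my ($msg) = @_;
   push @delivered, $msg if $link_up;
   # while the link is down, messages are silently lost until the
   # monitoring action fires - exactly what the guarantee allows
}

send_msg $_ for 1 .. 3;
$link_up = 0;                  # connection lost
send_msg $_ for 4 .. 6;        # lost, but no hole within @delivered
print "@delivered\n";          # prints "1 2 3"
```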
|
|

=head3 Supervising

Ok, so how is this crashing-everything stuff going to make applications
I<more> stable? Well, in fact, the goal is not really to make them more
stable, but to make them more resilient against actual errors and
crashes. And this is not done by crashing I<everything>, but by crashing
everything except a supervisor.

A supervisor is simply some code that ensures that an application (or a
part of it) is running, and if it crashes, is restarted properly.

To show how to do all this we will create a simple chat server that can
handle many chat clients. Both server and clients can be killed and
restarted, and even crash, to some extent.
|
|

=head2 Chatting, the Resilient Way

Without further ado, here is the chat server (to run it, we assume the
set-up explained earlier, with a separate F<aemp run> seed node):

   use common::sense;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   configure;

   my %clients;

   sub msg {
      print "relaying: $_[0]\n";
      snd $_, $_[0]
         for values %clients;
   }

   our $server = port;

   rcv $server, join => sub {
      my ($client, $nick) = @_;

      $clients{$client} = $client;

      mon $client, sub {
         delete $clients{$client};
         msg "$nick (quits, @_)";
      };
      msg "$nick (joins)";
   };

   rcv $server, privmsg => sub {
      my ($nick, $msg) = @_;
      msg "$nick: $msg";
   };

   AnyEvent::MP::Global::register $server, "eg_chat_server";

   warn "server ready.\n";

   AnyEvent->condvar->recv;

|
|
Looks like a lot, but it is actually quite simple: after your usual
preamble (this time we use common sense), we define a helper function that
sends some message to every registered chat client:

   sub msg {
      print "relaying: $_[0]\n";
      snd $_, $_[0]
         for values %clients;
   }

The clients are stored in the hash C<%clients>. Then we define a server
port and install two receivers on it: C<join>, which is sent by clients
to join the chat, and C<privmsg>, which clients use to send actual chat
messages.

|
|
C<join> is the most complicated. It expects the client port and the
nickname to be passed in the message, and registers the client in
C<%clients>.

   rcv $server, join => sub {
      my ($client, $nick) = @_;

      $clients{$client} = $client;

The next step is to monitor the client. The monitoring action removes the
client and sends a quit message with the error to all remaining clients.

      mon $client, sub {
         delete $clients{$client};
         msg "$nick (quits, @_)";
      };

And finally, it creates a join message and sends it to all clients.

      msg "$nick (joins)";
   };
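
The per-client cleanup can be exercised in isolation - here is a
plain-Perl toy (no AnyEvent::MP) showing that the monitoring action is a
closure over C<$client> and C<$nick>, so every client carries its own
cleanup; the C<add_client> helper is ours, not from the server above:

```perl
my (%clients, @log);

sub msg { push @log, $_[0] }

# register a client and return the cleanup closure a monitor would run
sub add_client {
   my ($client, $nick) = @_;
   $clients{$client} = $client;

   my $on_death = sub {
      delete $clients{$client};
      msg "$nick (quits, @_)";
   };

   msg "$nick (joins)";
   $on_death
}

my $kill_a = add_client "port_a", "nick1";
my $kill_b = add_client "port_b", "nick2";

$kill_a->("custom_error");          # simulate port_a dying
print scalar keys %clients, "\n";   # prints "1"
print $log[-1], "\n";               # prints "nick1 (quits, custom_error)"
```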
|
|

The C<privmsg> callback simply broadcasts the message to all clients:

   rcv $server, privmsg => sub {
      my ($nick, $msg) = @_;
      msg "$nick: $msg";
   };

And finally, the server registers itself in the server group, so that
clients can find it:

   AnyEvent::MP::Global::register $server, "eg_chat_server";

Well, well... and where is this supervisor stuff? Well... we cheated,
it's not there. To not overcomplicate the example, we only put it into
the..... CLIENT!
|
|

=head3 The Client, and a Supervisor!

Again, here is the client, including supervisor, which makes it a bit
longer:

   use common::sense;
   use AnyEvent::MP;
   use AnyEvent::MP::Global;

   my $nick = shift;

   configure;

   my ($client, $server);

   sub server_connect {
      my $servernodes = AnyEvent::MP::Global::find "eg_chat_server"
         or return after 1, \&server_connect;

      print "\rconnecting...\n";

      $client = port { print "\r \r@_\n> " };
      mon $client, sub {
         print "\rdisconnected @_\n";
         &server_connect;
      };

      $server = $servernodes->[0];
      snd $server, join => $client, $nick;
      mon $server, $client;
   }

   server_connect;

   my $w = AnyEvent->io (fh => *STDIN, poll => 'r', cb => sub {
      chomp (my $line = <STDIN>);
      print "> ";
      snd $server, privmsg => $nick, $line
         if $server;
   });

   $| = 1;
   print "> ";
   AnyEvent->condvar->recv;

The first thing the client does is to store the nickname (which is
expected as the only command line argument) in C<$nick>, for later
use.
|
|

The next relevant thing is... finally... the supervisor:

   sub server_connect {
      my $servernodes = AnyEvent::MP::Global::find "eg_chat_server"
         or return after 1, \&server_connect;

This looks up the server in the C<eg_chat_server> global group. If it
cannot find it (which is likely when the node is just starting up),
it will wait a second and then retry. This "wait a bit and retry"
is an important pattern, as distributed programming means lots of
things are going on asynchronously. In practice, one should use a more
intelligent algorithm, which could for example warn after an excessive
number of retries. Hopefully future versions of AnyEvent::MP will offer
some predefined supervisors; for now you will have to code it on your own.
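
For instance, a slightly more intelligent policy could grow the delay and
warn after too many attempts - a plain-Perl sketch (the C<next_delay>
name is made up here, not AnyEvent::MP API):

```perl
my $retries = 0;

# return the next retry delay in seconds: 1, 2, 4, ... capped at 30,
# warning once the number of attempts gets excessive
sub next_delay {
   ++$retries;

   warn "still not connected after $retries attempts\n"
      if $retries == 10;

   my $delay = 2 ** ($retries - 1);
   $delay < 30 ? $delay : 30
}

# in server_connect one would then write something like:
#    or return after next_delay, \&server_connect;
print next_delay, "\n" for 1 .. 3;   # prints 1, 2, 4
```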
|
|

Next it creates a local port for the server to send messages to, and
monitors it. When the port is killed, it will print "disconnected" and
tell the supervisor function to retry.

   $client = port { print "\r \r@_\n> " };
   mon $client, sub {
      print "\rdisconnected @_\n";
      &server_connect;
   };

Then everything is ready: the client will send a C<join> message with its
local port to the server, and start monitoring it:
|
|

   $server = $servernodes->[0];
   snd $server, join => $client, $nick;
   mon $server, $client;

The monitor will ensure that if the server crashes or goes away, the
client will be killed as well. This tells the user that the client was
disconnected, and it will then try to connect to the server again.

The rest of the program deals with the boring details of actually invoking
the supervisor function to start the whole client process and of handling
the actual terminal input, sending it to the server.
|
|

You should now try to start the server and one or more clients in different
terminal windows (and the seed node):

   perl eg/chat_client nick1
   perl eg/chat_client nick2
   perl eg/chat_server
   aemp run profile seed

And then you can experiment with chatting, killing one or more clients, or
stopping and restarting the server, to see the monitoring in action.

There is ample room for improvement: the server should probably remember
the nickname in the C<join> handler instead of expecting it in every chat
message, it should probably monitor itself, and the client should not try
to send any messages unless a server is actually connected.
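
As a sketch of the first of these improvements (our illustration, not
code from the tutorial): C<%clients> could map the client port to its
nickname, recorded once at C<join> time, so chat messages no longer need
to carry it. The data-structure side, in plain Perl with hypothetical
helper names:

```perl
my %clients;   # client port => nickname, recorded at join time

sub register_client {
   my ($client, $nick) = @_;
   $clients{$client} = $nick;
}

# a privmsg now only needs the sending port and the text
sub format_privmsg {
   my ($client, $msg) = @_;
   my $nick = $clients{$client} // "(unknown)";
   "$nick: $msg"
}

register_client "anon/abc.a", "nick1";
print format_privmsg ("anon/abc.a", "hello"), "\n";   # prints "nick1: hello"
```

The server-side C<rcv $server, privmsg => ...> handler would then look up
the nickname itself instead of trusting the client to send it.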
|
|

=head1 PART 3: TIMTOWTDI: Virtual Connections

#TODO

=head1 SEE ALSO

L<AnyEvent>
