NAME
    AnyEvent::Fork::Pool - simple process pool manager on top of
    AnyEvent::Fork

    THE API IS NOT FINISHED, CONSIDER THIS AN ALPHA RELEASE

SYNOPSIS
       use AnyEvent;
       use AnyEvent::Fork::Pool;
       # use AnyEvent::Fork is not needed

       # all possible parameters shown, with default values
       my $pool = AnyEvent::Fork
          ->new
          ->require ("MyWorker")
          ->AnyEvent::Fork::Pool::run (
               "MyWorker::run", # the worker function

               # pool management
               max        => 4,   # absolute maximum # of processes
               idle       => 0,   # minimum # of idle processes
               load       => 2,   # queue at most this number of jobs per process
               start      => 0.1, # wait this many seconds before starting a new process
               stop       => 10,  # wait this many seconds before stopping an idle process
               on_destroy => (my $finish = AE::cv), # called when object is destroyed

               # parameters passed to AnyEvent::Fork::RPC
               async      => 0,
               on_error   => sub { die "FATAL: $_[0]\n" },
               on_event   => sub { my @ev = @_ },
               init       => "MyWorker::init",
               serialiser => $AnyEvent::Fork::RPC::STRING_SERIALISER,
            );

       for (1..10) {
          $pool->(doit => $_, sub {
             print "MyWorker::run returned @_\n";
          });
       }

       undef $pool;

       $finish->recv;

DESCRIPTION
    This module uses processes created via AnyEvent::Fork and the RPC
    protocol implemented in AnyEvent::Fork::RPC to create a load-balanced
    pool of processes that handles jobs.

    Understanding of AnyEvent::Fork is helpful but not critical to be able
    to use this module, but a thorough understanding of AnyEvent::Fork::RPC
    is, as it defines the actual API that needs to be implemented in the
    worker processes.

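    As an illustration only, here is a minimal sketch of what the worker
    module referenced in the SYNOPSIS might look like, assuming the default
    synchronous backend, where the return value(s) of the worker function
    are passed to the caller's callback. The names and the trivial job
    logic are made up for this example.

       # MyWorker.pm - hypothetical worker module used in the SYNOPSIS
       package MyWorker;

       use strict;
       use warnings;

       # called once per worker process, before any requests are handled
       sub init {
          # open log files, connect to databases and so on
       }

       # called once per job; the arguments are those passed to $pool->()
       sub run {
          my ($cmd, $arg) = @_; # e.g. ("doit", 1)

          # whatever is returned here is passed to the result callback
          "handled '$cmd' with argument $arg"
       }

       1;
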
EXAMPLES
PARENT USAGE
    To create a pool, you first have to create an AnyEvent::Fork object -
    this object becomes your template process. Whenever a new worker
    process is needed, it is forked from this template process. Then you
    need to "hand off" this template process to the "AnyEvent::Fork::Pool"
    module by calling its run method on it:

       my $template = AnyEvent::Fork
          ->new
          ->require ("SomeModule", "MyWorkerModule");

       my $pool = $template->AnyEvent::Fork::Pool::run ("MyWorkerModule::myfunction");

    The pool "object" is not a regular Perl object, but a code reference
    that you can call and that works roughly like calling the worker
    function directly, except that it returns nothing - instead, you need
    to specify a callback to be invoked once results are in:

       $pool->(1, 2, 3, sub { warn "myfunction(1,2,3) returned @_" });

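    Since each call only delivers its result to a callback, one possible
    way to wait for a whole batch of jobs is an AnyEvent condition variable
    with "begin"/"end" counting - this is plain AnyEvent usage, not part of
    this module's API, and the job arguments below are made up:

       my $done = AE::cv;

       for my $n (1 .. 10) {
          $done->begin;
          $pool->($n, sub {
             print "job $n returned @_\n";
             $done->end;
          });
       }

       $done->recv; # returns after all ten callbacks have run
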
    my $pool = AnyEvent::Fork::Pool::run $fork, $function, [key => value...]
        The traditional way to call the pool creation function. But it is
        way cooler to call it in the following way:

    my $pool = $fork->AnyEvent::Fork::Pool::run ($function, [key => value...])
        Creates a new pool object with the specified $function as function
        (name) to call for each request. The pool uses the $fork object as
        the template when creating worker processes.

        You can supply your own template process, or tell
        "AnyEvent::Fork::Pool" to create one.

        A relatively large number of key/value pairs can be specified to
        influence the behaviour. They are grouped into the categories "pool
        management", "template process" and "rpc parameters".

        Pool Management
            The pool consists of a certain number of worker processes.
            These options decide how many of these processes exist and when
            they are started and stopped.

            The worker pool is dynamically resized, according to (perceived
            :) load. The minimum size is given by the "idle" parameter and
            the maximum size is given by the "max" parameter. A new worker
            is started every "start" seconds at most, and an idle worker is
            stopped at most every "stop" seconds.

            You can specify the maximum number of jobs sent to a worker
            concurrently using the "load" parameter.

            idle => $count (default: 0)
                The minimum number of idle processes in the pool - when
                there are fewer than this many idle workers,
                "AnyEvent::Fork::Pool" will try to start new ones, subject
                to the limits set by "max" and "start".

                This is also the initial number of workers in the pool. The
                default of zero means that the pool starts empty and can
                shrink back to zero workers over time.

            max => $count (default: 4)
                The maximum number of processes in the pool, in addition to
                the template process. "AnyEvent::Fork::Pool" will never
                have more than this number of worker processes, although
                there can be more temporarily when a worker is shut down
                and hasn't exited yet.

            load => $count (default: 2)
                The maximum number of concurrent jobs sent to a single
                worker process.

                Jobs that cannot be sent to a worker immediately (because
                all workers are busy) will be queued until a worker is
                available.

                Setting this low improves latency. For example, at 1, every
                job that is sent to a worker is sent to a completely idle
                worker that doesn't run any other jobs. The downside is
                that throughput is reduced - a worker that finishes a job
                needs to wait for a new job from the parent.

                The default of 2 is usually a good compromise.

            start => $seconds (default: 0.1)
                When there are fewer than "idle" workers (or all workers
                are completely busy), then a timer is started. If the timer
                elapses and there are still jobs that cannot be queued to a
                worker, a new worker is started.

                This sets the minimum time that all workers must be busy
                before a new worker is started. Or, put differently, the
                minimum delay between starting new workers.

                The delay is small by default, which means new workers will
                be started relatively quickly. A delay of 0 is possible,
                and ensures that the pool will grow as quickly as possible
                under load.

                Non-zero values are useful to avoid "exploding" a pool
                because a lot of jobs are queued in an instant.

                Higher values are often useful to improve efficiency at the
                cost of latency - when fewer processes can do the job over
                time, starting more and more is not necessarily going to
                help.

            stop => $seconds (default: 10)
                When a worker has no jobs to execute it becomes idle. An
                idle worker that hasn't executed a job within this amount
                of time will be stopped, unless the other parameters say
                otherwise.

                Setting this to a very high value means that workers stay
                around longer, even when they have nothing to do, which can
                be good as they don't have to be started on the next load
                spike again.

                Setting this to a lower value can be useful to avoid memory
                or simply process table wastage.

                Usually, setting this to a time longer than the time
                between load spikes is best - if you expect a lot of
                requests every minute and little work in between, setting
                this to longer than a minute avoids having to stop and
                start workers. On the other hand, you have to ask yourself
                whether letting workers run idle is a good use of your
                resources. Try to find a good balance between the resource
                usage of your workers and the time needed to start new
                workers - AnyEvent::Fork itself is fast at creating workers
                and doesn't use much memory for them, so most of the
                overhead is likely from your own code.

            on_destroy => $callback->() (default: none)
                When a pool object goes out of scope, the outstanding
                requests are still handled till completion. Only after
                handling all jobs will the workers be destroyed (and also
                the template process if it isn't referenced otherwise).

                To find out when a pool *really* has finished its work, you
                can set this callback, which will be called when the pool
                has been destroyed.

        AnyEvent::Fork::RPC Parameters
            These parameters are all passed more or less directly to
            AnyEvent::Fork::RPC. They are only briefly mentioned here; for
            their full documentation please refer to the
            AnyEvent::Fork::RPC documentation. Also, the default values
            mentioned here are only documented as a best effort - the
            AnyEvent::Fork::RPC documentation is binding.

            async => $boolean (default: 0)
                Whether to use the synchronous or asynchronous RPC backend.

            on_error => $callback->($message) (default: die with message)
                The callback to call on any (fatal) errors.

            on_event => $callback->(...) (default: "sub { }", unlike
            AnyEvent::Fork::RPC)
                The callback to invoke on events.

            init => $initfunction (default: none)
                The function to call in the child, once before handling
                requests.

            serialiser => $serialiser (default:
            $AnyEvent::Fork::RPC::STRING_SERIALISER)
                The serialiser to use.

    $pool->(..., $cb->(...))
        Call the RPC function of a worker with the given arguments, and
        when the worker is done, call the $cb with the results, just like
        calling the RPC object directly - see the AnyEvent::Fork::RPC
        documentation for details on the RPC API.

        If there is no free worker, the call will be queued until a worker
        becomes available.

        Note that there can be considerable time between calling this
        method and the call actually being executed. During this time, the
        parameters passed to this function are effectively read-only -
        modifying them after the call and before the callback is invoked
        causes undefined behaviour.

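    For example, if you intend to keep modifying a variable while a job
    referring to it may still be queued, one way to stay on the safe side
    is to hand the pool its own copy - the variable names and job arguments
    here are made up for illustration:

       my $buffer = "some text to process";

       # pass a copy of the current value, so later changes to $buffer
       # cannot affect the queued (not yet transmitted) job
       $pool->("$buffer", sub {
          print "processed: @_\n";
       });

       $buffer .= " more text"; # safe - the job got its own copy
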
CHILD USAGE
    In addition to the AnyEvent::Fork::RPC API, this module implements one
    more child-side function:

    AnyEvent::Fork::Pool::retire ()
        This function sends an event to the parent process to request
        retirement: the worker is removed from the pool and no new jobs
        will be sent to it, but it has to handle the jobs that are already
        queued.

        The parentheses are part of the syntax: the function usually isn't
        defined when you compile your code (because that happens *before*
        handing the template process over to "AnyEvent::Fork::Pool::run"),
        so you need the empty parentheses to tell Perl that the function is
        indeed a function.

        Retiring a worker can be useful to gracefully shut it down when the
        worker deems this useful. For example, after executing a job, one
        could check the process size or the number of jobs handled so far,
        and if either is too high, the worker could ask to get retired, to
        keep memory leaks from accumulating - a sketch of this pattern
        follows below.

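    The following is only an illustrative sketch of that retirement
    pattern, not code from this distribution - the module name, the job
    counter, the limit of 1000 jobs and the job logic are all made up:

       # MyWorker.pm - hypothetical worker that retires after 1000 jobs
       package MyWorker;

       use strict;
       use warnings;

       my $jobs_handled = 0;

       sub run {
          my (@args) = @_;

          my $result = do_something_expensive (@args); # made-up job logic

          # ask the parent to stop sending new jobs once we handled enough,
          # e.g. to keep slow memory leaks from accumulating forever
          AnyEvent::Fork::Pool::retire ()
             if ++$jobs_handled >= 1000;

          $result
       }

       1;
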
POOL PARAMETERS RECIPES
    This section describes some recipes for pool parameters. These are
    mostly meant for the synchronous RPC backend, as the asynchronous RPC
    backend changes the rules considerably, making workers themselves
    responsible for their scheduling.

    low latency - set load = 1
        If you need a deterministic low latency, you should set the "load"
        parameter to 1. This ensures that never more than one job is sent
        to each worker. This avoids having to wait for a previous job to
        finish.

        This makes most sense with the synchronous (default) backend, as
        the asynchronous backend can handle multiple requests concurrently.

    lowest latency - set load = 1 and idle = max
        To achieve the lowest latency, you additionally should disable any
        dynamic resizing of the pool by setting "idle" to the same value as
        "max".

    high throughput, cpu bound jobs - set load >= 2, max = #cpus
        To get high throughput with cpu-bound jobs, you should set the
        maximum pool size to the number of cpus in your system, and "load"
        to at least 2, to make sure there can be another job waiting for
        the worker when it has finished one (see the sketch at the end of
        this section).

        The value of 2 for "load" is the minimum value that *can* achieve
        100% throughput, but if your parent process itself is sometimes
        busy, you might need higher values. Also there is a limit on the
        amount of data that can be "in flight" to the worker, so if you
        send big blobs of data to your worker, "load" might have much less
        of an effect.

    high throughput, I/O bound jobs - set load >= 2, max = 1, or very high
        When your jobs are I/O bound, using more workers usually means
        higher throughput, depending very much on your actual workload -
        sometimes having only one worker is best, for example, when you
        read or write big files at maximum speed, as a second worker will
        increase seek times.

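    As a concrete illustration of the cpu-bound recipe above, a pool for
    such jobs might be created roughly like this - the worker module name
    is made up, and the cpu count is simply hardcoded here (detect it in
    whatever way suits your system):

       my $ncpu = 4; # number of cpus/cores available for the workers

       my $pool = AnyEvent::Fork
          ->new
          ->require ("MyCruncher")
          ->AnyEvent::Fork::Pool::run (
               "MyCruncher::crunch",
               max  => $ncpu, # one worker per cpu
               idle => $ncpu, # optional: keep the workers around between jobs
               load => 2,     # always have one job queued per worker
            );
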
EXCEPTIONS
    The same "policy" as with AnyEvent::Fork::RPC applies - exceptions will
    not be caught, and exceptions in both the worker and in callbacks cause
    undesirable or undefined behaviour.

SEE ALSO
    AnyEvent::Fork, to create the processes in the first place.

    AnyEvent::Fork::RPC, which implements the RPC protocol and API.

AUTHOR AND CONTACT INFORMATION
    Marc Lehmann <schmorp@schmorp.de>
    http://software.schmorp.de/pkg/AnyEvent-Fork-Pool