[ViewVC] Diff of: cvs/AnyEvent-Fork-Pool/Pool.pm

Comparing AnyEvent-Fork-Pool/Pool.pm (file contents):
Revision 1.2 by root, Thu Apr 18 21:37:27 2013 UTC vs.
Revision 1.11 by root, Sun Apr 28 14:19:22 2013 UTC

…		…
1	=head1 NAME	1	=head1 NAME
2		2
3	AnyEvent::Fork::Pool - simple process pool manager on top of AnyEvent::Fork	3	AnyEvent::Fork::Pool - simple process pool manager on top of AnyEvent::Fork
		4
		5	THE API IS NOT FINISHED, CONSIDER THIS AN ALPHA RELEASE
4		6
5	=head1 SYNOPSIS	7	=head1 SYNOPSIS
6		8
7	use AnyEvent;	9	use AnyEvent;
8	use AnyEvent::Fork::Pool;	10	use AnyEvent::Fork::Pool;
9	# use AnyEvent::Fork is not needed	11	# use AnyEvent::Fork is not needed
10		12
11	# all parameters with default values	13	# all possible parameters shown, with default values
12	my $pool = new AnyEvent::Fork::Pool	14	my $pool = AnyEvent::Fork
13	"MyWorker::run",	15	->new
		16	->require ("MyWorker")
		17	->AnyEvent::Fork::Pool::run (
		18	"MyWorker::run", # the worker function
14		19
15	# pool management	20	# pool management
16	min => 0, # minimum # of processes
17	max => 4, # maximum # of processes	21	max => 4, # absolute maximum # of processes
18	max_queue => 2, # queue at most this number of jobs per process	22	idle => 0, # minimum # of idle processes
		23	load => 2, # queue at most this number of jobs per process
19	min_delay => 0, # wait this many seconds before starting a new process	24	start => 0.1, # wait this many seconds before starting a new process
20	min_idle => 0, # try to have at least this amount of idle processes
21	max_idle => 1, # at most this many idle processes
22	idle_time => 1, # wait this many seconds before killing an idle process	25	stop => 10, # wait this many seconds before stopping an idle process
23	on_destroy => (my $finish = AE::cv),	26	on_destroy => (my $finish = AE::cv), # called when object is destroyed
24		27
25	# template process
26	template => AnyEvent::Fork->new, # the template process to use
27	require => [MyWorker::], # module(s) to load
28	eval => "# perl code to execute in template",
29
30	# parameters passed to AnyEvent::Fork::RPC	28	# parameters passed to AnyEvent::Fork::RPC
31	async => 0,	29	async => 0,
32	on_error => sub { die "FATAL: $_[0]\n" },	30	on_error => sub { die "FATAL: $_[0]\n" },
33	on_event => sub { my @ev = @_ },	31	on_event => sub { my @ev = @_ },
34	init => "MyWorker::init",	32	init => "MyWorker::init",
35	serialiser => $AnyEvent::Fork::RPC::STRING_SERIALISER,	33	serialiser => $AnyEvent::Fork::RPC::STRING_SERIALISER,
36	;	34	);
37		35
38	for (1..10) {	36	for (1..10) {
39	$pool->call (doit => $_, sub {	37	$pool->(doit => $_, sub {
40	print "MyWorker::run returned @_\n";	38	print "MyWorker::run returned @_\n";
41	});	39	});
42	}	40	}
43		41
44	undef $pool;	42	undef $pool;
52	pool of processes that handles jobs.	50	pool of processes that handles jobs.
53		51
54	Understanding of L<AnyEvent::Fork> is helpful but not critical to be able	52	Understanding of L<AnyEvent::Fork> is helpful but not critical to be able
55	to use this module, but a thorough understanding of L<AnyEvent::Fork::RPC>	53	to use this module, but a thorough understanding of L<AnyEvent::Fork::RPC>
56	is, as it defines the actual API that needs to be implemented in the	54	is, as it defines the actual API that needs to be implemented in the
57	children.	55	worker processes.
58		56
59	=head1 EXAMPLES	57	=head1 EXAMPLES
60		58
61	=head1 PARENT USAGE	59	=head1 PARENT USAGE
62		60
		61	To create a pool, you first have to create a L<AnyEvent::Fork> object -
		62	this object becomes your template process. Whenever a new worker process
		63	is needed, it is forked from this template process. Then you need to
		64	"hand off" this template process to the C<AnyEvent::Fork::Pool> module by
		65	calling its run method on it:
		66
		67	my $template = AnyEvent::Fork
		68	->new
		69	->require ("SomeModule", "MyWorkerModule");
		70
		71	my $pool = $template->AnyEvent::Fork::Pool::run ("MyWorkerModule::myfunction");
		72
		73	The pool "object" is not a regular Perl object, but a code reference that
		74	you can call and that works roughly like calling the worker function
		75	directly, except that it returns nothing but instead you need to specify a
		76	callback to be invoked once results are in:
		77
		78	$pool->(1, 2, 3, sub { warn "myfunction(1,2,3) returned @_" });
		79
63	=over 4	80	=over 4
64		81
65	=cut	82	=cut
66		83
67	package AnyEvent::Fork::Pool;	84	package AnyEvent::Fork::Pool;
68		85
69	use common::sense;	86	use common::sense;
70		87
71	use Scalar::Util ();	88	use Scalar::Util ();
72		89
		90	use Guard ();
73	use Array::Heap ();	91	use Array::Heap ();
74		92
75	use AnyEvent;	93	use AnyEvent;
		94	# explicit version on next line, as some cpan-testers test with the 0.1 version,
		95	# ignoring dependencies, and this line will at least give a clear indication of that.
76	use AnyEvent::Fork; # we don't actually depend on it, this is for convenience	96	use AnyEvent::Fork 0.6; # we don't actually depend on it, this is for convenience
77	use AnyEvent::Fork::RPC;	97	use AnyEvent::Fork::RPC;
78		98
		99	# these are used for the first and last argument of events
		100	# in the hope of not colliding. yes, I don't like it either,
		101	# but didn't come up with an obviously better alternative.
79	my $magic0 = ':t6Z@HK1N%Dx@_7?=~-7NQgWDdAs6a,jFN=wLO0jD1%P';	102	my $magic0 = ':t6Z@HK1N%Dx@_7?=~-7NQgWDdAs6a,jFN=wLO0jD1%P';
80	my $magic2 = '<~53rexz.U`!]X[A235^"fyEoiTF\T~oH1l/N6+Djep9b~bI9`\1x%B~vWO1q*';	103	my $magic1 = '<~53rexz.U`!]X[A235^"fyEoiTF\T~oH1l/N6+Djep9b~bI9`\1x%B~vWO1q*';
81		104
82	our $VERSION = 0.1;	105	our $VERSION = 1.1;
83		106
84	=item my $rpc = new AnyEvent::Fork::Pool $function, [key => value...]	107	=item my $pool = AnyEvent::Fork::Pool::run $fork, $function, [key => value...]
		108
		109	The traditional way to call the pool creation function. But it is way
		110	cooler to call it in the following way:
		111
		112	=item my $pool = $fork->AnyEvent::Fork::Pool::run ($function, [key => value...])
85		113
86	Creates a new pool object with the specified C<$function> as function	114	Creates a new pool object with the specified C<$function> as function
87	(name) to call for each request.	115	(name) to call for each request. The pool uses the C<$fork> object as the
88		116	template when creating worker processes.
89	A pool consists of a template process that contains the code and data that
90	the worker processes need. And a number of worker processes that have been
91	forked off of that template process.
92		117
93	You can supply your own template process, or tell C<AnyEvent::Fork::Pool>	118	You can supply your own template process, or tell C<AnyEvent::Fork::Pool>
94	to create one.	119	to create one.
95		120
96	A relatively large number of key/value pairs can be specified to influence	121	A relatively large number of key/value pairs can be specified to influence
…		…
101		126
102	=item Pool Management	127	=item Pool Management
103		128
104	The pool consists of a certain number of worker processes. These options	129	The pool consists of a certain number of worker processes. These options
105	decide how many of these processes exist and when they are started and	130	decide how many of these processes exist and when they are started and
106	stopp.ed	131	stopped.
		132
		133	The worker pool is dynamically resized, according to (perceived :)
		134	load. The minimum size is given by the C<idle> parameter and the maximum
		135	size is given by the C<max> parameter. A new worker is started every
		136	C<start> seconds at most, and an idle worker is stopped at most every
		137	C<stop> seconds.
		138
		139	You can specify the amount of jobs sent to a worker concurrently using the
		140	C<load> parameter.
107		141
108	=over 4	142	=over 4
109		143
110	=item min => $count (default: 0)	144	=item idle => $count (default: 0)
111		145
112	The minimum number of processes in the pool, in addition to the template	146	The minimum amount of idle processes in the pool - when there are fewer
113	process. Even when idle, there will never be fewer than this number of	147	than this many idle workers, C<AnyEvent::Fork::Pool> will try to start new
114	worker processes. The default means that the pool can be empty.	148	ones, subject to the limits set by C<max> and C<start>.
		149
		150	This is also the initial amount of workers in the pool. The default of
		151	zero means that the pool starts empty and can shrink back to zero workers
		152	over time.
115		153
116	=item max => $count (default: 4)	154	=item max => $count (default: 4)
117		155
118	The maximum number of processes in the pool, in addition to the template	156	The maximum number of processes in the pool, in addition to the template
119	process. C<AnyEvent::Fork::Pool> will never create more than this number	157	process. C<AnyEvent::Fork::Pool> will never have more than this number of
120	of processes.	158	worker processes, although there can be more temporarily when a worker is
		159	shut down and hasn't exited yet.
121		160
122	=item max_queue => $count (default: 2)	161	=item load => $count (default: 2)
123		162
124	The maximum number of jobs sent to a single worker process. Worker	163	The maximum number of concurrent jobs sent to a single worker process.
125	processes that handle this number of jobs already are called "busy".
126		164
127	Jobs that cannot be sent to a worker immediately (because all workers are	165	Jobs that cannot be sent to a worker immediately (because all workers are
128	busy) will be queued until a worker is available.	166	busy) will be queued until a worker is available.
129		167
		168	Setting this low improves latency. For example, at C<1>, every job that
		169	is sent to a worker is sent to a completely idle worker that doesn't run
		170	any other jobs. The downside is that throughput is reduced - a worker that
		171	finishes a job needs to wait for a new job from the parent.
		172
		173	The default of C<2> is usually a good compromise.
		174
130	=item min_delay => $seconds (default: 0)	175	=item start => $seconds (default: 0.1)
131		176
132	When a job is queued and all workers are busy, a timer is started. If the	177	When there are fewer than C<idle> workers (or all workers are completely
133	timer elapses and there are still jobs that cannot be queued to a worker,	178	busy), then a timer is started. If the timer elapses and there are still
134	a new worker is started.	179	jobs that cannot be queued to a worker, a new worker is started.
135		180
136	This configurs the time that all workers must be busy before a new worker	181	This sets the minimum time that all workers must be busy before a new
137	is started. Or, put differently, the minimum delay betwene starting new	182	worker is started. Or, put differently, the minimum delay between starting
138	workers.	183	new workers.
139		184
140	The delay is zero by default, which means new workers will be started	185	The delay is small by default, which means new workers will be started
141	without delay.	186	relatively quickly. A delay of C<0> is possible, and ensures that the pool
		187	will grow as quickly as possible under load.
142		188
143	=item min_idle => $count (default: 0)	189	Non-zero values are useful to avoid "exploding" a pool because a lot of
		190	jobs are queued in an instant.
144		191
145	The minimum number of idle workers - when they are less, more	192	Higher values are often useful to improve efficiency at the cost of
146	are started. The C<min_delay> is still respected though, and	193	latency - when fewer processes can do the job over time, starting more and
147	C<min_idle>/C<min_delay> and C<max_idle>/C<idle_time> are useful to	194	more is not necessarily going to help.
148	dynamically adjust the pool.
149		195
150	=item max_idle => $count (default: 1)
151
152	The maximum number of idle workers. If a worker becomes idle and there are
153	already this many idle workers, it will be stopped immediately instead of
154	waiting for the idle timer to elapse.
155
156	=item idle_time => $seconds (default: 1)	196	=item stop => $seconds (default: 10)
157		197
158	When a worker has no jobs to execute it becomes idle. An idle worker that	198	When a worker has no jobs to execute it becomes idle. An idle worker that
159	hasn't executed a job within this amount of time will be stopped, unless	199	hasn't executed a job within this amount of time will be stopped, unless
160	the other parameters say otherwise.	200	the other parameters say otherwise.
161		201
		202	Setting this to a very high value means that workers stay around longer,
		203	even when they have nothing to do, which can be good as they don't have to
		204	be started on the next load spike again.
		205
		206	Setting this to a lower value can be useful to avoid memory or simply
		207	process table wastage.
		208
		209	Usually, setting this to a time longer than the time between load spikes
		210	is best - if you expect a lot of requests every minute and little work
		211	in between, setting this to longer than a minute avoids having to stop
		212	and start workers. On the other hand, you have to ask yourself if letting
		213	workers run idle is a good use of your resources. Try to find a good
		214	balance between resource usage of your workers and the time to start new
		215	workers - L<AnyEvent::Fork> itself is fast at
		216	creating workers while not using much memory for them, so most of the
		217	overhead is likely from your own code.
		218
162	=item on_destroy => $callback->() (default: none)	219	=item on_destroy => $callback->() (default: none)
163		220
164	When a pool object goes out of scope, it will still handle all outstanding	221	When a pool object goes out of scope, the outstanding requests are still
165	jobs. After that, it will destroy all workers (and also the template	222	handled till completion. Only after handling all jobs will the workers
166	process if it isn't referenced otherwise).	223	be destroyed (and also the template process if it isn't referenced
		224	otherwise).
		225
		226	To find out when a pool I<really> has finished its work, you can set this
		227	callback, which will be called when the pool has been destroyed.
167		228
168	=back	229	=back
169		230
170	=item Template Process	231	=item AnyEvent::Fork::RPC Parameters
171		232
172	The worker processes are all forked from a single template	233	These parameters are all passed more or less directly to
173	process. Ideally, all modules and all cdoe used by the worker, as well as	234	L<AnyEvent::Fork::RPC>. They are only briefly mentioned here, for
174	any shared data structures should be loaded into the template process, to	235	their full documentation please refer to the L<AnyEvent::Fork::RPC>
175	take advantage of data sharing via fork.	236	documentation. Also, the default values mentioned here are only documented
176		237	as a best effort - the L<AnyEvent::Fork::RPC> documentation is binding.
177	You can create your own template process by creating a L<AnyEvent::Fork>
178	object yourself and passing it as the C<template> parameter, but
179	C<AnyEvent::Fork::Pool> can create one for you, including some standard
180	options.
181		238
182	=over 4	239	=over 4
183		240
184	=item template => $fork (default: C<< AnyEvent::Fork->new >>)	241	=item async => $boolean (default: 0)
185		242
186	The template process to use, if you want to create your own.	243	Whether to use the synchronous or asynchronous RPC backend.
187		244
188	=item require => \@modules (default: C<[]>)	245	=item on_error => $callback->($message) (default: die with message)
189		246
190	The modules in this list will be laoded into the template process.	247	The callback to call on any (fatal) errors.
191		248
192	=item eval => "# perl code to execute in template" (default: none)	249	=item on_event => $callback->(...) (default: C<sub { }>, unlike L<AnyEvent::Fork::RPC>)
193		250
194	This is a perl string that is evaluated after creating the template	251	The callback to invoke on events.
195	process and after requiring the modules. It can do whatever it wants to	252
196	configure the process, but it must not do anything that would keep a later	253	=item init => $initfunction (default: none)
197	fork from working (so must not create event handlers or (real) threads for	254
198	example).	255	The function to call in the child, once before handling requests.
		256
		257	=item serialiser => $serialiser (default: $AnyEvent::Fork::RPC::STRING_SERIALISER)
		258
		259	The serialiser to use.
199		260
200	=back	261	=back
201		262
202	=item AnyEvent::Fork::RPC Parameters
203
204	These parameters are all passed directly to L<AnyEvent::Fork::RPC>. They
205	are only briefly mentioned here, for their full documentation
206	please refer to the L<AnyEvent::Fork::RPC> documentation. Also, the
207	default values mentioned here are only documented as a best effort -
208	L<AnyEvent::Fork::RPC> documentation is binding.
209
210	=over 4
211
212	=item async => $boolean (default: 0)
213
214	Whether to sue the synchronous or asynchronous RPC backend.
215
216	=item on_error => $callback->($message) (default: die with message)
217
218	The callback to call on any (fatal) errors.
219
220	=item on_event => $callback->(...) (default: C<sub { }>, unlike L<AnyEvent::Fork::RPC>)
221
222	The callback to invoke on events.
223
224	=item init => $initfunction (default: none)
225
226	The function to call in the child, once before handling requests.
227
228	=item serialiser => $serialiser (defailt: $AnyEvent::Fork::RPC::STRING_SERIALISER)
229
230	The serialiser to use.
231
232	=back	263	=back
233		264
234	=back
235
236	=cut	265	=cut
237		266
238	sub new {	267	sub run {
239	my ($class, $function, %arg) = @_;	268	my ($template, $function, %arg) = @_;
240		269
241	my $self = bless {	270	my $max = $arg{max} || 4;
242	min => 0,	271	my $idle = $arg{idle} || 0,
243	max => 4,	272	my $load = $arg{load} || 2,
244	max_queue => 2,	273	my $start = $arg{start} || 0.1,
245	min_delay => 0,	274	my $stop = $arg{stop} || 10,
246	max_idle => 1,	275	my $on_event = $arg{on_event} || sub { },
247	idle_time => 1,	276	my $on_destroy = $arg{on_destroy};
248	on_event => sub { },
249	%arg,
250	pool => [],
251	queue => [],
252	}, $class;
253		277
254	$self->{function} = $function;	278	my @rpc = (
		279	async => $arg{async},
		280	init => $arg{init},
		281	serialiser => delete $arg{serialiser},
		282	on_error => $arg{on_error},
		283	);
255		284
256	($self->{template} ||= new AnyEvent::Fork)	285	my (@pool, @queue, $nidle, $start_w, $stop_w, $shutdown);
		286	my ($start_worker, $stop_worker, $want_start, $want_stop, $scheduler);
		287
		288	my $destroy_guard = Guard::guard {
		289	$on_destroy->()
		290	if $on_destroy;
		291	};
		292
		293	$template
257	->require ("AnyEvent::Fork::RPC::" . ($self->{async} ? "Async" : "Sync"))	294	->require ("AnyEvent::Fork::RPC::" . ($arg{async} ? "Async" : "Sync"))
258	->require (@{ delete $self->{require} })
259	->eval ('	295	->eval ('
260	my ($magic0, $magic2) = @_;	296	my ($magic0, $magic1) = @_;
261	sub AnyEvent::Fork::Pool::quit() {	297	sub AnyEvent::Fork::Pool::retire() {
262	AnyEvent::Fork::RPC::on_event $magic0, "quit", $magic2;	298	AnyEvent::Fork::RPC::event $magic0, "quit", $magic1;
263	}	299	}
264	', $magic0, $magic2)	300	', $magic0, $magic1)
265	->eval (delete $self->{eval});	301	;
266		302
267	$self->start	303	$start_worker = sub {
268	while @{ $self->{pool} } < $self->{min};	304	my $proc = [0, 0, undef]; # load, index, rpc
269		305
270	$self	306	$proc->[2] = $template
		307	->fork
		308	->AnyEvent::Fork::RPC::run ($function,
		309	@rpc,
		310	on_event => sub {
		311	if (@_ == 3 && $_[0] eq $magic0 && $_[2] eq $magic1) {
		312	$destroy_guard if 0; # keep it alive
		313
		314	$_[1] eq "quit" and $stop_worker->($proc);
		315	return;
		316	}
		317
		318	&$on_event;
		319	},
		320	)
		321	;
		322
		323	++$nidle;
		324	Array::Heap::push_heap_idx @pool, $proc;
		325
		326	Scalar::Util::weaken $proc;
		327	};
		328
		329	$stop_worker = sub {
		330	my $proc = shift;
		331
		332	$proc->[0]
		333	or --$nidle;
		334
		335	Array::Heap::splice_heap_idx @pool, $proc->[1]
		336	if defined $proc->[1];
		337
		338	@$proc = 0; # tell others to leave it be
		339	};
		340
		341	$want_start = sub {
		342	undef $stop_w;
		343
		344	$start_w ||= AE::timer $start, $start, sub {
		345	if (($nidle < $idle || @queue) && @pool < $max) {
		346	$start_worker->();
		347	$scheduler->();
		348	} else {
		349	undef $start_w;
		350	}
		351	};
		352	};
		353
		354	$want_stop = sub {
		355	$stop_w ||= AE::timer $stop, $stop, sub {
		356	$stop_worker->($pool[0])
		357	if $nidle;
		358
		359	undef $stop_w
		360	if $nidle <= $idle;
		361	};
		362	};
		363
		364	$scheduler = sub {
		365	if (@queue) {
		366	while (@queue) {
		367	@pool or $start_worker->();
		368
		369	my $proc = $pool[0];
		370
		371	if ($proc->[0] < $load) {
		372	# found free worker, increase load
		373	unless ($proc->[0]++) {
		374	# worker became busy
		375	--$nidle
		376	or undef $stop_w;
		377
		378	$want_start->()
		379	if $nidle < $idle && @pool < $max;
		380	}
		381
		382	Array::Heap::adjust_heap_idx @pool, 0;
		383
		384	my $job = shift @queue;
		385	my $ocb = pop @$job;
		386
		387	$proc->[2]->(@$job, sub {
		388	# reduce load
		389	--$proc->[0] # worker still busy?
		390	or ++$nidle > $idle # not too many idle processes?
		391	or $want_stop->();
		392
		393	Array::Heap::adjust_heap_idx @pool, $proc->[1]
		394	if defined $proc->[1];
		395
		396	&$ocb;
		397
		398	$scheduler->();
		399	});
		400	} else {
		401	$want_start->()
		402	unless @pool >= $max;
		403
		404	last;
		405	}
		406	}
		407	} elsif ($shutdown) {
		408	@pool = ();
		409	undef $start_w;
		410	undef $start_worker; # frees $destroy_guard reference
		411
		412	$stop_worker->($pool[0])
		413	while $nidle;
		414	}
		415	};
		416
		417	my $shutdown_guard = Guard::guard {
		418	$shutdown = 1;
		419	$scheduler->();
		420	};
		421
		422	$start_worker->()
		423	while @pool < $idle;
		424
		425	sub {
		426	$shutdown_guard if 0; # keep it alive
		427
		428	$start_worker->()
		429	unless @pool;
		430
		431	push @queue, [@_];
		432	$scheduler->();
		433	}
271	}	434	}
272		435
273	sub start {
274	my ($self) = @_;
275
276	warn "start\n";#d#
277
278	Scalar::Util::weaken $self;
279
280	my $proc = [0, undef, undef];
281
282	$proc->[1] = $self->{template}
283	->fork
284	->AnyEvent::Fork::RPC::run ($self->{function},
285	async => $self->{async},
286	init => $self->{init},
287	serialiser => $self->{serialiser},
288	on_error => $self->{on_error},
289	on_event => sub {
290	if (@_ == 3 && $_[0] eq $magic0 && $_[2] eq $magic2) {
291	if ($_[1] eq "quit") {
292	my $pool = $self->{pool};
293	for (0 .. $#$pool) {
294	if ($pool->[$_] == $proc) {
295	Array::Heap::splice_heap @$pool, $_;
296	return;
297	}
298	}
299	die;
300	}
301	return;
302	}
303
304	&{ $self->{on_event} };
305	},
306	)
307	;
308
309	++$self->{idle};
310	Array::Heap::push_heap @{ $self->{pool} }, $proc;
311	}
312
313	=item $pool->call (..., $cb->(...))	436	=item $pool->(..., $cb->(...))
314		437
315	Call the RPC function of a worker with the given arguments, and when the	438	Call the RPC function of a worker with the given arguments, and when the
316	worker is done, call the C<$cb> with the results, like just calling the	439	worker is done, call the C<$cb> with the results, just like calling the
317	L<AnyEvent::Fork::RPC> object directly.	440	RPC object directly - see the L<AnyEvent::Fork::RPC> documentation for
		441	details on the RPC API.
318		442
319	If there is no free worker, the call will be queued.	443	If there is no free worker, the call will be queued until a worker becomes
		444	available.
320		445
321	Note that there can be considerable time between calling this method and	446	Note that there can be considerable time between calling this method and
322	the call actually being executed. During this time, the parameters passed	447	the call actually being executed. During this time, the parameters passed
323	to this function are effectively read-only - modifying them after the call	448	to this function are effectively read-only - modifying them after the call
324	and before the callback is invoked causes undefined behaviour.	449	and before the callback is invoked causes undefined behaviour.
325		450
326	=cut	451	=cut
327		452
328	sub scheduler {	453	=item $cpus = AnyEvent::Fork::Pool::ncpu [$default_cpus]
329	my $self = shift;
330		454
331	my $pool = $self->{pool};	455	=item ($cpus, $eus) = AnyEvent::Fork::Pool::ncpu [$default_cpus]
332	my $queue = $self->{queue};
333		456
334	$self->start	457	Tries to detect the number of CPUs (C<$cpus> often called cpu cores
335	unless @$pool;	458	nowadays) and execution units (C<$eus>, which include e.g. extra
		459	hyperthreaded units). When C<$cpus> cannot be determined reliably,
		460	C<$default_cpus> is returned for both values, or C<1> if it is missing.
336		461
337	while (@$queue) {	462	For normal CPU bound uses, it is wise to have as many worker processes
338	my $proc = $pool->[0];	463	as CPUs in the system (C<$cpus>), if nothing else uses the CPU. Using
		464	hyperthreading is usually detrimental to performance, but in those rare
		465	cases where that really helps it might be beneficial to use more workers
		466	(C<$eus>).
339		467
340	if ($proc->[0] < $self->{max_queue}) {	468	Currently, F</proc/cpuinfo> is parsed on GNU/Linux systems for both
341	warn "free $proc $proc->[0]\n";#d#	469	C<$cpus> and C<$eus>, and on {Free,Net,Open}BSD, F<sysctl -n hw.ncpu> is
342	# found free worker	470	used for C<$cpus>.
343	--$self->{idle}
344	unless $proc->[0]++;
345		471
346	undef $proc->[2];	472	Example: create a worker pool with as many workers as cpu cores, or C<2>,
		473	if the actual number could not be determined.
347		474
348	Array::Heap::adjust_heap @$pool, 0;	475	$fork->AnyEvent::Fork::Pool::run ("myworker::function",
		476	max => (scalar AnyEvent::Fork::Pool::ncpu 2),
		477	);
349		478
350	my $job = shift @$queue;	479	=cut
351	my $ocb = pop @$job;
352		480
353	$proc->[1]->(@$job, sub {	481	BEGIN {
354	for (0 .. $#$pool) {	482	if ($^O eq "linux") {
355	if ($pool->[$_] == $proc) {	483	*ncpu = sub(;$) {
356	# reduce queue counter	484	my ($cpus, $eus);
357	unless (--$pool->[$_][0]) {
358	# worker becomes idle
359	my $to = ++$self->{idle} > $self->{max_idle}
360	? 0
361	: $self->{idle_time};
362		485
363	$proc->[2] = AE::timer $to, 0, sub {	486	if (open my $fh, "<", "/proc/cpuinfo") {
364	undef $proc->[2];	487	my %id;
365		488
366	warn "destroy $proc afzer $to\n";#d#	489	while (<$fh>) {
367		490	if (/^core id\s*:\s*(\d+)/) {
368	for (0 .. $#$pool) {
369	if ($pool->[$_] == $proc) {
370	Array::Heap::splice_heap @$pool, $_;
371	--$self->{idle};
372	last;
373	}
374	}
375	};
376	}
377
378	Array::Heap::adjust_heap @$pool, $_;
379	last;	491	++$eus;
		492	undef $id{$1};
380	}	493	}
381	}	494	}
382	&$ocb;	495
383	});	496	$cpus = scalar keys %id;
384	} else {	497	} else {
385	warn "busy $proc->[0]\n";#d#	498	$cpus = $eus = @_ ? shift : 1;
386	# all busy, delay
387
388	$self->{min_delay_w} ||= AE::timer $self->{min_delay}, 0, sub {
389	delete $self->{min_delay_w};
390
391	if (@{ $self->{queue} }) {
392	$self->start;
393	$self->scheduler;
394	}
395	};	499	}
396	last;	500	wantarray ? ($cpus, $eus) : $cpus
397	}	501	};
		502	} elsif ($^O eq "freebsd" || $^O eq "netbsd" || $^O eq "openbsd") {
		503	*ncpu = sub(;$) {
		504	my $cpus = qx<sysctl -n hw.ncpu> * 1
		505	|| (@_ ? shift : 1);
		506	wantarray ? ($cpus, $cpus) : $cpus
		507	};
		508	} else {
		509	*ncpu = sub(;$) {
		510	my $cpus = @_ ? shift : 1;
		511	wantarray ? ($cpus, $cpus) : $cpus
		512	};
398	}	513	}
399	warn "last\n";#d#
400	}	514	}
401		515
402	sub call {
403	my $self = shift;
404
405	push @{ $self->{queue} }, [@_];
406	$self->scheduler;
407	}
408
409	sub DESTROY {
410	$_[0]{on_destroy}->();
411	}
412
413	=back	516	=back
		517
		518	=head1 CHILD USAGE
		519
		520	In addition to the L<AnyEvent::Fork::RPC> API, this module implements one
		521	more child-side function:
		522
		523	=over 4
		524
		525	=item AnyEvent::Fork::Pool::retire ()
		526
		527	This function sends an event to the parent process to request retirement:
		528	the worker is removed from the pool and no new jobs will be sent to it,
		529	but it has to handle the jobs that are already queued.
		530
		531	The parentheses are part of the syntax: the function usually isn't defined
		532	when you compile your code (because that happens I<before> handing the
		533	template process over to C<AnyEvent::Fork::Pool::run>), so you need the
		534	empty parentheses to tell Perl that the function is indeed a function.
		535
		536	Retiring a worker can be useful to gracefully shut it down when the worker
		537	deems this useful. For example, after executing a job, one could check
		538	the process size or the number of jobs handled so far, and if either is
		539	too high, the worker could ask to get retired, to keep memory leaks from
		540	accumulating.
		541
		542	Example: retire a worker after it has handled roughly 100 requests.
		543
		544	my $count = 0;
		545
		546	sub my::worker {
		547
		548	++$count == 100
		549	and AnyEvent::Fork::Pool::retire ();
		550
		551	... normal code goes here
		552	}
		553
		554	=back
		555
		556	=head1 POOL PARAMETERS RECIPES
		557
		558	This section describes some recipes for pool parameters. These are mostly
		559	meant for the synchronous RPC backend, as the asynchronous RPC backend
		560	changes the rules considerably, making workers themselves responsible for
		561	their scheduling.
		562
		563	=over 4
		564
		565	=item low latency - set load = 1
		566
		567	If you need a deterministic low latency, you should set the C<load>
		568	parameter to C<1>. This ensures that never more than one job is sent to
		569	each worker. This avoids having to wait for a previous job to finish.
		570
		571	This makes most sense with the synchronous (default) backend, as the
		572	asynchronous backend can handle multiple requests concurrently.
		573
		574	=item lowest latency - set load = 1 and idle = max
		575
		576	To achieve the lowest latency, you additionally should disable any dynamic
		577	resizing of the pool by setting C<idle> to the same value as C<max>.
		578
		579	=item high throughput, cpu bound jobs - set load >= 2, max = #cpus
		580
		581	To get high throughput with cpu-bound jobs, you should set the maximum
		582	pool size to the number of cpus in your system, and C<load> to at least
		583	C<2>, to make sure there can be another job waiting for the worker when it
		584	has finished one.
		585
		586	The value of C<2> for C<load> is the minimum value that I<can> achieve
		587	100% throughput, but if your parent process itself is sometimes busy, you
		588	might need higher values. Also there is a limit on the amount of data that
		589	can be "in flight" to the worker, so if you send big blobs of data to your
		590	worker, C<load> might have much less of an effect.
		591
		592	=item high throughput, I/O bound jobs - set load >= 2, max = 1, or very high
		593
		594	When your jobs are I/O bound, using more workers usually boils down to
		595	higher throughput, depending very much on your actual workload - sometimes
		596	having only one worker is best, for example, when you read or write big
		597	files at maximum speed, as a second worker will increase seek times.
		598
		599	=back
		600
		601	=head1 EXCEPTIONS
		602
		603	The same "policy" as with L<AnyEvent::Fork::RPC> applies - exceptions will
		604	not be caught, and exceptions in both worker and in callbacks cause
		605	undesirable or undefined behaviour.
414		606
415	=head1 SEE ALSO	607	=head1 SEE ALSO
416		608
417	L<AnyEvent::Fork>, to create the processes in the first place.	609	L<AnyEvent::Fork>, to create the processes in the first place.
418		610
