1 | NAME |
1 | NAME |
2 | AnyEvent::Fork::Pool - simple process pool manager on top of |
2 | AnyEvent::Fork::Pool - simple process pool manager on top of |
3 | AnyEvent::Fork |
3 | AnyEvent::Fork |
4 | |
4 | |
5 | THE API IS NOT FINISHED, CONSIDER THIS AN ALPHA RELEASE |
|
|
6 | |
|
|
7 | SYNOPSIS |
5 | SYNOPSIS |
8 | use AnyEvent; |
6 | use AnyEvent; |
|
|
7 | use AnyEvent::Fork; |
9 | use AnyEvent::Fork::Pool; |
8 | use AnyEvent::Fork::Pool; |
10 | # use AnyEvent::Fork is not needed |
|
|
11 | |
9 | |
12 | # all possible parameters shown, with default values |
10 | # all possible parameters shown, with default values |
13 | my $pool = AnyEvent::Fork |
11 | my $pool = AnyEvent::Fork |
14 | ->new |
12 | ->new |
15 | ->require ("MyWorker") |
13 | ->require ("MyWorker") |
… | |
… | |
41 | undef $pool; |
39 | undef $pool; |
42 | |
40 | |
43 | $finish->recv; |
41 | $finish->recv; |
44 | |
42 | |
45 | DESCRIPTION |
43 | DESCRIPTION |
46 | This module uses processes created via AnyEvent::Fork and the RPC |
44 | This module uses processes created via AnyEvent::Fork (or |
47 | protocol implemented in AnyEvent::Fork::RPC to create a load-balanced pool |
45 | AnyEvent::Fork::Remote) and the RPC protocol implemented in |
48 | of processes that handles jobs. |
46 | AnyEvent::Fork::RPC to create a load-balanced pool of processes that |
|
|
47 | handles jobs. |
49 | |
48 | |
50 | Understanding of AnyEvent::Fork is helpful but not critical to be able |
49 | Understanding AnyEvent::Fork is helpful but not required to use this |
51 | to use this module, but a thorough understanding of AnyEvent::Fork::RPC |
50 | module, but a thorough understanding of AnyEvent::Fork::RPC is, as it |
52 | is, as it defines the actual API that needs to be implemented in the |
51 | defines the actual API that needs to be implemented in the worker |
53 | worker processes. |
52 | processes. |
54 | |
53 | |
55 | EXAMPLES |
|
|
56 | PARENT USAGE |
54 | PARENT USAGE |
57 | To create a pool, you first have to create an AnyEvent::Fork object - |
55 | To create a pool, you first have to create an AnyEvent::Fork object - |
58 | this object becomes your template process. Whenever a new worker process |
56 | this object becomes your template process. Whenever a new worker process |
59 | is needed, it is forked from this template process. Then you need to |
57 | is needed, it is forked from this template process. Then you need to |
60 | "hand off" this template process to the "AnyEvent::Fork::Pool" module by |
58 | "hand off" this template process to the "AnyEvent::Fork::Pool" module by |
… | |
… | |
235 | and the call actually being executed. During this time, the |
233 | and the call actually being executed. During this time, the |
236 | parameters passed to this function are effectively read-only - |
234 | parameters passed to this function are effectively read-only - |
237 | modifying them after the call and before the callback is invoked |
235 | modifying them after the call and before the callback is invoked |
238 | causes undefined behaviour. |
236 | causes undefined behaviour. |
239 | |
237 | |
|
|
238 | $cpus = AnyEvent::Fork::Pool::ncpu [$default_cpus] |
|
|
239 | ($cpus, $eus) = AnyEvent::Fork::Pool::ncpu [$default_cpus] |
|
|
240 | Tries to detect the number of CPUs ($cpus, often called CPU cores |
|
|
241 | nowadays) and execution units ($eus, which include e.g. extra |
|
|
242 | hyperthreaded units). When $cpus cannot be determined reliably, |
|
|
243 | $default_cpus is returned for both values, or 1 if it is missing. |
|
|
244 | |
|
|
245 | For normal CPU bound uses, it is wise to have as many worker |
|
|
246 | processes as CPUs in the system ($cpus), if nothing else uses the |
|
|
247 | CPU. Using hyperthreading is usually detrimental to performance, but |
|
|
248 | in those rare cases where that really helps it might be beneficial |
|
|
249 | to use more workers ($eus). |
|
|
250 | |
|
|
251 | Currently, /proc/cpuinfo is parsed on GNU/Linux systems for both |
|
|
252 | $cpus and $eus, and on {Free,Net,Open}BSD, sysctl -n hw.ncpu is used |
|
|
253 | for $cpus. |
|
|
254 | |
|
|
255 | Example: create a worker pool with as many workers as CPU cores, or |
|
|
256 | 2, if the actual number could not be determined. |
|
|
257 | |
|
|
258 | $fork->AnyEvent::Fork::Pool::run ("myworker::function", |
|
|
259 | max => (scalar AnyEvent::Fork::Pool::ncpu 2), |
|
|
260 | ); |
|
|
261 | |
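The fallback rule described for ncpu can be illustrated with a small, self-contained sketch. This is not the module's actual implementation: the helper name count_cpus_from_cpuinfo is hypothetical, and the assumption that cores are identified by the "physical id" and "core id" fields of /proc/cpuinfo is ours. It only shows how $cpus and $eus might differ, and how the default applies to both values when cores cannot be determined.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of AnyEvent::Fork::Pool: derive ($cpus, $eus)
# from text in /proc/cpuinfo format. Assumes each logical processor is one
# blank-line-separated block with "processor", "physical id" and "core id".
sub count_cpus_from_cpuinfo {
    my ($text, $default) = @_;
    $default = 1 unless defined $default;    # "or 1 if it is missing"

    my %cores;      # distinct (physical id, core id) pairs seen so far
    my $eus = 0;    # every "processor" block counts as one execution unit

    for my $block (split /\n\s*\n/, $text) {
        next unless $block =~ /^processor\s*:\s*\d+/m;
        ++$eus;

        my ($phys) = $block =~ /^physical id\s*:\s*(\d+)/m;
        my ($core) = $block =~ /^core id\s*:\s*(\d+)/m;
        $cores{"$phys/$core"} = 1 if defined $phys && defined $core;
    }

    my $cpus = keys %cores;

    # when cores cannot be determined, return the default for both values
    return ($default, $default) unless $cpus;
    return ($cpus, $eus);
}

# Made-up sample: two cores, each with two hyperthreaded units.
my $sample = join "\n\n", map {
    "processor\t: $_->[0]\nphysical id\t: 0\ncore id\t\t: $_->[1]"
} [0, 0], [1, 1], [2, 0], [3, 1];

my ($cpus, $eus) = count_cpus_from_cpuinfo ($sample, 2);
print "cpus=$cpus eus=$eus\n";
```

With values like these, a CPU-bound pool would normally pass $cpus as max, and $eus only in the rare cases where hyperthreading helps.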
240 | CHILD USAGE |
262 | CHILD USAGE |
241 | In addition to the AnyEvent::Fork::RPC API, this module implements one |
263 | In addition to the AnyEvent::Fork::RPC API, this module implements one |
242 | more child-side function: |
264 | more child-side function: |
243 | |
265 | |
244 | AnyEvent::Fork::Pool::retire () |
266 | AnyEvent::Fork::Pool::retire () |
245 | This function sends an event to the parent process to request |
267 | This function sends an event to the parent process to request |
246 | retirement: the worker is removed from the pool and no new jobs will |
268 | retirement: the worker is removed from the pool and no new jobs will |
247 | be sent to it, but it has to handle the jobs that are already |
269 | be sent to it, but it still has to handle the jobs that are already |
248 | queued. |
270 | queued. |
249 | |
271 | |
250 | The parentheses are part of the syntax: the function usually isn't |
272 | The parentheses are part of the syntax: the function usually isn't |
251 | defined when you compile your code (because that happens *before* |
273 | defined when you compile your code (because that happens *before* |
252 | handing the template process over to "AnyEvent::Fork::Pool::run"), so |
274 | handing the template process over to "AnyEvent::Fork::Pool::run"), so |
253 | you need the empty parentheses to tell Perl that the function is |
275 | you need the empty parentheses to tell Perl that the function is |
254 | indeed a function. |
276 | indeed a function. |
255 | |
277 | |
256 | Retiring a worker can be useful to gracefully shut it down when the |
278 | Retiring a worker can be useful to gracefully shut it down when the |
257 | worker deems this useful. For example, after executing a job, one |
279 | worker deems this useful. For example, after executing a job, it |
258 | could check the process size or the number of jobs handled so far, |
280 | could check the process size or the number of jobs handled so far, |
259 | and if either is too high, the worker could ask to get retired, to |
281 | and if either is too high, the worker could request to be retired, |
260 | keep leaked memory from accumulating. |
282 | to keep leaked memory from accumulating. |
|
|
283 | |
|
|
284 | Example: retire a worker after it has handled roughly 100 requests. |
|
|
285 | It doesn't matter whether you retire at the beginning or end of your |
|
|
286 | request, as the worker will continue to handle some outstanding |
|
|
287 | requests. Likewise, it's ok to call retire multiple times. |
|
|
288 | |
|
|
289 | my $count = 0; |
|
|
290 | |
|
|
291 | sub my::worker { |
|
|
292 | |
|
|
293 | ++$count == 100 |
|
|
294 | and AnyEvent::Fork::Pool::retire (); |
|
|
295 | |
|
|
296 | ... normal code goes here |
|
|
297 | } |
261 | |
298 | |
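The retirement criteria mentioned above (process size or number of jobs handled) can be kept in a small predicate that the worker consults after each job. The following is a minimal sketch under stated assumptions: the name should_retire and both limits are hypothetical, and how the worker measures its own size is left to the caller, since that is platform specific.

```perl
use strict;
use warnings;

# Hypothetical limits, chosen only for illustration.
my %LIMIT = (
    jobs   => 100,           # retire after this many handled jobs
    rss_kb => 256 * 1024,    # or once the process exceeds 256 MB
);

# Returns 1 when either limit is reached, 0 otherwise.
sub should_retire {
    my ($jobs_done, $rss_kb) = @_;

    return $jobs_done >= $LIMIT{jobs} || $rss_kb >= $LIMIT{rss_kb} ? 1 : 0;
}

print should_retire (10, 50_000) ? "retire\n" : "keep going\n";
```

Inside a real worker function, the result would guard the call, for example should_retire (++$count, $size) and AnyEvent::Fork::Pool::retire (); where $size is whatever size measurement the worker has available. As calling retire more than once is harmless, the predicate does not need to remember that it already fired.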
262 | POOL PARAMETERS RECIPES |
299 | POOL PARAMETERS RECIPES |
263 | This section describes some recipes for pool paramaters. These are |
300 | This section describes some recipes for pool parameters. These are |
264 | mostly meant for the synchronous RPC backend, as the asynchronous RPC |
301 | mostly meant for the synchronous RPC backend, as the asynchronous RPC |
265 | backend changes the rules considerably, making workers themselves |
302 | backend changes the rules considerably, making workers themselves |
266 | responsible for their scheduling. |
303 | responsible for their scheduling. |
267 | |
304 | |
268 | low latency - set load = 1 |
305 | low latency - set load = 1 |
… | |
… | |
294 | |
331 | |
295 | high throughput, I/O bound jobs - set load >= 2, max = 1, or very high |
332 | high throughput, I/O bound jobs - set load >= 2, max = 1, or very high |
296 | When your jobs are I/O bound, using more workers usually boils down |
333 | When your jobs are I/O bound, using more workers usually boils down |
297 | to higher throughput, depending very much on your actual workload - |
334 | to higher throughput, depending very much on your actual workload - |
298 | sometimes having only one worker is best, for example, when you read |
335 | sometimes having only one worker is best, for example, when you read |
299 | or write big files at maixmum speed, as a second worker will |
336 | or write big files at maximum speed, as a second worker will |
300 | increase seek times. |
337 | increase seek times. |
301 | |
338 | |
302 | EXCEPTIONS |
339 | EXCEPTIONS |
303 | The same "policy" as with AnyEvent::Fork::RPC applies - exceptins will |
340 | The same "policy" as with AnyEvent::Fork::RPC applies - exceptions will |
304 | not be caught, and exceptions in both workers and callbacks cause |
341 | not be caught, and exceptions in both workers and callbacks cause |
305 | undesirable or undefined behaviour. |
342 | undesirable or undefined behaviour. |
306 | |
343 | |
307 | SEE ALSO |
344 | SEE ALSO |
308 | AnyEvent::Fork, to create the processes in the first place. |
345 | AnyEvent::Fork, to create the processes in the first place. |
|
|
346 | |
|
|
347 | AnyEvent::Fork::Remote, likewise, but helpful for remote processes. |
309 | |
348 | |
310 | AnyEvent::Fork::RPC, which implements the RPC protocol and API. |
349 | AnyEvent::Fork::RPC, which implements the RPC protocol and API. |
311 | |
350 | |
312 | AUTHOR AND CONTACT INFORMATION |
351 | AUTHOR AND CONTACT INFORMATION |
313 | Marc Lehmann <schmorp@schmorp.de> |
352 | Marc Lehmann <schmorp@schmorp.de> |