/cvs/AnyEvent-Fork-RPC/RPC.pm
Revision: 1.11
Committed: Thu Apr 18 07:59:46 2013 UTC (11 years, 1 month ago) by root
Branch: MAIN
Changes since 1.10: +111 -2 lines
Log Message:
*** empty log message ***

File Contents

=head1 NAME

AnyEvent::Fork::RPC - simple RPC extension for AnyEvent::Fork

=head1 SYNOPSIS

   use AnyEvent::Fork::RPC;
   # use AnyEvent::Fork is not needed

   my $rpc = AnyEvent::Fork
      ->new
      ->require ("MyModule")
      ->AnyEvent::Fork::RPC::run (
         "MyModule::server",
      );

   my $cv = AE::cv;

   $rpc->(1, 2, 3, sub {
      print "MyModule::server returned @_\n";
      $cv->send;
   });

   $cv->recv;

=head1 DESCRIPTION

This module implements a simple RPC protocol and backend for processes
created via L<AnyEvent::Fork>, allowing you to call a function in the
child process and receive its return values (up to 4GB serialised).

It implements two different backends: a synchronous one that works like a
normal function call, and an asynchronous one that can run multiple jobs
concurrently in the child, using AnyEvent.

It also implements an asynchronous event mechanism from the child to the
parent, that could be used for progress indications or other information.

Loading this module also always loads L<AnyEvent::Fork>, so you can make a
separate C<use AnyEvent::Fork> if you wish, but you don't have to.

=head1 EXAMPLES

=head2 Example 1: Synchronous Backend

Here is a simple example that implements a backend that executes C<unlink>
and C<rmdir> calls, and reports their status back. It also reports the
number of requests it has processed every three requests, which is clearly
silly, but illustrates the use of events.

First the parent process:

   use AnyEvent;
   use AnyEvent::Fork::RPC;

   my $done = AE::cv;

   my $rpc = AnyEvent::Fork
      ->new
      ->require ("MyWorker")
      ->AnyEvent::Fork::RPC::run ("MyWorker::run",
         on_error   => sub { warn "FATAL: $_[0]"; exit 1 },
         on_event   => sub { warn "$_[0] requests handled\n" },
         on_destroy => $done,
      );

   for my $id (1..6) {
      $rpc->(rmdir => "/tmp/somepath/$id", sub {
         $_[0]
            or warn "/tmp/somepath/$id: $_[1]\n";
      });
   }

   undef $rpc;

   $done->recv;

The parent creates the process and queues a few C<rmdir> requests. It then
forgets about the C<$rpc> object, so that the child exits after it has
handled the requests, and then it waits until the requests have been
handled.

The child is implemented using a separate module, C<MyWorker>, shown here:

   package MyWorker;

   my $count;

   sub run {
      my ($cmd, $path) = @_;

      AnyEvent::Fork::RPC::event ($count)
         unless ++$count % 3;

      my $status = $cmd eq "rmdir"  ? rmdir $path
                 : $cmd eq "unlink" ? unlink $path
                 : die "fatal error, illegal command '$cmd'";

      $status or (0, "$!")
   }

   1

The C<run> function first sends a "progress" event every three calls, and
then executes C<rmdir> or C<unlink>, depending on the first parameter (or
dies with a fatal error - obviously, you must never let this happen :).

Eventually it returns the status value true if the command was successful,
or the status value 0 and the stringified error message.
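
The C<$status or (0, "$!")> idiom relies on the function being called in
list context. The following is a minimal standalone sketch that needs no
child process; C<try_rmdir> is a hypothetical helper written for this
illustration and is not part of C<MyWorker>:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MyWorker): runs rmdir and returns its
# result using the same '$status or (0, "$!")' idiom as MyWorker::run.
sub try_rmdir {
   my ($path) = @_;

   my $status = rmdir $path;

   # in list context: ($status) on success, (0, "error text") on failure
   $status or (0, "$!")
}

# a path that is assumed not to exist, so rmdir is expected to fail
my @result = try_rmdir "/tmp/somepath/does-not-exist-$$";

if ($result[0]) {
   print "removed\n";
} else {
   print "failed: $result[1]\n";
}
```

On success the caller sees a single true value, on failure it sees two
values, which is what the parent callback in the example tests via C<$_[0]>
and C<$_[1]>.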

On my system, running the first code fragment with the given
F<MyWorker.pm> in the current directory yields:

   /tmp/somepath/1: No such file or directory
   /tmp/somepath/2: No such file or directory
   3 requests handled
   /tmp/somepath/3: No such file or directory
   /tmp/somepath/4: No such file or directory
   /tmp/somepath/5: No such file or directory
   6 requests handled
   /tmp/somepath/6: No such file or directory

Obviously, none of the directories I am trying to delete even exist. Also,
the events and responses are processed in exactly the same order as
they were created in the child, which is true for both synchronous and
asynchronous backends.

Note that the parentheses in the call to C<AnyEvent::Fork::RPC::event> are
not optional. That is because the function isn't defined when the code is
compiled. You can make sure it is visible by pre-loading the correct
backend module in the call to C<require>:

   ->require ("AnyEvent::Fork::RPC::Sync", "MyWorker")

Since the backend module declares the C<event> function, loading it first
ensures that perl will correctly interpret calls to it.
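
The parsing difference can be observed with core perl alone. The following
is a minimal sketch under stated assumptions: C<My::Demo::event> is a
hypothetical stand-in for C<AnyEvent::Fork::RPC::event>, and C<compiles> is
a helper invented for this illustration that only checks whether a piece
of source text compiles.

```perl
use strict;
use warnings;

# Returns true if the given source text compiles; nothing is executed,
# because the code is wrapped in an anonymous sub that is never called.
sub compiles {
   my ($code) = @_;

   local $@;
   local $SIG{__WARN__} = sub { }; # silence parser diagnostics
   defined eval "sub { $code }"
}

# My::Demo::event is a hypothetical stand-in for AnyEvent::Fork::RPC::event.

# not declared yet: the call only parses with parentheses
my $without_parens = compiles 'My::Demo::event "hello"';
my $with_parens    = compiles 'My::Demo::event ("hello")';

# a forward declaration, done at runtime so it does not affect the checks
# above; this is roughly what pre-loading the backend module provides
eval "sub My::Demo::event;";
my $after_declaration = compiles 'My::Demo::event "hello"';

printf "without parens: %s, with parens: %s, after declaration: %s\n",
   map { $_ ? "ok" : "syntax error" }
      $without_parens, $with_parens, $after_declaration;
```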

And as a final remark, there is a fine module on CPAN that can
asynchronously C<rmdir> and C<unlink> and a lot more, and more efficiently
than this example, namely L<IO::AIO>.

=head3 Example 1a: the same with the asynchronous backend

This example only shows what needs to be changed to use the async backend
instead. Doing this is not very useful; the purpose of this example is
to show the minimum amount of change that is required to go from the
synchronous to the asynchronous backend.

To use the async backend in the previous example, you need to add the
C<async> parameter to the C<AnyEvent::Fork::RPC::run> call:

   ->AnyEvent::Fork::RPC::run ("MyWorker::run",
      async => 1,
      ...

And since the function call protocol is now changed, you need to adapt
C<MyWorker::run> to the async API.

First, you need to accept the extra initial C<$done> callback:

   sub run {
      my ($done, $cmd, $path) = @_;

And since a response is now generated when C<$done> is called, as opposed
to when the function returns, we need to call the C<$done> function with
the status:

      $done->($status or (0, "$!"));

A few remarks are in order. First, it's quite pointless to use the async
backend for this example - but it I<is> possible. Second, you can call
C<$done> before or after returning from the function. Third, having both
returned from the function and having called the C<$done> callback, the
child process may exit at any time, so you should call C<$done> only when
you really I<are> done.
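
Putting the fragments together, the async variant of the worker module
might look like the sketch below. It is simply the C<MyWorker> module from
Example 1 with the two changes described above applied, not a separately
tested file from the distribution:

```perl
package MyWorker;

my $count;

sub run {
   # async API: the result callback is passed as the first argument
   my ($done, $cmd, $path) = @_;

   AnyEvent::Fork::RPC::event ($count)
      unless ++$count % 3;

   my $status = $cmd eq "rmdir"  ? rmdir $path
              : $cmd eq "unlink" ? unlink $path
              : die "fatal error, illegal command '$cmd'";

   # the response is generated by calling $done, not by returning
   $done->($status or (0, "$!"));
}

1;
```

Because C<$done> is called before C<run> returns, this version behaves
just like the synchronous one; the async backend only pays off when the
call to C<$done> is deferred to a later event callback.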

=head2 Example 2: Asynchronous Backend

This example implements multiple count-downs in the child, using
L<AnyEvent> timers. While this is a bit silly (one could use timers in the
parent just as well), it illustrates the ability to use AnyEvent in the
child and the fact that responses can arrive in a different order than the
requests.

It also shows how to embed the actual child code into a C<__DATA__>
section, so it doesn't need any external files at all.

And when your parent process is often busy, and you have stricter timing
requirements, then running timers in a child process suddenly doesn't look
so silly anymore.

Without further ado, here is the code:

   use AnyEvent;
   use AnyEvent::Fork::RPC;

   my $done = AE::cv;

   my $rpc = AnyEvent::Fork
      ->new
      ->require ("AnyEvent::Fork::RPC::Async")
      ->eval (do { local $/; <DATA> })
      ->AnyEvent::Fork::RPC::run ("run",
         async      => 1,
         on_error   => sub { warn "FATAL: $_[0]"; exit 1 },
         on_event   => sub { print $_[0] },
         on_destroy => $done,
      );

   for my $count (3, 2, 1) {
      $rpc->($count, sub {
         warn "job $count finished\n";
      });
   }

   undef $rpc;

   $done->recv;

   __DATA__

   # this ends up in main, as we don't use a package declaration

   use AnyEvent;

   sub run {
      my ($done, $count) = @_;

      my $n;

      AnyEvent::Fork::RPC::event "starting to count up to $count\n";

      my $w; $w = AE::timer 1, 1, sub {
         ++$n;

         AnyEvent::Fork::RPC::event "count $n of $count\n";

         if ($n == $count) {
            undef $w;
            $done->();
         }
      };
   }

The parent part (the one before the C<__DATA__> section) isn't very
different from the earlier examples. It sets async mode, preloads
the backend module (so the C<AnyEvent::Fork::RPC::event> function is
declared), uses a slightly different C<on_event> handler (which we use
simply for logging purposes) and then, instead of loading a module with
the actual worker code, it C<eval>'s the code from the data section in the
child process.

It then starts three countdowns, of 3, 2 and 1 seconds, destroys
the rpc object so the example finishes eventually, and then just waits for
the stuff to trickle in.

The worker code uses the event function to log some progress messages, but
mostly just creates a recurring one-second timer.

The timer callback increments a counter, logs a message, and eventually,
when the count has been reached, calls the C<$done> callback.

On my system, this results in the following output. Since all timers fire
at roughly the same time, the actual order isn't guaranteed, but the order
shown is very likely what you would get, too.

   starting to count up to 3
   starting to count up to 2
   starting to count up to 1
   count 1 of 3
   count 1 of 2
   count 1 of 1
   job 1 finished
   count 2 of 2
   job 2 finished
   count 2 of 3
   count 3 of 3
   job 3 finished

While the overall ordering isn't guaranteed, the async backend still
guarantees that events and responses are delivered to the parent process
in the exact same ordering as they were generated in the child process.

And unless your system is I<very> busy, it should clearly show that the
job started last will finish first, as it has the lowest count.

This concludes the async example. Since L<AnyEvent::Fork> does not
actually fork, you are free to use about any module in the child, not just
L<AnyEvent>, but also L<IO::AIO>, or L<Tk> for example.

=head1 PARENT PROCESS USAGE

This module exports nothing, and only implements a single function:

=over 4

=cut

package AnyEvent::Fork::RPC;

use common::sense;

use Errno ();
use Guard ();

use AnyEvent;
use AnyEvent::Fork; # we don't actually depend on it, this is for convenience

our $VERSION = 0.1;

=item my $rpc = AnyEvent::Fork::RPC::run $fork, $function, [key => value...]

The traditional way to call it. But it is way cooler to call it in the
following way:

=item my $rpc = $fork->AnyEvent::Fork::RPC::run ($function, [key => value...])

This C<run> function/method can be used in place of the
L<AnyEvent::Fork::run> method. Just like that method, it takes over
the L<AnyEvent::Fork> process, but instead of calling the specified
C<$function> directly, it runs a server that accepts RPC calls and handles
responses.

It returns a function reference that can be used to call the function in
the child process, handling serialisation and data transfers.

The following key/value pairs are allowed. It is recommended to have at
least an C<on_error> or C<on_event> handler set.

=over 4

=item on_error => $cb->($msg)

Called on (fatal) errors, with a descriptive (hopefully) message. If
this callback is not provided, but C<on_event> is, then the C<on_event>
callback is called with the first argument being the string C<error>,
followed by the error message.

If neither handler is provided it prints the error to STDERR and will
start failing badly.

=item on_event => $cb->(...)

Called for every call to the C<AnyEvent::Fork::RPC::event> function in the
child, with the arguments of that function passed to the callback.

Also called on errors when no C<on_error> handler is provided.

=item on_destroy => $cb->()

Called when the C<$rpc> object has been destroyed and all requests have
been successfully handled. This is useful when you queue some requests and
want the child to go away after it has handled them. The problem is that
the parent must not exit either until all requests have been handled, and
this can be accomplished by waiting for this callback.

=item init => $function (default none)

When specified (by name), this function is called in the child as the very
first thing when taking over the process, with all the arguments normally
passed to the C<AnyEvent::Fork::run> function, except the communications
socket.

It can be used to do one-time things in the child such as storing passed
parameters or opening database connections.

It is called very early - before the serialisers are created or the
C<$function> name is resolved into a function reference, so it could be
used to load any modules that provide the serialiser or function. It can
not, however, create events.

=item async => $boolean (default: 0)

The default server used in the child does all I/O blockingly, and only
allows a single RPC call to execute at a time.

Setting C<async> to a true value switches to another implementation that
uses L<AnyEvent> in the child and allows multiple concurrent RPC calls.

The actual API in the child is documented in the section that describes
the calling semantics of the returned C<$rpc> function.

If you want to pre-load the actual back-end modules to enable memory
sharing, then you should load C<AnyEvent::Fork::RPC::Sync> for
synchronous, and C<AnyEvent::Fork::RPC::Async> for asynchronous mode.

If you use a template process and want to fork both sync and async
children, then it is permissible to load both modules.

=item serialiser => $string (default: '(sub { pack "(w/a*)*", @_ }, sub { unpack "(w/a*)*", shift })')

All arguments, result data and event data have to be serialised to be
transferred between the processes. For this, they have to be frozen and
thawed in both parent and child processes.

By default, only octet strings can be passed between the processes, which
is reasonably fast and efficient.

For more complicated use cases, you can provide your own freeze and thaw
functions, by specifying a string with perl source code. It's supposed to
return two code references when evaluated: the first receives a list of
perl values and must return an octet string. The second receives the octet
string and must return the original list of values.

If you need an external module for serialisation, then you can either
pre-load it into your L<AnyEvent::Fork> process, or you can add a C<use>
or C<require> statement into the serialiser string. Or both.

=back

See the examples section earlier in this document for some actual
examples.
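
To make the C<serialiser> contract more concrete, here is a minimal sketch
that evaluates two serialiser strings the same way C<run> does and pushes
a few values through them. The first string is the default of this
revision, copied from the source. The second one is only an assumption
about how a JSON-based serialiser could be written; it uses L<JSON::PP>
(a core module since perl 5.14) and is not something this module ships.

```perl
use strict;
use warnings;

# the default serialiser string of this revision, copied from the source
my $string_serialiser =
   '(sub { pack "(w/a*)*", @_ }, sub { unpack "(w/a*)*", shift })';

# an alternative for nested data (an assumption, not part of the module);
# the "use" statement is embedded in the string so the child loads it, too
my $json_serialiser =
   'use JSON::PP (); '
   . '(sub { JSON::PP::encode_json ([@_]) },'
   . ' sub { @{ JSON::PP::decode_json (shift) } })';

for my $serialiser ($string_serialiser, $json_serialiser) {
   # evaluated in list context, which yields the freeze and thaw functions
   my ($freeze, $thaw) = eval $serialiser;
   die $@ if $@;

   my $octets = $freeze->("rmdir", "/tmp/somepath/1", "");
   my @values = $thaw->($octets);

   printf "%d bytes, %d values: %s\n",
      length ($octets), scalar @values, join "|", @values;
}
```

The default serialiser is limited to plain octet strings, while the
JSON-based sketch can also transfer array and hash references, at the cost
of more work per call.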

=cut

our $STRING_SERIALISER = '(sub { pack "(w/a*)*", @_ }, sub { unpack "(w/a*)*", shift })';

sub run {
   my ($self, $function, %arg) = @_;

   my $serialiser = delete $arg{serialiser} || $STRING_SERIALISER;
   my $on_event   = delete $arg{on_event};
   my $on_error   = delete $arg{on_error};
   my $on_destroy = delete $arg{on_destroy};

   # default for on_error is to call on_event, if specified
   $on_error ||= $on_event
      ? sub { $on_event->(error => shift) }
      : sub { die "AnyEvent::Fork::RPC: uncaught error: $_[0].\n" };

   # default for on_event is to raise an error
   $on_event ||= sub { $on_error->("event received, but no on_event handler") };

   my ($f, $t) = eval $serialiser; die $@ if $@;

   my (@rcb, %rcb, $fh, $shutdown, $wbuf, $ww);
   my ($rlen, $rbuf, $rw) = 512 - 16;

   my $wcb = sub {
      my $len = syswrite $fh, $wbuf;

      unless (defined $len) {
         if ($! != Errno::EAGAIN && $! != Errno::EWOULDBLOCK) {
            undef $rw; undef $ww; # it ends here
            $on_error->("$!");
         }
      }

      substr $wbuf, 0, $len, "";

      unless (length $wbuf) {
         undef $ww;
         $shutdown and shutdown $fh, 1;
      }
   };

   my $module = "AnyEvent::Fork::RPC::" . ($arg{async} ? "Async" : "Sync");

   $self->require ($module)
        ->send_arg ($function, $arg{init}, $serialiser)
        ->run ("$module\::run", sub {
      $fh = shift;

      my ($id, $len);
      $rw = AE::io $fh, 0, sub {
         $rlen = $rlen * 2 + 16 if $rlen - 128 < length $rbuf;
         $len = sysread $fh, $rbuf, $rlen - length $rbuf, length $rbuf;

         if ($len) {
            # each message is: 32 bit id, 32 bit length, serialised data
            while (8 <= length $rbuf) {
               ($id, $len) = unpack "LL", $rbuf;
               8 + $len <= length $rbuf
                  or last;

               my @r = $t->(substr $rbuf, 8, $len);
               substr $rbuf, 0, 8 + $len, "";

               if ($id) {
                  # a response: sync mode uses the queue, async mode the id
                  if (@rcb) {
                     (shift @rcb)->(@r);
                  } elsif (my $cb = delete $rcb{$id}) {
                     $cb->(@r);
                  } else {
                     undef $rw; undef $ww;
                     $on_error->("unexpected data from child");
                  }
               } else {
                  # id 0 marks an event
                  $on_event->(@r);
               }
            }
         } elsif (defined $len) {
            undef $rw; undef $ww; # it ends here

            if (@rcb || %rcb) {
               $on_error->("unexpected eof");
            } elsif ($on_destroy) { # on_destroy is optional
               $on_destroy->();
            }
         } elsif ($! != Errno::EAGAIN && $! != Errno::EWOULDBLOCK) {
            undef $rw; undef $ww; # it ends here
            $on_error->("read: $!");
         }
      };

      $ww ||= AE::io $fh, 1, $wcb;
   });

   my $guard = Guard::guard {
      $shutdown = 1;
      $ww ||= $fh && AE::io $fh, 1, $wcb;
   };

   my $id;

   $arg{async}
      ? sub {
           $id = ($id == 0xffffffff ? 0 : $id) + 1;
           $id = ($id == 0xffffffff ? 0 : $id) + 1 while exists $rcb{$id}; # rarely loops

           $rcb{$id} = pop;

           $guard; # keep it alive

           $wbuf .= pack "LL/a*", $id, &$f;
           $ww ||= $fh && AE::io $fh, 1, $wcb;
        }
      : sub {
           push @rcb, pop;

           $guard; # keep it alive

           $wbuf .= pack "L/a*", &$f;
           $ww ||= $fh && AE::io $fh, 1, $wcb;
        }
}

=item $rpc->(..., $cb->(...))

The RPC object returned by C<AnyEvent::Fork::RPC::run> is actually a code
reference. There are two things you can do with it: call it, and let it go
out of scope (let it get destroyed).

If C<async> was false when C<$rpc> was created (the default), then, if you
call C<$rpc>, the C<$function> is invoked with all arguments passed to
C<$rpc> except the last one (the callback). When the function returns, the
callback will be invoked with all the return values.

If C<async> was true, then the C<$function> receives an additional
initial argument, the result callback. In this case, returning from
C<$function> does nothing - the function only counts as "done" when the
result callback is called, and any arguments passed to it are considered
the return values. This makes it possible to "return" from event handlers
or e.g. Coro threads.

The other thing that can be done with the RPC object is to destroy it. In
this case, the child process will execute all remaining RPC calls, report
their results, and then exit.

See the examples section earlier in this document for some actual
examples.

=back
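
The "up to 4GB serialised" limit mentioned in the description follows from
the 32 bit length field used by the code in this revision. The following
is a minimal standalone sketch of the child-to-parent framing as the
reader in C<run> parses it; C<frame> and C<unframe> are hypothetical
helper names made up for this illustration, and the format is an internal
detail that may well differ in other revisions.

```perl
use strict;
use warnings;

# Hypothetical helper: build one child-to-parent message as the reader in
# this revision expects it: 32 bit id, 32 bit payload length, then the
# payload ("L" uses native byte order; id 0 marks an event).
sub frame {
   my ($id, $payload) = @_;

   pack "LL/a*", $id, $payload
}

# Hypothetical helper: remove all complete messages from the buffer and
# return them as [id, payload] pairs; an incomplete trailing message stays
# in the buffer until more data arrives.
sub unframe {
   my ($rbuf) = @_; # reference to the read buffer

   my @messages;

   while (8 <= length $$rbuf) {
      my ($id, $len) = unpack "LL", $$rbuf;

      8 + $len <= length $$rbuf
         or last;

      push @messages, [$id, substr ($$rbuf, 8, $len)];
      substr $$rbuf, 0, 8 + $len, "";
   }

   @messages
}

# two complete messages followed by the first five bytes of a third one
my $third = frame (2, "xyz");
my $rbuf  = frame (1, "abc") . frame (0, "") . substr ($third, 0, 5);

for my $message (unframe \$rbuf) {
   printf "id %d, %d payload bytes\n", $message->[0], length ($message->[1]);
}

printf "%d bytes left in the buffer\n", length ($rbuf);
```

In the module itself the payload is the output of the freeze function, and
requests in async mode carry the id in the same way, so that responses can
be matched to their callbacks even when they arrive out of order.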

=head1 CHILD PROCESS USAGE

The following function is not defined by this module itself. It is only
available in the namespace of this module when the child is running,
without having to load any extra modules. It is part of the child-side
API of L<AnyEvent::Fork::RPC>.

=over 4

=item AnyEvent::Fork::RPC::event ...

Send an event to the parent. Events are a bit like RPC calls made by the
child process to the parent, except that there is no notion of return
values.

See the examples section earlier in this document for some actual
examples.

=back

=head1 SEE ALSO

L<AnyEvent::Fork> (to create the processes in the first place),
L<AnyEvent::Fork::Pool> (to manage whole pools of processes).

=head1 AUTHOR AND CONTACT INFORMATION

 Marc Lehmann <schmorp@schmorp.de>
 http://software.schmorp.de/pkg/AnyEvent-Fork-RPC

=cut

1