=head1 Introduction to Coro

This tutorial will introduce you to the main features of the Coro module
family.

It first introduces some basic concepts, and later gives a short overview
of the module family.

=head1 What is Coro?

Coro started as a simple module that implemented a specific form of
first-class continuations called coroutines. These basically allow you
to capture the current point of execution and jump to another point, while
allowing you to return at any time, as a kind of non-local jump, not unlike
C's C<setjmp>/C<longjmp>. This is nowadays known as a L<Coro::State>.

One natural application for these is to include a scheduler, resulting in
cooperative threads, which is the main use case for Coro today. Still,
much of the documentation and custom refers to these threads as
"coroutines" or often just "coros".

A thread is very much like a stripped-down perl interpreter, or a
process: Unlike a full interpreter process, a thread doesn't have its own
variable or code namespaces - everything is shared. That means that when
one thread modifies a variable (or any value, e.g. through a reference),
then other threads immediately see this change when they look at the same
variable or location.

Cooperative means that these threads must cooperate with each other, when
it comes to CPU usage - only one thread ever has the CPU, and if another
thread wants the CPU, the running thread has to give it up. This happens
either explicitly, by calling a function to do so, or implicitly, when
waiting on a resource (such as a semaphore, or the completion of some I/O
request). This threading model is popular in scripting languages (such as
Python or Ruby), and this implementation is typically far more efficient
than threads implemented in other languages.

Perl itself uses rather confusing terminology - what perl calls a "thread"
(or "ithread") is actually called a "process" everywhere else: The
so-called "perl threads" are actually artifacts of the unix process
emulation code used on Windows, which is also why they are
actually processes and not threads. The biggest difference is that neither
variables (nor code) are shared between processes or ithreads.

=head1 Cooperative Threads

Cooperative threads are what the Coro module gives you. Obviously, you have
to C<use> it first:

   use Coro;

To create a thread, you can use the C<async> function that automatically
gets exported from that module:

   async {
      print "hello\n";
   };

C<async> expects a code block as first argument (in indirect object
notation). You can actually pass it extra arguments, and these will end up
in C<@_> when executing the code block, but since it is a closure, you can
also just refer to any lexical variables that are currently visible.

The above lines create a thread, but if you save them in a file and
execute it as a perl program, you will not get any output.

The reason is that, although you created a thread, and the thread is
ready to execute (because C<async> puts it into the so-called I<ready
queue>), it never gets any CPU time to actually execute, as the main
program - which also is a thread almost like any other - never gives up
the CPU but instead exits the whole program, by running off the end of
the file. Since Coro threads are cooperative, the main thread has to
cooperate, and give up the CPU.

To explicitly give up the CPU, use the C<cede> function (which is often
called C<yield> in other thread implementations):

   use Coro;

   async {
      print "hello\n";
   };

   cede;

Running the above prints C<hello> and exits.

Now, this is not very interesting, so let's try a slightly more
interesting program:

   use Coro;

   async {
      print "async 1\n";
      cede;
      print "async 2\n";
   };

   print "main 1\n";
   cede;
   print "main 2\n";
   cede;

Running this program prints:

   main 1
   async 1
   main 2
   async 2

This nicely illustrates the non-local jump ability: the main program
prints the first line, and then yields the CPU to whatever other
threads there are. And there is one other, which runs and prints
"async 1", and itself yields the CPU. Since the only other thread
available is the main program, it continues running and so on.

Let's look at the example in more detail: C<async> first creates a new
thread. All new threads start in a suspended state. To make them run,
they need to be put into the ready queue, which is the second thing that
C<async> does. Each time a thread gives up the CPU, Coro runs a so-called
I<scheduler>. The scheduler selects the next thread from the ready queue,
removes it from the queue, and runs it.

C<cede> also does two things: first it puts the running thread into the
ready queue, and then it jumps into the scheduler. This has the effect of
giving up the CPU, but also ensures that, eventually, the thread gets run
again.

In fact, C<cede> could be implemented like this:

   sub my_cede {
      $Coro::current->ready;
      schedule;
   }

This works because C<$Coro::current> always contains the currently
running thread, and the scheduler itself can be called directly via
C<Coro::schedule>.

What is the effect of just calling C<schedule> without putting the current
thread into the ready queue first? Simple: the scheduler selects the
next ready thread and runs it. And the current thread, as it hasn't been
put into the ready queue, will go to sleep until something wakes it
up. If. Ever.

The following example remembers the current thread in a variable,
creates a thread and then puts the main program to sleep.

The newly created thread uses rand to wake up the main thread by
calling its C<ready> method - or not.

   use Coro;

   my $wakeme = $Coro::current;

   async {
      $wakeme->ready if 0.5 > rand;
   };

   schedule;

Now, when you run it, one of two things happens: Either the C<async> thread
wakes up the main thread again, in which case the program silently exits,
or it doesn't, in which case you get something like this:

   FATAL: deadlock detected.
         PID SC  RSS USES Description              Where
    31976480 -C  19k    0 [main::]                 [program:9]
    32223768 UC  12k    1                          [Coro.pm:691]
    32225088 -- 2068    1 [coro manager]           [Coro.pm:691]
    32225184 N-  216    0 [unblock_sub scheduler]  -

Why is that? Well, when the C<async> thread runs into the end of its
block, it will be terminated (via a call to C<Coro::terminate>) and the
scheduler is called again. Since the C<async> thread hasn't woken up the
main thread, and there aren't any other threads, there is nothing to wake
up, and the program cannot continue. Since there I<are> threads that
I<could> be running (main) but none are I<ready> to do so, Coro signals a
I<deadlock> - no progress is possible. Usually you also get a listing of
all threads, which might help you track down the problem.

However, there is an important case where progress I<is>, in fact,
possible, despite no threads being ready - namely in an event-based
program. In such a program, some threads could wait for I<external>
events, such as a timeout, or some data to arrive on a socket.

Since a deadlock in such a case would not be very useful, there is a
module named L<Coro::AnyEvent> that integrates threads into an event
loop. It configures Coro in a way that, instead of C<die>ing with an error
message, it instead runs the event loop in the hope of receiving an event
that will wake up some thread.

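For example, with L<Coro::AnyEvent> loaded, a thread that went to sleep
via C<schedule> can be woken by an I<external> event such as a timer. A
minimal sketch, using an L<AnyEvent> timer (the one-second delay is an
arbitrary choice for illustration):

   use AnyEvent;
   use Coro;
   use Coro::AnyEvent;

   my $wakeme = $Coro::current;

   # an external event: a timer that fires once, after one second
   my $timer = AE::timer 1, 0, sub { $wakeme->ready };

   schedule; # instead of deadlocking, Coro runs the event loop

   print "woken up by the timer\n";

Without C<use Coro::AnyEvent>, this program would abort with the deadlock
message above; with it, C<schedule> hands control to the event loop until
the timer makes the main thread ready again.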
=head2 Semaphores and other locks

Using only C<ready>, C<cede> and C<schedule> to synchronise threads is
difficult, especially if many threads are ready at the same time. Coro
supports a number of primitives to help synchronise threads in easier
ways. The first such primitive is L<Coro::Semaphore>, which implements
counting semaphores (binary semaphores are available as L<Coro::Signal>,
and there are L<Coro::SemaphoreSet> and L<Coro::RWLock> primitives as
well).

Counting semaphores, in a sense, store a count of resources. You can
remove/allocate/reserve a resource by calling the C<< ->down >> method,
which decrements the counter, and you can add or free a resource by
calling the C<< ->up >> method, which increments the counter. If the
counter is C<0>, then C<< ->down >> cannot decrement the semaphore - it is
locked - and the thread will wait until a count becomes available again.

Here is an example:

   use Coro;

   my $sem = new Coro::Semaphore 0; # a locked semaphore

   async {
      print "unlocking semaphore\n";
      $sem->up;
   };

   print "trying to lock semaphore\n";
   $sem->down;
   print "we got it!\n";

This program creates a I<locked> semaphore (a semaphore with count C<0>)
and tries to lock it (by trying to decrement its counter in the C<down>
method). Since the semaphore count is already exhausted, this will block
the main thread until the semaphore becomes available.

This yields the CPU to the only other ready thread in the process, the
one created with C<async>, which unlocks the semaphore (and instantly
terminates itself by returning).

Since the semaphore is now available, the main program locks it and
continues: "we got it!".

Counting semaphores are most often used to lock resources, or to exclude
other threads from accessing or using a resource. For example, consider
a very costly function (that temporarily allocates a lot of RAM, for
example). You wouldn't want to have many threads calling this function at
the same time, so you use a semaphore:

   my $lock = new Coro::Semaphore; # unlocked initially - default is 1

   sub costly_function {
      $lock->down; # acquire semaphore

      # do costly operation that blocks

      $lock->up; # unlock it
   }

No matter how many threads call C<costly_function>, only one will run
the body of it, all others will wait in the C<down> call. If you want to
limit the number of concurrent executions to five, you could create the
semaphore with an initial count of C<5>.

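That five-slot variant is only a one-line change - a sketch, where the
initial count of C<5> is the sole difference from the example above:

   use Coro;
   use Coro::Semaphore;

   my $limit = new Coro::Semaphore 5; # five slots available

   sub costly_function {
      $limit->down; # blocks only while all five slots are taken

      # do costly operation that blocks

      $limit->up;   # release the slot
   }

Up to five threads can now be inside the body at once; the sixth waits in
C<down> until one of them calls C<up>.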
Why does the comment mention an "operation that blocks"? Again, that's
because Coro's threads are cooperative: unless C<costly_function>
willingly gives up the CPU, other threads of control will simply not
run. This makes locking superfluous in cases where the function itself
never gives up the CPU, but when dealing with the outside world, this is
rare.

Now consider what happens when the code C<die>s after executing C<down>,
but before C<up>. This will leave the semaphore in a locked state, which
often isn't what you want - imagine the caller expecting a failure and
wrapping the call into an C<eval {}>.

So normally you would want to free the lock again if execution somehow
leaves the function, whether "normally" or via an exception. Here the
C<guard> method proves useful:

   my $lock = new Coro::Semaphore; # unlocked initially

   sub costly_function {
      my $guard = $lock->guard; # acquire guard

      # do costly operation that blocks
   }

The C<guard> method C<down>s the semaphore and returns a so-called guard
object. Nothing happens as long as there are references to it (i.e. it is
in scope somehow), but when all references are gone, for example, when
C<costly_function> returns or throws an exception, it will automatically
call C<up> on the semaphore - no way to forget it. Even when the thread
gets C<cancel>ed by another thread, the guard object will ensure that the
lock is freed.

This concludes this introduction to semaphores and locks. Apart from
L<Coro::Semaphore> and L<Coro::Signal>, there is also a reader-writer lock
(L<Coro::RWLock>) and a semaphore set (L<Coro::SemaphoreSet>). All of
these come with their own manpage.

=head2 Channels

Semaphores are fine, but usually you want to communicate by exchanging
data as well. Of course, you could just use some locks and an array of
sorts to communicate, but there is a useful abstraction for
communication between threads: L<Coro::Channel>. Channels are the Coro
equivalent of a unix pipe (and very similar to AmigaOS message ports :) -
you can put stuff into it on one side, and read data from it on the other.

Here is a simple example that creates a thread and sends numbers to
it. The thread calculates the square of each number and puts that into
another channel, which the main thread reads the result from:

   use Coro;

   my $calculate = new Coro::Channel;
   my $result    = new Coro::Channel;

   async {
      # endless loop
      while () {
         my $num = $calculate->get; # read a number
         $num **= 2;                # square it
         $result->put ($num);       # put the result into the result queue
      }
   };

   for (1, 2, 5, 10, 77) {
      $calculate->put ($_);
      print "$_ ** 2 = ", $result->get, "\n";
   }

Gives:

   1 ** 2 = 1
   2 ** 2 = 4
   5 ** 2 = 25
   10 ** 2 = 100
   77 ** 2 = 5929

Both the C<get> and C<put> methods can block the current thread: C<get> first
checks whether there I<is> some data available, and if not, it blocks the
current thread until some data arrives. C<put> can also block, as each
channel has a "maximum item capacity", i.e. you cannot store more than a
specific number of items, which can be configured when the channel gets
created.

In the above example, C<put> never blocks, as the default capacity
of a channel is very high. So the for loop first puts data into the
channel, then tries to C<get> the result. Since the async thread hasn't
put anything in there yet (on the first iteration it hasn't even run
yet), the result channel is still empty, so the main thread blocks.

Since the only other runnable/ready thread at this point is the squaring
thread, it will be woken up, will C<get> the number, square it and put it
into the result channel, waking up the main thread again. It will still
continue to run, as waking up other threads just puts them into the ready
queue, nothing less, nothing more.

Only when the async thread tries to C<get> the next number from the
calculate channel will it block (because nothing is there yet) and the
main thread will continue running. And so on.

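A bounded channel is created by passing the desired capacity to the
constructor. A sketch (the capacity of C<2> is an arbitrary choice):

   use Coro;

   my $queue = new Coro::Channel 2; # at most two items queued

   $queue->put (1);
   $queue->put (2);
   # a third ->put would now block this thread
   # until some other thread calls $queue->get

This is useful to stop a fast producer from piling up unbounded amounts of
data in front of a slow consumer.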
This illustrates a general principle used by Coro: a thread will I<only
ever block> when it has to. Neither the Coro module itself nor any of its
submodules will ever give up the CPU unless they have to, because they
wait for some event to happen.

Be careful, however: when multiple threads put numbers into C<$calculate>
and read from C<$result>, they won't know which result is theirs. The
solution for this is to either use a semaphore, or send not just the
number, but also your own private result channel.

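The private-channel solution could be sketched like this - each request
carries its own reply channel, so replies cannot be mixed up:

   use Coro;

   my $calculate = new Coro::Channel;

   async {
      while () {
         my ($num, $reply) = @{ $calculate->get };
         $reply->put ($num ** 2); # only the requester sees this result
      }
   };

   # each requesting thread brings its own result channel
   my $reply = new Coro::Channel;
   $calculate->put ([5, $reply]);
   print "5 ** 2 = ", $reply->get, "\n";
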
=head2 What is mine, what is ours?

What, exactly, constitutes a thread? Obviously it contains the current
point of execution. Not so obviously, it also has to include all
lexical variables - that means every thread has its own set of lexical
variables.

To see why this is necessary, consider this program:

   use Coro;

   sub printit {
      my ($string) = @_;

      cede;

      print $string;
   }

   async { printit "Hello, " };
   async { printit "World!\n" };

   cede; cede; # do it

The above prints C<Hello, World!\n>. If C<printit> didn't have
its own per-thread C<$string> variable, it would probably print
C<World!\nWorld!\n>, which is rather unexpected, and would make it very
difficult to make good use of threads.

To make things run smoothly, there are quite a number of other things that
are per-thread:

=over 4

=item $_, @_, $@ and the regex result vars, $&, %+, $1, $2, ...

C<$_> is used much like a local variable, so it gets localised
per-thread. The same is true for regex results (C<$1>, C<$2> and so on).

C<@_> contains the arguments, so like lexicals, it also must be
per-thread.

C<$@> is not obviously required to be per-thread, but it is quite useful.

=item $/ and the default output file handle

Threads most often block when doing I/O. Since C<$/> is used when reading
lines, it would be very inconvenient if it were a shared variable, so it
is per-thread.

The default output handle (see C<select>) is a difficult case: sometimes
being global is preferable, sometimes per-thread is preferable. Since
per-thread seems to be more common, it is per-thread.

=item $SIG{__DIE__} and $SIG{__WARN__}

If these weren't per-thread, then common constructs such as:

   eval {
      local $SIG{__DIE__} = sub { ... };
      ...
   };

would not allow coroutine switching. Since exception handling is
per-thread, those variables should be per-thread as well.

=item Lots of other esoteric stuff

For example, C<$^H> is per-thread. Most of the additional per-thread state
is not directly visible to Perl, but required to make the interpreter
work. You won't normally notice these.

=back

Everything else is shared between all threads. For example, the globals
C<$a> and C<$b> are shared. When does that matter? When using C<sort>,
these variables become special, and therefore, switching threads while
sorting might have surprising results.

Other examples are C<$!> (errno), C<$.> (the current input line number),
C<$,>, C<$\>, C<$"> and many other special variables.

While in some cases a good argument could be made for localising them to
the thread, they are rarely used, and sometimes hard to localise.

Future versions of Coro might include more per-thread state when it
becomes a problem.

=head2 Debugging

Sometimes it can be useful to find out what each thread is doing (or which
threads exist in the first place). The L<Coro::Debug> module has (among
other goodies) a function that allows you to print a "ps"-like listing -
you have seen it in action earlier when Coro detected a deadlock.

You use it like this:

   use Coro::Debug;

   Coro::Debug::command "ps";

Remember the example with the two channels and a worker thread that
squared numbers? Running "ps" just after C<< $calculate->get >> outputs
something similar to this:

        PID SC  RSS USES Description              Where
    8917312 -C  22k    0 [main::]                 [introscript:20]
    8964448 N-  152    0 [coro manager]           -
    8964520 N-  152    0 [unblock_sub scheduler]  -
    8591752 UC  152    1                          [introscript:12]
   11546944 N-  152    0 [EV idle process]        -

Interesting - there is more going on in the background than one would
expect. Ignoring the extra threads, the main thread has pid
C<8917312>, and the one started by C<async> has pid C<8591752>.

The latter is also the only thread that doesn't have a description,
simply because we haven't set one. Setting one is easy, just put it into
C<< $Coro::current->{desc} >>:

   async {
      $Coro::current->{desc} = "cruncher";
      ...
   };

This can be rather useful when debugging a program, or when using the
interactive debug shell of L<Coro::Debug>.

=head1 The Real World - Event Loops

Coro really wants to run in a program using some event loop. In fact, most
real-world programs using Coro threads are written with a combination of
event-based and thread-based techniques, as it is easy to get the best of
both worlds with Coro.

Coro integrates automatically into any event loop supported by L<AnyEvent>
(see L<Coro::AnyEvent> for details), but can take special advantage of the
L<EV> and L<Event> modules.

Here is a simple finger client, using whatever event loop L<AnyEvent>
comes up with:

   use Coro;
   use Coro::Socket;

   sub finger {
      my ($user, $host) = @_;

      my $fh = new Coro::Socket PeerHost => $host, PeerPort => "finger"
         or die "$user\@$host: $!";

      print $fh "$user\n";

      print "$user\@$host: $_" while <$fh>;
      print "$user\@$host: done\n";
   }

   # now finger a few accounts
   for (
      (async { finger "abc", "cornell.edu" }),
      (async { finger "sebbo", "world.std.com" }),
      (async { finger "trouble", "noc.dfn.de" }),
   ) {
      $_->join; # wait for the result
   }

There are a few new things here. First of all, there is
L<Coro::Socket>. This module works much the same way as
L<IO::Socket::INET>, except that it is coroutine-aware. This means that
L<IO::Socket::INET>, when waiting for the network, will block the whole
process - that means all threads, which is clearly undesirable.

On the other hand, L<Coro::Socket> knows how to give up the CPU to other
threads when it waits for the network, which makes parallel execution
possible.

The other new thing is the C<join> method: All we want to do in this
example is start three C<async> threads and only exit when they have
done their job. This could be done using a counting semaphore, but it is
much simpler to synchronously wait for them to C<terminate>, which is
exactly what the C<join> method does.

It doesn't matter that the three C<async>s will probably finish in a
different order than the for loop C<join>s them - when a thread is still
running, C<join> simply waits. If the thread has already terminated, it
will simply fetch its return status.

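The return status mentioned above is simply whatever the thread's block
returned, so C<join> can also be used to collect results - a minimal
sketch:

   use Coro;

   my $thread = async { 6 * 7 };

   my ($result) = $thread->join; # waits, then returns the block's value
   print "the answer is $result\n";
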
If you are experienced in event-based programming, you will see that the
above program doesn't quite follow the normal pattern, where you start
some work, and then run the event loop (e.g. C<EV::loop>).

In fact, nontrivial programs follow this pattern even with Coro, so a Coro
program that uses EV usually looks like this:

   use EV;
   use Coro;

   # start coroutines or event watchers

   EV::loop; # and loop

And in fact, for debugging, you often do something like this:

   use EV;
   use Coro::Debug;

   my $shell = new_unix_server Coro::Debug "/tmp/myshell";

   EV::loop; # and loop

This runs your program, but also an interactive shell on the unix domain
socket in F</tmp/myshell>. You can use the F<socat> program to access it:

   # socat readline /tmp/myshell
   coro debug session. use help for more info

   > ps
           PID SC  RSS USES Description              Where
     136672312 RC  19k 177k [main::]                 [myprog:28]
     136710424 -- 1268   48 [coro manager]           [Coro.pm:349]
   > help
   ps [w|v]                show the list of all coroutines (wide, verbose)
   bt <pid>                show a full backtrace of coroutine <pid>
   eval <pid> <perl>       evaluate <perl> expression in context of <pid>
   trace <pid>             enable tracing for this coroutine
   untrace <pid>           disable tracing for this coroutine
   kill <pid> <reason>     throws the given <reason> string in <pid>
   cancel <pid>            cancels this coroutine
   ready <pid>             force <pid> into the ready queue
   <anything else>         evaluate as perl and print results
   <anything else> &       same as above, but evaluate asynchronously
                           you can use (find_coro <pid>) in perl expressions
                           to find the coro with the given pid, e.g.
                           (find_coro 9768720)->ready
   loglevel <int>          enable logging for messages of level <int> and lower
   exit                    end this session

Microsoft victims can of course use the even less secure C<new_tcp_server>
constructor.


=head2 The Real World - File I/O

Disk I/O, while often much faster than the network, can nevertheless take
quite a long time, during which the CPU could be doing other things, if
only it were able to.

Fortunately, the L<IO::AIO> module on CPAN allows you to move these
I/O calls into the background, letting you do useful work in the
foreground. It is event-/callback-based, but Coro has a nice wrapper
around it, called L<Coro::AIO>, which lets you use its functions
naturally from within threads:

   use Fcntl;
   use Coro::AIO;

   my $fh = aio_open "$filename~", O_WRONLY | O_CREAT, 0600
      or die "$filename~: $!";

   aio_write $fh, 0, (length $data), $data, 0;
   aio_fsync $fh;
   aio_close $fh;
   aio_rename "$filename~", "$filename";

The above creates a new file, writes data into it, syncs the data to disk
and atomically replaces a base file with a new copy.

642 |
|
643 |
=head2 Inversion of control - rouse functions |
644 |
|
645 |
Last not least, me talk about inversion of control. The "control" refers |
646 |
to "who calls whom", who is in control of the program. In this program, |
647 |
the main program is in control and passes this to all functions it calls: |
648 |
|
649 |
use LWP; |
650 |
|
651 |
# pass control to get |
652 |
my $res = get "http://example.org/"; |
653 |
# control returned to us |
654 |
|
655 |
print $res; |
656 |
|
657 |
When switching to event-based programs, instead of "us calling them", |
658 |
"they call us" - this is the inversion of control form the title: |
659 |
|
660 |
use AnyEvent::HTTP; |
661 |
|
662 |
# do not pass control for long - http_get immediately returns |
663 |
http_get "http://example.org/", sub { |
664 |
print $_[0]; |
665 |
}; |
666 |
|
667 |
# we stay in control and can do other things |
668 |
|
Event-based programming can be nice, but sometimes it's just easier to
write down some processing in "linear" fashion, without callbacks. Coro
provides some special functions to reduce typing:

   use AnyEvent::HTTP;

   # do not pass control for long - http_get immediately returns
   http_get "http://example.org/", Coro::rouse_cb;

   # we stay in control and can do other things...
   # ...such as wait for the result
   my ($res) = Coro::rouse_wait;

C<Coro::rouse_cb> creates and returns a special callback. You can pass
this callback to any function that would expect a callback.

C<Coro::rouse_wait> waits (blocks the current thread) until the most
recently created callback has been called, and returns whatever was passed
to it.

These two functions allow you to I<mechanically> invert the control from
the "callback-based style" used by most event-based libraries to "blocking
style", whenever you wish to.

The pattern is simple: instead of...

   some_func ..., sub {
      my @res = @_;
      ...
   };

... you write:

   some_func ..., Coro::rouse_cb;
   my @res = Coro::rouse_wait;
   ...

Callback-based interfaces are plentiful, and the rouse functions allow you
to use them in an often more convenient way.

709 |
|
710 |
=head1 Other Modules |
711 |
|
712 |
This introduction only mentions a few methods and modules, Coro has many |
713 |
other functions (see the L<Coro> manpage) and modules (documented in the |
714 |
C<SEE ALSO> section of the L<Coro> manpage). |
715 |
|
716 |
Noteworthy modules are L<Coro::LWP> (for parallel LWP requests, but see |
717 |
L<AnyEvent::HTTP> for a better HTTP-only alternative), L<Coro::BDB>, for |
718 |
when you need an asynchronous database, L<Coro::Handle>, when you need |
719 |
to use any file handle in a coroutine (popular to access C<STDIN> and |
720 |
C<STDOUT>) and L<Coro::EV>, the optimised interface to L<EV> (which gets |
721 |
used automatically by L<Coro::AnyEvent>). |
722 |
|
723 |
There are a number of Coro-related moduels that might be useful for your problem |
724 |
(see L<http://search.cpan.org/search?query=Coro&mode=module>). And since Coro |
725 |
integrates so well into AnyEvent, it's often easy to adapt existing AnyEvent modules |
726 |
(see L<http://search.cpan.org/search?query=AnyEvent&mode=module>). |
727 |
|
728 |
|
729 |
=head1 AUTHOR |
730 |
|
731 |
Marc Lehmann <schmorp@schmorp.de> |
732 |
http://home.schmorp.de/ |
733 |
|