1 | NAME |
1 | NAME |
2 | Coro - the only real threads in perl |
2 | Coro - the only real threads in perl |
3 | |
3 | |
4 | SYNOPSIS |
4 | SYNOPSIS |
5 | use Coro; |
5 | use Coro; |
6 | |
6 | |
7 | async { |
7 | async { |
8 | # some asynchronous thread of execution |
8 | # some asynchronous thread of execution |
9 | print "2\n"; |
9 | print "2\n"; |
10 | cede; # yield back to main |
10 | cede; # yield back to main |
11 | print "4\n"; |
11 | print "4\n"; |
12 | }; |
12 | }; |
13 | print "1\n"; |
13 | print "1\n"; |
14 | cede; # yield to coro |
14 | cede; # yield to coro |
15 | print "3\n"; |
15 | print "3\n"; |
16 | cede; # and again |
16 | cede; # and again |
17 | |
17 | |
18 | # use locking |
18 | # use locking |
19 | use Coro::Semaphore; |
19 | use Coro::Semaphore; |
20 | my $lock = new Coro::Semaphore; |
20 | my $lock = new Coro::Semaphore; |
21 | my $locked; |
21 | my $locked; |
22 | |
22 | |
23 | $lock->down; |
23 | $lock->down; |
24 | $locked = 1; |
24 | $locked = 1; |
25 | $lock->up; |
25 | $lock->up; |
26 | |
26 | |
27 | DESCRIPTION |
27 | DESCRIPTION |
28 | For a tutorial-style introduction, please read the Coro::Intro manpage. |
28 | For a tutorial-style introduction, please read the Coro::Intro manpage. |
… | |
… | |
37 | easily-identified points in your program, so locking and parallel access |
37 | easily-identified points in your program, so locking and parallel access |
38 | are rarely an issue, making thread programming much safer and easier |
38 | are rarely an issue, making thread programming much safer and easier |
39 | than using other thread models. |
39 | than using other thread models. |
40 | |
40 | |
41 | Unlike the so-called "Perl threads" (which are not actually real threads |
41 | Unlike the so-called "Perl threads" (which are not actually real threads |
42 | but only the windows process emulation ported to unix, and as such act |
42 | but only the windows process emulation (see section of same name for |
43 | as processes), Coro provides a full shared address space, which makes |
43 | more details) ported to unix, and as such act as processes), Coro |
44 | communication between threads very easy. And Coro's threads are fast, |
44 | provides a full shared address space, which makes communication between |
|
|
45 | threads very easy. And Coro's threads are fast, too: disabling the |
45 | too: disabling the Windows process emulation code in your perl and using |
46 | Windows process emulation code in your perl and using Coro can easily |
46 | Coro can easily result in a two to four times speed increase for your |
47 | result in a two to four times speed increase for your programs. A |
47 | programs. A parallel matrix multiplication benchmark runs over 300 times |
48 | parallel matrix multiplication benchmark runs over 300 times faster on a |
48 | faster on a single core than perl's pseudo-threads on a quad core using |
49 | single core than perl's pseudo-threads on a quad core using all four |
49 | all four cores. |
50 | cores. |
50 | |
51 | |
51 | Coro achieves that by supporting multiple running interpreters that |
52 | Coro achieves that by supporting multiple running interpreters that |
52 | share data, which is especially useful to code pseudo-parallel processes |
53 | share data, which is especially useful to code pseudo-parallel processes |
53 | and for event-based programming, such as multiple HTTP-GET requests |
54 | and for event-based programming, such as multiple HTTP-GET requests |
54 | running concurrently. See Coro::AnyEvent to learn more on how to |
55 | running concurrently. See Coro::AnyEvent to learn more on how to |
… | |
… | |
82 | $Coro::idle |
83 | $Coro::idle |
83 | This variable is mainly useful to integrate Coro into event loops. |
84 | This variable is mainly useful to integrate Coro into event loops. |
84 | It is usually better to rely on Coro::AnyEvent or Coro::EV, as this |
85 | It is usually better to rely on Coro::AnyEvent or Coro::EV, as this |
85 | is pretty low-level functionality. |
86 | is pretty low-level functionality. |
86 | |
87 | |
87 | This variable stores either a Coro object or a callback. |
88 | This variable stores a Coro object that is put into the ready queue |
|
|
89 | when there are no other ready threads (without invoking any ready |
|
|
90 | hooks). |
88 | |
91 | |
89 | If it is a callback, the it is called whenever the scheduler finds |
92 | The default implementation dies with "FATAL: deadlock detected.", |
90 | no ready coros to run. The default implementation prints "FATAL: |
93 | followed by a thread listing, because the program has no other way |
91 | deadlock detected" and exits, because the program has no other way |
|
|
92 | to continue. |
94 | to continue. |
93 | |
|
|
94 | If it is a coro object, then this object will be readied (without |
|
|
95 | invoking any ready hooks, however) when the scheduler finds no other |
|
|
96 | ready coros to run. |
|
|
97 | |
95 | |
98 | This hook is overwritten by modules such as "Coro::EV" and |
96 | This hook is overwritten by modules such as "Coro::EV" and |
99 | "Coro::AnyEvent" to wait on an external event that hopefully wakes up |
97 | "Coro::AnyEvent" to wait on an external event that hopefully wakes up |
100 | a coro so the scheduler can run it. |
98 | a coro so the scheduler can run it. |
101 | |
99 | |
102 | Note that the callback *must not*, under any circumstances, block |
|
|
103 | the current coro. Normally, this is achieved by having an "idle |
|
|
104 | coro" that calls the event loop and then blocks again, and then |
|
|
105 | readying that coro in the idle handler, or by simply placing the |
|
|
106 | idle coro in this variable. |
|
|
107 | |
|
|
108 | See Coro::Event or Coro::AnyEvent for examples of using this |
100 | See Coro::EV or Coro::AnyEvent for examples of using this technique. |
109 | technique. |
|
|
110 | |
|
|
111 | Please note that if your callback recursively invokes perl (e.g. for |
|
|
112 | event handlers), then it must be prepared to be called recursively |
|
|
113 | itself. |
|
|
114 | |
101 | |
115 | SIMPLE CORO CREATION |
102 | SIMPLE CORO CREATION |
116 | async { ... } [@args...] |
103 | async { ... } [@args...] |
117 | Create a new coro and return its Coro object (usually unused). The |
104 | Create a new coro and return its Coro object (usually unused). The |
118 | coro will be put into the ready queue, so it will start running |
105 | coro will be put into the ready queue, so it will start running |
… | |
… | |
181 | |
168 | |
182 | schedule |
169 | schedule |
183 | Calls the scheduler. The scheduler will find the next coro that is |
170 | Calls the scheduler. The scheduler will find the next coro that is |
184 | to be run from the ready queue and switches to it. The next coro to |
171 | to be run from the ready queue and switches to it. The next coro to |
185 | be run is simply the one with the highest priority that is longest |
172 | be run is simply the one with the highest priority that is longest |
186 | in its ready queue. If there is no coro ready, it will clal the |
173 | in its ready queue. If there is no coro ready, it will call the |
187 | $Coro::idle hook. |
174 | $Coro::idle hook. |
188 | |
175 | |
189 | Please note that the current coro will *not* be put into the ready |
176 | Please note that the current coro will *not* be put into the ready |
190 | queue, so calling this function usually means you will never be |
177 | queue, so calling this function usually means you will never be |
191 | called again unless something else (e.g. an event handler) calls |
178 | called again unless something else (e.g. an event handler) calls |
… | |
… | |
238 | |
225 | |
239 | These functions implement the same concept as "dynamic-wind" in |
226 | These functions implement the same concept as "dynamic-wind" in |
240 | scheme does, and are useful when you want to localise some resource |
227 | scheme does, and are useful when you want to localise some resource |
241 | to a specific coro. |
228 | to a specific coro. |
242 | |
229 | |
243 | They slow down coro switching considerably for coros that use them |
230 | They slow down thread switching considerably for coros that use them |
|
|
231 | (about 40% for a BLOCK with a single assignment, so thread switching |
244 | (But coro switching is still reasonably fast if the handlers are |
232 | is still reasonably fast if the handlers are fast). |
245 | fast). |
|
|
246 | |
233 | |
247 | These functions are best understood by an example: The following |
234 | These functions are best understood by an example: The following |
248 | function will change the current timezone to |
235 | function will change the current timezone to |
249 | "Antarctica/South_Pole", which requires a call to "tzset", but by |
236 | "Antarctica/South_Pole", which requires a call to "tzset", but by |
250 | using "on_enter" and "on_leave", which remember/change the current |
237 | using "on_enter" and "on_leave", which remember/change the current |
251 | timezone and restore the previous value, respectively, the timezone |
238 | timezone and restore the previous value, respectively, the timezone |
252 | is only changes for the coro that installed those handlers. |
239 | is only changed for the coro that installed those handlers. |
253 | |
240 | |
254 | use POSIX qw(tzset); |
241 | use POSIX qw(tzset); |
255 | |
242 | |
256 | async { |
243 | async { |
257 | my $old_tz; # store outside TZ value here |
244 | my $old_tz; # store outside TZ value here |
… | |
… | |
273 | }; |
260 | }; |
274 | |
261 | |
275 | This can be used to localise about any resource (locale, uid, |
262 | This can be used to localise about any resource (locale, uid, |
276 | current working directory etc.) to a block, despite the existence of |
263 | current working directory etc.) to a block, despite the existence of |
277 | other coros. |
264 | other coros. |
|
|
265 | |
|
|
266 | Another interesting example implements time-sliced multitasking |
|
|
267 | using interval timers (this could obviously be optimised, but does |
|
|
268 | the job): |
|
|
269 | |
|
|
270 | # "timeslice" the given block |
|
|
271 | sub timeslice(&) { |
|
|
272 | use Time::HiRes (); |
|
|
273 | |
|
|
274 | Coro::on_enter { |
|
|
275 | # on entering the thread, we set a VTALRM handler to cede |
|
|
276 | $SIG{VTALRM} = sub { cede }; |
|
|
277 | # and then start the interval timer |
|
|
278 | Time::HiRes::setitimer &Time::HiRes::ITIMER_VIRTUAL, 0.01, 0.01; |
|
|
279 | }; |
|
|
280 | Coro::on_leave { |
|
|
281 | # on leaving the thread, we stop the interval timer again |
|
|
282 | Time::HiRes::setitimer &Time::HiRes::ITIMER_VIRTUAL, 0, 0; |
|
|
283 | }; |
|
|
284 | |
|
|
285 | &{+shift}; |
|
|
286 | } |
|
|
287 | |
|
|
288 | # use like this: |
|
|
289 | timeslice { |
|
|
290 | # The following is an endless loop that would normally |
|
|
291 | # monopolise the process. Since it runs in a timesliced |
|
|
292 | # environment, it will regularly cede to other threads. |
|
|
293 | while () { } |
|
|
294 | }; |
278 | |
295 | |
279 | killall |
296 | killall |
280 | Kills/terminates/cancels all coros except the currently running one. |
297 | Kills/terminates/cancels all coros except the currently running one. |
281 | |
298 | |
282 | Note that while this will try to free some of the main interpreter |
299 | Note that while this will try to free some of the main interpreter |
… | |
… | |
476 | when you use a module that uses AnyEvent (and you use |
493 | when you use a module that uses AnyEvent (and you use |
477 | Coro::AnyEvent) and it provides callbacks that are the result of |
494 | Coro::AnyEvent) and it provides callbacks that are the result of |
478 | some event callback, then you must not block either, or use |
495 | some event callback, then you must not block either, or use |
479 | "unblock_sub". |
496 | "unblock_sub". |
480 | |
497 | |
481 | $cb = Coro::rouse_cb |
498 | $cb = rouse_cb |
482 | Create and return a "rouse callback". That's a code reference that, |
499 | Create and return a "rouse callback". That's a code reference that, |
483 | when called, will remember a copy of its arguments and notify the |
500 | when called, will remember a copy of its arguments and notify the |
484 | owner coro of the callback. |
501 | owner coro of the callback. |
485 | |
502 | |
486 | See the next function. |
503 | See the next function. |
487 | |
504 | |
488 | @args = Coro::rouse_wait [$cb] |
505 | @args = rouse_wait [$cb] |
489 | Wait for the specified rouse callback (or the last one that was |
506 | Wait for the specified rouse callback (or the last one that was |
490 | created in this coro). |
507 | created in this coro). |
491 | |
508 | |
492 | As soon as the callback is invoked (or when the callback was invoked |
509 | As soon as the callback is invoked (or when the callback was invoked |
493 | before "rouse_wait"), it will return the arguments originally passed |
510 | before "rouse_wait"), it will return the arguments originally passed |
494 | to the rouse callback. |
511 | to the rouse callback. In scalar context, that means you get the |
|
|
512 | *last* argument, just as if "rouse_wait" had a "return ($a1, $a2, |
|
|
513 | $a3...)" statement at the end. |
495 | |
514 | |
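The scalar-context rule mentioned above is ordinary Perl behaviour for a sub that ends in a parenthesised "return" list. The following sketch does not use Coro at all: "returns_list" is a hypothetical stand-in for "rouse_wait", and the argument values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for rouse_wait: it simply returns its
# arguments as a parenthesised list, like "return ($a1, $a2, $a3...)".
sub returns_list {
    my ($a1, $a2, $a3) = @_;
    return ($a1, $a2, $a3);
}

# List context: all arguments come back.
my @all = returns_list("status", 200, "body");

# Scalar context: the comma operator yields the *last* element.
my $last = returns_list("status", 200, "body");

print scalar(@all), "\n";   # prints 3
print "$last\n";            # prints body
```

If all arguments are wanted, calling in list context (for example "my ($status, $code) = rouse_wait;") avoids the surprise.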
496 | See the section HOW TO WAIT FOR A CALLBACK for an actual usage |
515 | See the section HOW TO WAIT FOR A CALLBACK for an actual usage |
497 | example. |
516 | example. |
498 | |
517 | |
499 | HOW TO WAIT FOR A CALLBACK |
518 | HOW TO WAIT FOR A CALLBACK |
… | |
… | |
583 | That means you *MUST NOT* call any function that might "block" the |
602 | That means you *MUST NOT* call any function that might "block" the |
584 | current coro - "cede", "schedule" "Coro::Semaphore->down" or |
603 | current coro - "cede", "schedule" "Coro::Semaphore->down" or |
585 | anything that calls those. Everything else, including calling |
604 | anything that calls those. Everything else, including calling |
586 | "ready", works. |
605 | "ready", works. |
587 | |
606 | |
|
|
607 | WINDOWS PROCESS EMULATION |
|
|
608 | A great many people seem to be confused about ithreads (for example, |
|
|
609 | Chip Salzenberg called me unintelligent, incapable, stupid and gullible, |
|
|
610 | while in the same mail making rather confused statements about perl |
|
|
611 | ithreads (for example, that memory or files would be shared), showing |
|
|
612 | his lack of understanding of this area - if it is hard to understand for |
|
|
613 | Chip, it is probably not obvious to everybody). |
|
|
614 | |
|
|
615 | What follows is an ultra-condensed version of my talk about threads in |
|
|
616 | scripting languages given on the perl workshop 2009: |
|
|
617 | |
|
|
618 | The so-called "ithreads" were originally implemented for two reasons: |
|
|
619 | first, to (badly) emulate unix processes on native win32 perls, and |
|
|
620 | secondly, to replace the older, real thread model ("5.005-threads"). |
|
|
621 | |
|
|
622 | It does that by using threads instead of OS processes. The difference |
|
|
623 | between processes and threads is that threads share memory (and other |
|
|
624 | state, such as files) between threads within a single process, while |
|
|
625 | processes do not share anything (at least not semantically). That means |
|
|
626 | that modifications done by one thread are seen by others, while |
|
|
627 | modifications by one process are not seen by other processes. |
|
|
628 | |
|
|
629 | The "ithreads" work exactly like that: when creating a new ithreads |
|
|
630 | process, all state is copied (memory is copied physically, files and |
|
|
631 | code is copied logically). Afterwards, it isolates all modifications. On |
|
|
632 | UNIX, the same behaviour can be achieved by using operating system |
|
|
633 | processes, except that UNIX typically uses hardware built into the |
|
|
634 | system to do this efficiently, while the windows process emulation |
|
|
635 | emulates this hardware in software (rather efficiently, but of course it |
|
|
636 | is still much slower than dedicated hardware). |
|
|
637 | |
|
|
638 | As mentioned before, loading code, modifying code, modifying data |
|
|
639 | structures and so on is only visible in the ithreads process doing the |
|
|
640 | modification, not in other ithread processes within the same OS process. |
|
|
641 | |
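The isolation described above can be observed with a short sketch. It assumes a perl built with ithreads support (the core "threads" module is available) and uses an unshared variable; the names "$counter" and "$seen_in_child" are illustrative only.

```perl
use strict;
use warnings;
use threads;

# An ordinary (unshared) variable: every ithread gets its own copy.
my $counter = 0;

# The new ithread modifies its private copy and returns the value it sees.
my $thr = threads->create(sub {
    $counter = 42;
    return $counter;
});

my $seen_in_child = $thr->join;

print "child saw:   $seen_in_child\n";   # prints 42
print "parent sees: $counter\n";         # prints 0, the change was not shared
```

The assignment inside the thread never reaches the parent, which is the process-like behaviour the text describes.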
|
|
642 | This is why "ithreads" do not implement threads for perl at all, only |
|
|
643 | processes. What makes it so bad is that on non-windows platforms, you |
|
|
644 | can actually take advantage of custom hardware for this purpose (as |
|
|
645 | evidenced by the forks module, which gives you the (i-) threads API, |
|
|
646 | just much faster). |
|
|
647 | |
|
|
648 | Sharing data in the i-threads model is done by transferring data |
|
|
649 | structures between threads using copying semantics, which is very slow - |
|
|
650 | shared data simply does not exist. Benchmarks using i-threads which are |
|
|
651 | communication-intensive show extremely bad behaviour with i-threads (in |
|
|
652 | fact, so bad that Coro, which cannot take direct advantage of multiple |
|
|
653 | CPUs, is often orders of magnitude faster because it shares data using |
|
|
654 | real threads, refer to my talk for details). |
|
|
655 | |
|
|
656 | In summary, i-threads *use* threads to implement processes, while the |
|
|
657 | compatible forks module *uses* processes to emulate, uhm, processes. |
|
|
658 | I-threads slow down every perl program when enabled, and outside of |
|
|
659 | windows, serve no (or little) practical purpose, but disadvantage every |
|
|
660 | single-threaded Perl program. |
|
|
661 | |
|
|
662 | This is the reason that I try to avoid the name "ithreads", as it is |
|
|
663 | misleading as it implies that it implements some kind of thread model |
|
|
664 | for perl, and prefer the name "windows process emulation", which |
|
|
665 | describes the actual use and behaviour of it much better. |
|
|
666 | |
588 | SEE ALSO |
667 | SEE ALSO |
589 | Event-Loop integration: Coro::AnyEvent, Coro::EV, Coro::Event. |
668 | Event-Loop integration: Coro::AnyEvent, Coro::EV, Coro::Event. |
590 | |
669 | |
591 | Debugging: Coro::Debug. |
670 | Debugging: Coro::Debug. |
592 | |
671 | |