
Comparing Coro/README (file contents):
Revision 1.25 by root, Tue Jun 30 08:28:55 2009 UTC vs.
Revision 1.29 by root, Sat Feb 19 06:51:22 2011 UTC

37 easily-identified points in your program, so locking and parallel access 37 easily-identified points in your program, so locking and parallel access
38 are rarely an issue, making thread programming much safer and easier 38 are rarely an issue, making thread programming much safer and easier
39 than using other thread models. 39 than using other thread models.
40 40
41 Unlike the so-called "Perl threads" (which are not actually real threads 41 Unlike the so-called "Perl threads" (which are not actually real threads
42 but only the windows process emulation ported to unix, and as such act 42 but only the windows process emulation (see section of same name for
43 as processes), Coro provides a full shared address space, which makes 43 more details) ported to UNIX, and as such act as processes), Coro
44 communication between threads very easy. And Coro's threads are fast, 44 provides a full shared address space, which makes communication between
45 too: disabling the Windows process emulation code in your perl and using 45 threads very easy. And coro threads are fast, too: disabling the Windows
46 Coro can easily result in a two to four times speed increase for your 46 process emulation code in your perl and using Coro can easily result in
47 programs. A parallel matrix multiplication benchmark runs over 300 times 47 a two to four times speed increase for your programs. A parallel matrix
48 multiplication benchmark (very communication-intensive) runs over 300
48 faster on a single core than perl's pseudo-threads on a quad core using 49 times faster on a single core than perls pseudo-threads on a quad core
49 all four cores. 50 using all four cores.
50 51
51 Coro achieves that by supporting multiple running interpreters that 52 Coro achieves that by supporting multiple running interpreters that
52 share data, which is especially useful to code pseudo-parallel processes 53 share data, which is especially useful to code pseudo-parallel processes
53 and for event-based programming, such as multiple HTTP-GET requests 54 and for event-based programming, such as multiple HTTP-GET requests
54 running concurrently. See Coro::AnyEvent to learn more on how to 55 running concurrently. See Coro::AnyEvent to learn more on how to
60 important global variables (see Coro::State for more configuration and 61 important global variables (see Coro::State for more configuration and
61 background info). 62 background info).
62 63
63 See also the "SEE ALSO" section at the end of this document - the Coro 64 See also the "SEE ALSO" section at the end of this document - the Coro
64 module family is quite large. 65 module family is quite large.
66
67CORO THREAD LIFE CYCLE
68 During the long and exciting (or not) life of a coro thread, it goes
69 through a number of states:
70
71 1. Creation
72 The first thing in the life of a coro thread is its creation -
73 obviously. The typical way to create a thread is to call the "async
74 BLOCK" function:
75
76 async {
77 # thread code goes here
78 };
79
80 You can also pass arguments, which are put in @_:
81
82 async {
83 print $_[1]; # prints 2
84 } 1, 2, 3;
85
86 This creates a new coro thread and puts it into the ready queue,
87 meaning it will run as soon as the CPU is free for it.
88
89 "async" will return a coro object - you can store this for future
90 reference or ignore it, the thread itself will keep a reference to
91 its thread object - threads are alive on their own.
92
93 Another way to create a thread is to call the "new" constructor with
94 a code-reference:
95
96 new Coro sub {
97 # thread code goes here
98 }, @optional_arguments;
99
100 This is quite similar to calling "async", but the important
101 difference is that the new thread is not put into the ready queue,
102 so the thread will not run until somebody puts it there. "async" is,
103 therefore, identical to this sequence:
104
105 my $coro = new Coro sub {
106 # thread code goes here
107 };
108 $coro->ready;
109 return $coro;
110
111 2. Startup
112 When a new coro thread is created, only a copy of the code reference
113 and the arguments are stored, no extra memory for stacks and so on
114 is allocated, keeping the coro thread in a low-memory state.
115
116 Only when it actually starts executing will all the resources be
117 finally allocated.
118
119 The optional arguments specified at coro creation are available in
120 @_, similar to function calls.
121
122 3. Running / Blocking
123 A lot can happen after the coro thread has started running. Quite
124 usually, it will not run to the end in one go (because you could use
125 a function instead), but it will give up the CPU regularly because
126 it waits for external events.
127
128 As long as a coro thread runs, its coro object is available in the
129 global variable $Coro::current.
130
131 The low-level way to give up the CPU is to call the scheduler, which
132 selects a new coro thread to run:
133
134 Coro::schedule;
135
136 Since running threads are not in the ready queue, calling the
137 scheduler without doing anything else will block the coro thread
138 forever - you need to arrange either for the coro to be woken up
139 (readied) by some other event or some other thread, or you can put
140 it into the ready queue before scheduling:
141
142 # this is exactly what Coro::cede does
143 $Coro::current->ready;
144 Coro::schedule;
145
146 All the higher-level synchronisation methods (Coro::Semaphore,
147 Coro::rouse_*...) are actually implemented via "->ready" and
148 "Coro::schedule".
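
As a minimal sketch of that building block (the names $waiter, $waker and @log are purely illustrative, and a real program would normally use one of the higher-level primitives instead), one thread can block itself with "Coro::schedule" and another can wake it with "->ready":

```perl
use strict;
use warnings;
use Coro;

my @log;

# The waiter removes itself from scheduling: after Coro::schedule it is
# not in the ready queue and only continues once somebody readies it.
my $waiter = async {
    push @log, "waiter: blocking";
    Coro::schedule;
    push @log, "waiter: woken up";
};

# The waker puts the waiter back into the ready queue.
my $waker = async {
    push @log, "waker: readying waiter";
    $waiter->ready;
};

# join blocks the main program until the respective thread has terminated.
$waiter->join;
$waker->join;

print "$_\n" for @log;
```

Real code should re-check its wake-up condition after "Coro::schedule" returns, since other events can ready a thread as well.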
149
150 While the coro thread is running it also might get assigned a
151 C-level thread, or the C-level thread might be unassigned from it,
152 as the Coro runtime wishes. A C-level thread needs to be assigned
153 when your perl thread calls into some C-level function and that
154 function in turn calls perl and perl then wants to switch
155 coroutines. This happens most often when you run an event loop and
156 block in the callback, or when perl itself calls some function such
157 as "AUTOLOAD" or methods via the "tie" mechanism.
158
159 4. Termination
160 Many threads actually terminate after some time. There are a number
161 of ways to terminate a coro thread, the simplest is returning from
162 the top-level code reference:
163
164 async {
165 # after returning from here, the coro thread is terminated
166 };
167
168 async {
169 return if 0.5 < rand; # terminate a little earlier, maybe
170 print "got a chance to print this\n";
171 # or here
172 };
173
174 Any values returned from the coroutine can be recovered using
175 "->join":
176
177 my $coro = async {
178 "hello, world\n" # return a string
179 };
180
181 my $hello_world = $coro->join;
182
183 print $hello_world;
184
185 Another way to terminate is to call "Coro::terminate", which works at
186 any subroutine call nesting level:
187
188 async {
189 Coro::terminate "return value 1", "return value 2";
190 };
191
192 And yet another way is to "->cancel" the coro thread from another
193 thread:
194
195 my $coro = async {
196 exit 1;
197 };
198
199 $coro->cancel; # can also accept values for ->join to retrieve
200
201 Cancellation *can* be dangerous - it's a bit like calling "exit"
202 without actually exiting, and might leave C libraries and XS modules
203 in a weird state. Unlike other thread implementations, however, Coro
204 is exceptionally safe with regards to cancellation, as perl will
205 always be in a consistent state.
206
207 So, cancelling a thread that runs in an XS event loop might not be
208 the best idea, but any other combination that deals with perl only
209 (cancelling when a thread is in a "tie" method or an "AUTOLOAD" for
210 example) is safe.
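
The following sketch ties two of these termination paths together, assuming a hypothetical helper sub named deep_helper: "Coro::terminate" is called from a nested subroutine, and the values passed to it are later collected with "->join":

```perl
use strict;
use warnings;
use Coro;

# Hypothetical helper: terminates the *calling coro thread*, not just
# this subroutine, and supplies the values that ->join will return.
sub deep_helper {
    Coro::terminate "early exit", 42;
}

my $coro = async {
    deep_helper();
    return "never reached";   # the thread is already gone at this point
};

# join blocks until the thread has terminated, then returns its values.
my @results = $coro->join;

print "got: @results\n";
```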
211
212 5. Cleanup
213 Threads will allocate various resources. Most but not all will be
214 returned when a thread terminates, during clean-up.
215
216 Cleanup is quite similar to throwing an uncaught exception: perl
217 will work its way up through all subroutine calls and blocks. On
218 its way, it will release all "my" variables, undo all "local"'s and
219 free any other resources truly local to the thread.
220
221 So, a common way to free resources is to keep them referenced only
222 by my variables:
223
224 async {
225 my $big_cache = new Cache ...;
226 };
227
228 If there are no other references, then the $big_cache object will be
229 freed when the thread terminates, regardless of how it does so.
230
231 What it does "NOT" do is unlock any Coro::Semaphores or similar
232 resources, but that's where the "guard" methods come in handy:
233
234 my $sem = new Coro::Semaphore;
235
236 async {
237 my $lock_guard = $sem->guard;
238 # if we return, or die or get cancelled, here,
239 # then the semaphore will be "up"ed.
240 };
241
242 The "Guard::guard" function comes in handy for any custom cleanup
243 you might want to do:
244
245 async {
246 my $window = new Gtk2::Window "toplevel";
247 # The window will not be cleaned up automatically, even when $window
248 # gets freed, so use a guard to ensure its destruction
249 # in case of an error:
250 my $window_guard = Guard::guard { $window->destroy };
251
252 # we are safe here
253 };
254
255 Last but not least, "local" can often be handy, too, e.g. when
256 temporarily replacing the coro thread description:
257
258 sub myfunction {
259 local $Coro::current->{desc} = "inside myfunction(@_)";
260
261 # if we return or die here, the description will be restored
262 }
263
264 6. Viva La Zombie Muerte
265 Even after a thread has terminated and cleaned up its resources,
266 the coro object still is there and stores the return values of the
267 thread. Only in this state will the coro object be "reference
268 counted" in the normal perl sense: the thread code keeps a reference
269 to it when it is active, but not after it has terminated.
270
271 This means the coro object gets freed automatically when the thread
272 has terminated and cleaned up and there are no other references.
273
274 If there are, the coro object will stay around, and you can call
275 "->join" as many times as you wish to retrieve the result values:
276
277 async {
278 print "hi\n";
279 1
280 };
281
282 # run the async above, and free everything before returning
283 # from Coro::cede:
284 Coro::cede;
285
286 {
287 my $coro = async {
288 print "hi\n";
289 1
290 };
291
292 # run the async above, and clean up, but do not free the coro
293 # object:
294 Coro::cede;
295
296 # optionally retrieve the result values
297 my @results = $coro->join;
298
299 # now $coro goes out of scope, and presumably gets freed
300 };
65 301
66GLOBAL VARIABLES 302GLOBAL VARIABLES
67 $Coro::main 303 $Coro::main
68 This variable stores the Coro object that represents the main 304 This variable stores the Coro object that represents the main
69 program. While you cna "ready" it and do most other things you can 305 program. While you cna "ready" it and do most other things you can
82 $Coro::idle 318 $Coro::idle
83 This variable is mainly useful to integrate Coro into event loops. 319 This variable is mainly useful to integrate Coro into event loops.
84 It is usually better to rely on Coro::AnyEvent or Coro::EV, as this 320 It is usually better to rely on Coro::AnyEvent or Coro::EV, as this
85 is pretty low-level functionality. 321 is pretty low-level functionality.
86 322
87 This variable stores either a Coro object or a callback. 323 This variable stores a Coro object that is put into the ready queue
324 when there are no other ready threads (without invoking any ready
325 hooks).
88 326
89 If it is a callback, the it is called whenever the scheduler finds 327 The default implementation dies with "FATAL: deadlock detected.",
90 no ready coros to run. The default implementation prints "FATAL: 328 followed by a thread listing, because the program has no other way
91 deadlock detected" and exits, because the program has no other way
92 to continue. 329 to continue.
93 330
94 If it is a coro object, then this object will be readied (without
95 invoking any ready hooks, however) when the scheduler finds no other
96 ready coros to run.
97
98 This hook is overwritten by modules such as "Coro::EV" and 331 This hook is overwritten by modules such as "Coro::EV" and
99 "Coro::AnyEvent" to wait on an external event that hopefully wake up 332 "Coro::AnyEvent" to wait on an external event that hopefully wakes
100 a coro so the scheduler can run it. 333 up a coro so the scheduler can run it.
101 334
102 Note that the callback *must not*, under any circumstances, block
103 the current coro. Normally, this is achieved by having an "idle
104 coro" that calls the event loop and then blocks again, and then
105 readying that coro in the idle handler, or by simply placing the
106 idle coro in this variable.
107
108 See Coro::Event or Coro::AnyEvent for examples of using this 335 See Coro::EV or Coro::AnyEvent for examples of using this technique.
109 technique.
110
111 Please note that if your callback recursively invokes perl (e.g. for
112 event handlers), then it must be prepared to be called recursively
113 itself.
114 336
115SIMPLE CORO CREATION 337SIMPLE CORO CREATION
116 async { ... } [@args...] 338 async { ... } [@args...]
117 Create a new coro and return its Coro object (usually unused). The 339 Create a new coro and return its Coro object (usually unused). The
118 coro will be put into the ready queue, so it will start running 340 coro will be put into the ready queue, so it will start running
181 403
182 schedule 404 schedule
183 Calls the scheduler. The scheduler will find the next coro that is 405 Calls the scheduler. The scheduler will find the next coro that is
184 to be run from the ready queue and switches to it. The next coro to 406 to be run from the ready queue and switches to it. The next coro to
185 be run is simply the one with the highest priority that is longest 407 be run is simply the one with the highest priority that is longest
186 in its ready queue. If there is no coro ready, it will clal the 408 in its ready queue. If there is no coro ready, it will call the
187 $Coro::idle hook. 409 $Coro::idle hook.
188 410
189 Please note that the current coro will *not* be put into the ready 411 Please note that the current coro will *not* be put into the ready
190 queue, so calling this function usually means you will never be 412 queue, so calling this function usually means you will never be
191 called again unless something else (e.g. an event handler) calls 413 called again unless something else (e.g. an event handler) calls
424 "terminate" or "cancel" functions. "join" can be called concurrently 646 "terminate" or "cancel" functions. "join" can be called concurrently
425 from multiple coro, and all will be resumed and given the status 647 from multiple coro, and all will be resumed and given the status
426 return once the $coro terminates. 648 return once the $coro terminates.
427 649
428 $coro->on_destroy (\&cb) 650 $coro->on_destroy (\&cb)
429 Registers a callback that is called when this coro gets destroyed, 651 Registers a callback that is called when this coro thread gets
430 but before it is joined. The callback gets passed the terminate 652 destroyed, but before it is joined. The callback gets passed the
431 arguments, if any, and *must not* die, under any circumstances. 653 terminate arguments, if any, and *must not* die, under any
654 circumstances.
655
656 There can be any number of "on_destroy" callbacks per coro.
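
A small sketch of "on_destroy" in use, under the assumption that the callback only records its arguments (the @seen array is illustrative). The callback receives the same values that "->join" returns:

```perl
use strict;
use warnings;
use Coro;

my @seen;

my $coro = async {
    Coro::terminate "status", 7;
};

# The callback must not die; here it only copies the terminate arguments.
$coro->on_destroy (sub { push @seen, @_ });

# Block until the thread has terminated; the callback runs before this returns.
my @joined = $coro->join;

print "callback saw: @seen\n";
print "join returned: @joined\n";
```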
432 657
433 $oldprio = $coro->prio ($newprio) 658 $oldprio = $coro->prio ($newprio)
434 Sets (or gets, if the argument is missing) the priority of the coro. 659 Sets (or gets, if the argument is missing) the priority of the coro
435 Higher priority coro get run before lower priority coro. Priorities 660 thread. Higher priority coro get run before lower priority coros.
436 are small signed integers (currently -4 .. +3), that you can refer 661 Priorities are small signed integers (currently -4 .. +3), that you
437 to using PRIO_xxx constants (use the import tag :prio to get then): 662 can refer to using PRIO_xxx constants (use the import tag :prio to
663 get then):
438 664
439 PRIO_MAX > PRIO_HIGH > PRIO_NORMAL > PRIO_LOW > PRIO_IDLE > PRIO_MIN 665 PRIO_MAX > PRIO_HIGH > PRIO_NORMAL > PRIO_LOW > PRIO_IDLE > PRIO_MIN
440 3 > 1 > 0 > -1 > -3 > -4 666 3 > 1 > 0 > -1 > -3 > -4
441 667
442 # set priority to HIGH 668 # set priority to HIGH
443 current->prio (PRIO_HIGH); 669 current->prio (PRIO_HIGH);
444 670
445 The idle coro ($Coro::idle) always has a lower priority than any 671 The idle coro thread ($Coro::idle) always has a lower priority than
446 existing coro. 672 any existing coro.
447 673
448 Changing the priority of the current coro will take effect 674 Changing the priority of the current coro will take effect
449 immediately, but changing the priority of coro in the ready queue 675 immediately, but changing the priority of a coro in the ready queue
450 (but not running) will only take effect after the next schedule (of 676 (but not running) will only take effect after the next schedule (of
451 that coro). This is a bug that will be fixed in some future version. 677 that coro). This is a bug that will be fixed in some future version.
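
Because of the caveat above, a sketch of priority scheduling is safest when the priority is set before the thread enters the ready queue. The example below therefore uses "new Coro" plus an explicit "->ready" instead of "async" for the high-priority thread; the variable names are illustrative:

```perl
use strict;
use warnings;
use Coro;            # default exports such as async
use Coro qw(:prio);  # additionally import the PRIO_xxx constants

my @order;

# Readied first, at normal priority.
my $low = async { push @order, "low" };

# Created without being readied, so the priority is set before queueing.
my $high = Coro->new (sub { push @order, "high" });
$high->prio (PRIO_HIGH);
$high->ready;

# Blocking in join lets the scheduler pick the highest-priority ready thread.
$low->join;
$high->join;

print "run order: @order\n";
```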
452 678
453 $newprio = $coro->nice ($change) 679 $newprio = $coro->nice ($change)
454 Similar to "prio", but subtract the given value from the priority 680 Similar to "prio", but subtract the given value from the priority
455 (i.e. higher values mean lower priority, just as in unix). 681 (i.e. higher values mean lower priority, just as in UNIX's nice
682 command).
456 683
457 $olddesc = $coro->desc ($newdesc) 684 $olddesc = $coro->desc ($newdesc)
458 Sets (or gets in case the argument is missing) the description for 685 Sets (or gets in case the argument is missing) the description for
459 this coro. This is just a free-form string you can associate with a 686 this coro thread. This is just a free-form string you can associate
460 coro. 687 with a coro.
461 688
462 This method simply sets the "$coro->{desc}" member to the given 689 This method simply sets the "$coro->{desc}" member to the given
463 string. You can modify this member directly if you wish. 690 string. You can modify this member directly if you wish, and in
691 fact, this is often preferred to indicate major processing states
692 that can then be seen for example in a Coro::Debug session:
693
694 sub my_long_function {
695 local $Coro::current->{desc} = "now in my_long_function";
696 ...
697 $Coro::current->{desc} = "my_long_function: phase 1";
698 ...
699 $Coro::current->{desc} = "my_long_function: phase 2";
700 ...
701 }
464 702
465GLOBAL FUNCTIONS 703GLOBAL FUNCTIONS
466 Coro::nready 704 Coro::nready
467 Returns the number of coro that are currently in the ready state, 705 Returns the number of coro that are currently in the ready state,
468 i.e. that can be switched to by calling "schedule" directory or 706 i.e. that can be switched to by calling "schedule" directory or
485 The reason this function exists is that many event libraries (such 723 The reason this function exists is that many event libraries (such
486 as the venerable Event module) are not thread-safe (a weaker form of 724 as the venerable Event module) are not thread-safe (a weaker form of
487 reentrancy). This means you must not block within event callbacks, 725 reentrancy). This means you must not block within event callbacks,
488 otherwise you might suffer from crashes or worse. The only event 726 otherwise you might suffer from crashes or worse. The only event
489 library currently known that is safe to use without "unblock_sub" is 727 library currently known that is safe to use without "unblock_sub" is
490 EV. 728 EV (but you might still run into deadlocks if all event loops are
729 blocked).
730
731 Coro will try to catch you when you block in the event loop
732 ("FATAL:$Coro::IDLE blocked itself"), but this is just best effort
733 and only works when you do not run your own event loop.
491 734
492 This function allows your callbacks to block by executing them in 735 This function allows your callbacks to block by executing them in
493 another coro where it is safe to block. One example where blocking 736 another coro where it is safe to block. One example where blocking
494 is handy is when you use the Coro::AIO functions to save results to 737 is handy is when you use the Coro::AIO functions to save results to
495 disk, for example. 738 disk, for example.
506 when you use a module that uses AnyEvent (and you use 749 when you use a module that uses AnyEvent (and you use
507 Coro::AnyEvent) and it provides callbacks that are the result of 750 Coro::AnyEvent) and it provides callbacks that are the result of
508 some event callback, then you must not block either, or use 751 some event callback, then you must not block either, or use
509 "unblock_sub". 752 "unblock_sub".
510 753
511 $cb = Coro::rouse_cb 754 $cb = rouse_cb
512 Create and return a "rouse callback". That's a code reference that, 755 Create and return a "rouse callback". That's a code reference that,
513 when called, will remember a copy of its arguments and notify the 756 when called, will remember a copy of its arguments and notify the
514 owner coro of the callback. 757 owner coro of the callback.
515 758
516 See the next function. 759 See the next function.
517 760
518 @args = Coro::rouse_wait [$cb] 761 @args = rouse_wait [$cb]
519 Wait for the specified rouse callback (or the last one that was 762 Wait for the specified rouse callback (or the last one that was
520 created in this coro). 763 created in this coro).
521 764
522 As soon as the callback is invoked (or when the callback was invoked 765 As soon as the callback is invoked (or when the callback was invoked
523 before "rouse_wait"), it will return the arguments originally passed 766 before "rouse_wait"), it will return the arguments originally passed
608 unix roughly halves perl performance, even when not used. 851 unix roughly halves perl performance, even when not used.
609 852
610 coro switching is not signal safe 853 coro switching is not signal safe
611 You must not switch to another coro from within a signal handler 854 You must not switch to another coro from within a signal handler
612 (only relevant with %SIG - most event libraries provide safe 855 (only relevant with %SIG - most event libraries provide safe
613 signals). 856 signals), *unless* you are sure you are not interrupting a Coro
857 function.
614 858
615 That means you *MUST NOT* call any function that might "block" the 859 That means you *MUST NOT* call any function that might "block" the
616 current coro - "cede", "schedule" "Coro::Semaphore->down" or 860 current coro - "cede", "schedule" "Coro::Semaphore->down" or
617 anything that calls those. Everything else, including calling 861 anything that calls those. Everything else, including calling
618 "ready", works. 862 "ready", works.
619 863
864WINDOWS PROCESS EMULATION
865 A great many people seem to be confused about ithreads (for example,
866 Chip Salzenberg called me unintelligent, incapable, stupid and gullible,
867 while in the same mail making rather confused statements about perl
868 ithreads (for example, that memory or files would be shared), showing
869 his lack of understanding of this area - if it is hard to understand for
870 Chip, it is probably not obvious to everybody).
871
872 What follows is an ultra-condensed version of my talk about threads in
873 scripting languages given at the perl workshop 2009:
874
875 The so-called "ithreads" were originally implemented for two reasons:
876 first, to (badly) emulate unix processes on native win32 perls, and
877 secondly, to replace the older, real thread model ("5.005-threads").
878
879 It does that by using threads instead of OS processes. The difference
880 between processes and threads is that threads share memory (and other
881 state, such as files) between threads within a single process, while
882 processes do not share anything (at least not semantically). That means
883 that modifications done by one thread are seen by others, while
884 modifications by one process are not seen by other processes.
885
886 The "ithreads" work exactly like that: when creating a new ithreads
887 process, all state is copied (memory is copied physically, files and
888 code are copied logically). Afterwards, it isolates all modifications. On
889 UNIX, the same behaviour can be achieved by using operating system
890 processes, except that UNIX typically uses hardware built into the
891 system to do this efficiently, while the windows process emulation
892 emulates this hardware in software (rather efficiently, but of course it
893 is still much slower than dedicated hardware).
894
895 As mentioned before, loading code, modifying code, modifying data
896 structures and so on is only visible in the ithreads process doing the
897 modification, not in other ithread processes within the same OS process.
898
899 This is why "ithreads" do not implement threads for perl at all, only
900 processes. What makes it so bad is that on non-windows platforms, you
901 can actually take advantage of custom hardware for this purpose (as
902 evidenced by the forks module, which gives you the (i-) threads API,
903 just much faster).
904
905 Sharing data in the i-threads model is done by transferring data
906 structures between threads using copying semantics, which is very slow -
907 shared data simply does not exist. Benchmarks using i-threads which are
908 communication-intensive show extremely bad behaviour with i-threads (in
909 fact, so bad that Coro, which cannot take direct advantage of multiple
910 CPUs, is often orders of magnitude faster because it shares data using
911 real threads, refer to my talk for details).
912
913 In summary, i-threads *use* threads to implement processes, while the
914 compatible forks module *uses* processes to emulate, uhm, processes.
915 I-threads slow down every perl program when enabled, and outside of
916 windows, serve no (or little) practical purpose, but disadvantage every
917 single-threaded Perl program.
918
919 This is the reason that I try to avoid the name "ithreads": it is
920 misleading, as it implies that it implements some kind of thread model
921 for perl. I prefer the name "windows process emulation", which
922 describes the actual use and behaviour of it much better.
923
620SEE ALSO 924SEE ALSO
621 Event-Loop integration: Coro::AnyEvent, Coro::EV, Coro::Event. 925 Event-Loop integration: Coro::AnyEvent, Coro::EV, Coro::Event.
622 926
623 Debugging: Coro::Debug. 927 Debugging: Coro::Debug.
624 928
