--- Coro/Coro.pm	2011/02/13 04:39:14	1.284
+++ Coro/Coro.pm	2011/08/03 14:52:18	1.304
@@ -18,7 +18,6 @@
   cede; # and again
 
   # use locking
-  use Coro::Semaphore;
   my $lock = new Coro::Semaphore;
   my $locked;
 
@@ -42,14 +41,15 @@
 thread models.
 
 Unlike the so-called "Perl threads" (which are not actually real threads
-but only the windows process emulation (see section of same name for more
-details) ported to unix, and as such act as processes), Coro provides
-a full shared address space, which makes communication between threads
-very easy. And Coro's threads are fast, too: disabling the Windows
+but only the windows process emulation (see section of same name for
+more details) ported to UNIX, and as such act as processes), Coro
+provides a full shared address space, which makes communication between
+threads very easy. And coro threads are fast, too: disabling the Windows
 process emulation code in your perl and using Coro can easily result in
 a two to four times speed increase for your programs. A parallel matrix
-multiplication benchmark runs over 300 times faster on a single core than
-perl's pseudo-threads on a quad core using all four cores.
+multiplication benchmark (very communication-intensive) runs over 300
+times faster on a single core than perl's pseudo-threads on a quad core
+using all four cores.
 
 Coro achieves that by supporting multiple running interpreters that share
 data, which is especially useful to code pseudo-parallel processes and
@@ -65,6 +65,267 @@
 See also the C section at the end of this document - the Coro module
 family is quite large.
 
+=head1 CORO THREAD LIFE CYCLE
+
+During the long and exciting (or not) life of a coro thread, it goes
+through a number of states:
+
+=over 4
+
+=item 1. Creation
+
+The first thing in the life of a coro thread is its creation -
+obviously. The typical way to create a thread is to call the C function:
+
+   async {
+      # thread code goes here
+   };
+
+You can also pass arguments, which are put in C<@_>:
+
+   async {
+      print $_[1]; # prints 2
+   } 1, 2, 3;
+
+This creates a new coro thread and puts it into the ready queue, meaning
+it will run as soon as the CPU is free for it.
+
+C will return a Coro object - you can store this for future
+reference or ignore it - a thread that is running, ready to run or waiting
+for some event is alive on its own.
+
+Another way to create a thread is to call the C constructor with a
+code-reference:
+
+   new Coro sub {
+      # thread code goes here
+   }, @optional_arguments;
+
+This is quite similar to calling C, but the important difference is
+that the new thread is not put into the ready queue, so the thread will
+not run until somebody puts it there. C is, therefore, identical to
+this sequence:
+
+   my $coro = new Coro sub {
+      # thread code goes here
+   };
+   $coro->ready;
+   return $coro;
+
+=item 2. Startup
+
+When a new coro thread is created, only a copy of the code reference
+and the arguments are stored, no extra memory for stacks and so on is
+allocated, keeping the coro thread in a low-memory state.
+
+Only when it actually starts executing will all the resources be finally
+allocated.
+
+The optional arguments specified at coro creation are available in C<@_>,
+similar to function calls.
+
+=item 3. Running / Blocking
+
+A lot can happen after the coro thread has started running. Quite usually,
+it will not run to the end in one go (because you could use a function
+instead), but it will give up the CPU regularly because it waits for
+external events.
+
+As long as a coro thread runs, its Coro object is available in the global
+variable C<$Coro::current>.
+
+The low-level way to give up the CPU is to call the scheduler, which
+selects a new coro thread to run:
+
+   Coro::schedule;
+
+Since running threads are not in the ready queue, calling the scheduler
+without doing anything else will block the coro thread forever - you need
+to arrange either for the coro to be woken up (readied) by some other
+event or some other thread, or you can put it into the ready queue before
+scheduling:
+
+   # this is exactly what Coro::cede does
+   $Coro::current->ready;
+   Coro::schedule;
+
+All the higher-level synchronisation methods (Coro::Semaphore,
+Coro::rouse_*...) are actually implemented via C<< ->ready >> and C<<
+Coro::schedule >>.
+
+While the coro thread is running it also might get assigned a C-level
+thread, or the C-level thread might be unassigned from it, as the Coro
+runtime wishes. A C-level thread needs to be assigned when your perl
+thread calls into some C-level function and that function in turn calls
+perl and perl then wants to switch coroutines. This happens most often
+when you run an event loop and block in the callback, or when perl
+itself calls some function such as C or methods via the C
+mechanism.
+
+=item 4. Termination
+
+Many threads actually terminate after some time. There are a number of
+ways to terminate a coro thread, the simplest is returning from the
+top-level code reference:
+
+   async {
+      # after returning from here, the coro thread is terminated
+   };
+
+   async {
+      return if 0.5 < rand; # terminate a little earlier, maybe
+      print "got a chance to print this\n";
+      # or here
+   };
+
+Any values returned from the coroutine can be recovered using C<< ->join
+>>:
+
+   my $coro = async {
+      "hello, world\n" # return a string
+   };
+
+   my $hello_world = $coro->join;
+
+   print $hello_world;
+
+Another way to terminate is to call C<< Coro::terminate >>, which works at any
+subroutine call nesting level:
+
+   async {
+      Coro::terminate "return value 1", "return value 2";
+   };
+
+And yet another way is to C<< ->cancel >> (or C<< ->safe_cancel >>) the
+coro thread from another thread:
+
+   my $coro = async {
+      exit 1;
+   };
+
+   $coro->cancel; # also accepts values for ->join to retrieve
+
+Cancellation I<can> be dangerous - it's a bit like calling C without
+actually exiting, and might leave C libraries and XS modules in a weird
+state. Unlike other thread implementations, however, Coro is exceptionally
+safe with regards to cancellation, as perl will always be in a consistent
+state, and for those cases where you want to do truly marvellous things
+with your coro while it is being cancelled - that is, make sure all
+cleanup code is executed from the thread being cancelled - there is even a
+C<< ->safe_cancel >> method.
+
+So, cancelling a thread that runs in an XS event loop might not be the
+best idea, but any other combination that deals with perl only (cancelling
+when a thread is in a C method or an C for example) is
+safe.
+
+Lastly, a coro thread object that isn't referenced is C<< ->cancel >>'ed
+automatically - just like other objects in Perl. This is not such a common
+case, however - a running thread is referenced by C<$Coro::current>, a
+thread ready to run is referenced by the ready queue, a thread waiting
+on a lock or semaphore is referenced by being in some wait list and so
+on. But a thread that isn't in any of those queues gets cancelled:
+
+   async {
+      schedule; # cede to other coros, don't go into the ready queue
+   };
+
+   cede;
+   # now the async above is destroyed, as it is not referenced by anything.
+
+=item 5. Cleanup
+
+Threads will allocate various resources. Most but not all will be returned
+when a thread terminates, during clean-up.
+
+Cleanup is quite similar to throwing an uncaught exception: perl will
+work its way up through all subroutine calls and blocks. On its way, it
+will release all C variables, undo all C's and free any other
+resources truly local to the thread.
+
+So, a common way to free resources is to keep them referenced only by my
+variables:
+
+   async {
+      my $big_cache = new Cache ...;
+   };
+
+If there are no other references, then the C<$big_cache> object will be
+freed when the thread terminates, regardless of how it does so.
+
+What it does C<NOT> do is unlock any Coro::Semaphores or similar
+resources, but that's where the C methods come in handy:
+
+   my $sem = new Coro::Semaphore;
+
+   async {
+      my $lock_guard = $sem->guard;
+      # if we return, or die or get cancelled, here,
+      # then the semaphore will be "up"ed.
+   };
+
+The C function comes in handy for any custom cleanup you
+might want to do (but you cannot switch to other coroutines from those
+code blocks):
+
+   async {
+      my $window = new Gtk2::Window "toplevel";
+      # The window will not be cleaned up automatically, even when $window
+      # gets freed, so use a guard to ensure its destruction
+      # in case of an error:
+      my $window_guard = Guard::guard { $window->destroy };
+
+      # we are safe here
+   };
+
+Last but not least, C can often be handy, too, e.g. when temporarily
+replacing the coro thread description:
+
+   sub myfunction {
+      local $Coro::current->{desc} = "inside myfunction(@_)";
+
+      # if we return or die here, the description will be restored
+   }
+
+=item 6. Viva La Zombie Muerte
+
+Even after a thread has terminated and cleaned up its resources, the Coro
+object still is there and stores the return values of the thread.
+
+This means the Coro object gets freed automatically when the thread has
+terminated and cleaned up and there are no other references.
+
+If there are, the Coro object will stay around, and you can call C<<
+->join >> as many times as you wish to retrieve the result values:
+
+   async {
+      print "hi\n";
+      1
+   };
+
+   # run the async above, and free everything before returning
+   # from Coro::cede:
+   Coro::cede;
+
+   {
+      my $coro = async {
+         print "hi\n";
+         1
+      };
+
+      # run the async above, and clean up, but do not free the coro
+      # object:
+      Coro::cede;
+
+      # optionally retrieve the result values
+      my @results = $coro->join;
+
+      # now $coro goes out of scope, and presumably gets freed
+   };
+
+=back
+
 =cut
 
 package Coro;
@@ -83,7 +344,7 @@
 our $main;    # main coro
 our $current; # current coro
 
-our $VERSION = 5.26;
+our $VERSION = 6.04;
 
 our @EXPORT = qw(async async_pool cede schedule terminate current unblock_sub rouse_cb rouse_wait);
 our %EXPORT_TAGS = (
@@ -133,7 +394,7 @@
 by a thread listing, because the program has no other way to continue.
 
 This hook is overwritten by modules such as C and
-C to wait on an external event that hopefully wake up a
+C to wait on an external event that hopefully wakes up a
 coro so the scheduler can run it.
 
 See L or L for examples of using this technique.
@@ -154,7 +415,7 @@
 
 $manager = new Coro sub {
    while () {
-      Coro::State::cancel shift @destroy
+      _destroy shift @destroy
          while @destroy;
 
       &schedule;
@@ -298,7 +559,8 @@
 
 =item terminate [arg...]
 
-Terminates the current coro with the given status values (see L).
+Terminates the current coro with the given status values (see
+L). The values will not be copied, but referenced directly.
 
 =item Coro::on_enter BLOCK, Coro::on_leave BLOCK
 
@@ -464,6 +726,23 @@
 against spurious wakeups, and the one in the Coro family certainly do
 that.
 
+=item $state->is_new
+
+Returns true iff this Coro object is "new", i.e. has never been run
+yet. Those states basically consist of only the code reference to call and
+the arguments, but consume very little other resources. New states will
+automatically get assigned a perl interpreter when they are transferred to.
+
+=item $state->is_zombie
+
+Returns true iff the Coro object has been cancelled, i.e.
+its resources freed because they were C'ed, C'd,
+C'ed or simply went out of scope.
+
+The name "zombie" stems from UNIX culture, where a process that has
+exited and only stores an exit status and no other resources is called a
+"zombie".
+
 =item $is_ready = $coro->is_ready
 
 Returns true iff the Coro object is in the ready queue. Unless the Coro
@@ -482,22 +761,78 @@
 
 =item $coro->cancel (arg...)
 
-Terminates the given Coro and makes it return the given arguments as
-status (default: the empty list). Never returns if the Coro is the
+Terminates the given Coro thread and makes it return the given arguments as
+status (default: an empty list). Never returns if the Coro is the
 current Coro.
 
-=cut
-
-sub cancel {
-   my $self = shift;
+This is a rather brutal way to free a coro, with some limitations - if
+the thread is inside a C callback that doesn't expect to be canceled,
+bad things can happen, or if the cancelled thread insists on running
+complicated cleanup handlers that rely on its thread context, things will
+not work.
+
+Any cleanup code being run (e.g. from C blocks) will be run without
+a thread context, and is not allowed to switch to other threads. On the
+plus side, C<< ->cancel >> will always clean up the thread, no matter
+what. If your cleanup code is complex or you want to avoid cancelling a
+C-thread that doesn't know how to clean up itself, it can be better to C<<
+->throw >> an exception, or use C<< ->safe_cancel >>.
+
+The arguments to C<< ->cancel >> are not copied, but instead will
+be referenced directly (e.g. if you pass C<$var> and after the call
+change that variable, then you might change the return values passed to
+e.g. C, so don't do that).
+
+The resources of the Coro are usually freed (or destructed) before this
+call returns, but this can be delayed for an indefinite amount of time, as
+in some cases the manager thread has to run first to actually destruct the
+Coro object.
+
+=item $coro->safe_cancel ($arg...)
+
+Works mostly like C<< ->cancel >>, but is inherently "safer", and
+consequently, can fail with an exception in cases where the thread is not in a
+cancellable state.
+
+This method works a bit like throwing an exception that cannot be caught
+- specifically, it will clean up the thread from within itself, so
+all cleanup handlers (e.g. C blocks) are run with full thread
+context and can block if they wish. The downside is that there is no
+guarantee that the thread can be cancelled when you call this method, and
+therefore, it might fail. It is also considerably slower than C or
+C.
+
+A thread is in a safe-cancellable state if it either hasn't been run yet,
+or it has no C context attached and is inside an SLF function.
+
+The latter two basically mean that the thread isn't currently inside a
+perl callback called from some C function (usually via some XS modules)
+and isn't currently executing inside some C function itself (via Coro's XS
+API).
+
+This call returns true when it could cancel the thread, or croaks with an
+error otherwise (i.e. it either returns true or doesn't return at all).
+
+Why the weird interface? Well, there are two common models on how and
+when to cancel things. In the first, you have the expectation that your
+coro thread can be cancelled when you want to cancel it - if the thread
+isn't cancellable, this would be a bug somewhere, so C<< ->safe_cancel >>
+croaks to notify of the bug.
+
+In the second model you sometimes want to ask nicely to cancel a thread,
+but if it's not a good time, well, then don't cancel. This can be done
+relatively easily like this:
 
-   if ($current == $self) {
-      terminate @_;
-   } else {
-      $self->{_status} = [@_];
-      Coro::State::cancel $self;
+   if (! eval { $coro->safe_cancel }) {
+      warn "unable to cancel thread: $@";
    }
-}
+
+However, what you never should do is first try to cancel "safely" and
+if that fails, cancel the "hard" way with C<< ->cancel >>. That makes
+no sense: either you rely on being able to execute cleanup code in your
+thread context, or you don't. If you do, then C<< ->safe_cancel >> is the
+only way, and if you don't, then C<< ->cancel >> is always faster and more
+direct.
 
 =item $coro->schedule_to
 
@@ -526,17 +861,18 @@
 
 Coro will check for the exception each time a schedule-like-function
 returns, i.e. after each C, C, C<< Coro::Semaphore->down
->>, C<< Coro::Handle->readable >> and so on. Most of these functions
-detect this case and return early in case an exception is pending.
+>>, C<< Coro::Handle->readable >> and so on. Most of those functions (all
+that are part of Coro itself) detect this case and return early in case an
+exception is pending.
 
 The exception object will be thrown "as is" with the specified scalar in
 C<$@>, i.e. if it is a string, no line number or newline will be appended
 (unlike with C).
 
-This can be used as a softer means than C to ask a coro to
-end itself, although there is no guarantee that the exception will lead to
-termination, and if the exception isn't caught it might well end the whole
-program.
+This can be used as a softer means than either C or C
+to ask a coro to end itself, although there is no guarantee that the
+exception will lead to termination, and if the exception isn't caught it
+might well end the whole program.
 
 You might also think of C as being the moral equivalent of
 Cing a coro with a signal (in this case, a scalar).
@@ -545,43 +881,18 @@
 
 Wait until the coro terminates and return any values given to the
 C or C functions. C can be called concurrently
-from multiple coro, and all will be resumed and given the status
+from multiple threads, and all will be resumed and given the status
 return once the C<$coro> terminates.
 
-=cut
-
-sub join {
-   my $self = shift;
-
-   unless ($self->{_status}) {
-      my $current = $current;
-
-      push @{$self->{_on_destroy}}, sub {
-         $current->ready;
-         undef $current;
-      };
-
-      &schedule while $current;
-   }
-
-   wantarray ? @{$self->{_status}} : $self->{_status}[0];
-}
-
 =item $coro->on_destroy (\&cb)
 
 Registers a callback that is called when this coro thread gets destroyed,
-but before it is joined. The callback gets passed the terminate arguments,
-if any, and I<must not> die, under any circumstances.
-
-There can be any number of C callbacks per coro.
-
-=cut
+that is, after its resources have been freed but before it is joined. The
+callback gets passed the terminate/cancel arguments, if any, and I<must
+not> die, under any circumstances.
 
-sub on_destroy {
-   my ($self, $cb) = @_;
-
-   push @{ $self->{_on_destroy} }, $cb;
-}
+There can be any number of C callbacks per coro, and there is
+no way currently to remove a callback once added.
 
 =item $oldprio = $coro->prio ($newprio)
 
@@ -863,6 +1174,9 @@
 the windows process emulation enabled under unix roughly halves perl
 performance, even when not used.
 
+Attempts to use threads created in another emulated process will crash
+("cleanly", with a null pointer exception).
+
 =item coro switching is not signal safe
 
 You must not switch to another coro from within a signal handler (only