--- Coro/Coro.pm 2001/07/25 21:12:57 1.25 +++ Coro/Coro.pm 2008/12/13 22:08:13 1.244 @@ -1,270 +1,764 @@ =head1 NAME -Coro - coroutine process abstraction +Coro - the only real threads in perl =head1 SYNOPSIS - use Coro; - - async { - # some asynchronous thread of execution - }; - - # alternatively create an async process like this: - - sub some_func : Coro { - # some more async code - } - - cede; + use Coro; + + async { + # some asynchronous thread of execution + print "2\n"; + cede; # yield back to main + print "4\n"; + }; + print "1\n"; + cede; # yield to coroutine + print "3\n"; + cede; # and again + + # use locking + use Coro::Semaphore; + my $lock = new Coro::Semaphore; + my $locked; + + $lock->down; + $locked = 1; + $lock->up; =head1 DESCRIPTION -This module collection manages coroutines. Coroutines are similar to -Threads but don't run in parallel. +For a tutorial-style introduction, please read the L +manpage. This manpage mainly contains reference information. -This module is still experimental, see the BUGS section below. +This module collection manages continuations in general, most often +in the form of cooperative threads (also called coroutines in the +documentation). They are similar to kernel threads but don't (in general) +run in parallel at the same time even on SMP machines. The specific flavor +of thread offered by this module also guarantees you that it will not +switch between threads unless necessary, at easily-identified points in +your program, so locking and parallel access are rarely an issue, making +thread programming much safer and easier than using other thread models. + +Unlike the so-called "Perl threads" (which are not actually real threads +but only the windows process emulation ported to unix), Coro provides a +full shared address space, which makes communication between threads +very easy. And threads are fast, too: disabling the Windows process +emulation code in your perl and using Coro can easily result in a two to +four times speed increase for your programs. + +Coro achieves that by supporting multiple running interpreters that share +data, which is especially useful to code pseudo-parallel processes and +for event-based programming, such as multiple HTTP-GET requests running +concurrently. See L to learn more on how to integrate Coro +into an event-based environment. + +In this module, a thread is defined as "callchain + lexical variables + +@_ + $_ + $@ + $/ + C stack), that is, a thread has its own callchain, +its own set of lexicals and its own set of perls most important global +variables (see L for more configuration and background info). -In this module, coroutines are defined as "callchain + lexical variables -+ @_ + $_ + $@ + $^W + C stack), that is, a coroutine has it's own -callchain, it's own set of lexicals and it's own set of perl's most -important global variables. +See also the C section at the end of this document - the Coro +module family is quite large. =cut package Coro; +use strict qw(vars subs); +no warnings "uninitialized"; + use Coro::State; -use base Exporter; +use base qw(Coro::State Exporter); -$VERSION = 0.12; +our $idle; # idle handler +our $main; # main coroutine +our $current; # current coroutine -@EXPORT = qw(async cede schedule terminate current); -@EXPORT_OK = qw($current); +our $VERSION = 5.13; -{ - my @async; - - # this way of handling attributes simply is NOT scalable ;() - sub import { - Coro->export_to_level(1, @_); - my $old = *{(caller)[0]."::MODIFY_CODE_ATTRIBUTES"}{CODE}; - *{(caller)[0]."::MODIFY_CODE_ATTRIBUTES"} = sub { - my ($package, $ref) = (shift, shift); - my @attrs; - for (@_) { - if ($_ eq "Coro") { - push @async, $ref; - } else { - push @attrs, $_; - } - } - return $old ? $old->($package, $ref, @attrs) : @attrs; - }; - } +our @EXPORT = qw(async async_pool cede schedule terminate current unblock_sub); +our %EXPORT_TAGS = ( + prio => [qw(PRIO_MAX PRIO_HIGH PRIO_NORMAL PRIO_LOW PRIO_IDLE PRIO_MIN)], +); +our @EXPORT_OK = (@{$EXPORT_TAGS{prio}}, qw(nready)); - sub INIT { - &async(pop @async) while @async; - } -} +=head1 GLOBAL VARIABLES + +=over 4 -=item $main +=item $Coro::main -This coroutine represents the main program. +This variable stores the coroutine object that represents the main +program. While you cna C it and do most other things you can do to +coroutines, it is mainly useful to compare again C<$Coro::current>, to see +whether you are running in the main program or not. =cut -our $main = new Coro; +# $main is now being initialised by Coro::State -=item $current (or as function: current) +=item $Coro::current -The current coroutine (the last coroutine switched to). The initial value is C<$main> (of course). +The coroutine object representing the current coroutine (the last +coroutine that the Coro scheduler switched to). The initial value is +C<$Coro::main> (of course). + +This variable is B I. You can take copies of the +value stored in it and use it as any other coroutine object, but you must +not otherwise modify the variable itself. =cut -# maybe some other module used Coro::Specific before... -if ($current) { - $main->{specific} = $current->{specific}; -} +sub current() { $current } # [DEPRECATED] + +=item $Coro::idle + +This variable is mainly useful to integrate Coro into event loops. It is +usually better to rely on L or L, as this is +pretty low-level functionality. + +This variable stores either a coroutine or a callback. + +If it is a callback, the it is called whenever the scheduler finds no +ready coroutines to run. The default implementation prints "FATAL: +deadlock detected" and exits, because the program has no other way to +continue. -our $current = $main; +If it is a coroutine object, then this object will be readied (without +invoking any ready hooks, however) when the scheduler finds no other ready +coroutines to run. -sub current() { $current } +This hook is overwritten by modules such as C and +C to wait on an external event that hopefully wake up a +coroutine so the scheduler can run it. -=item $idle +Note that the callback I, under any circumstances, block +the current coroutine. Normally, this is achieved by having an "idle +coroutine" that calls the event loop and then blocks again, and then +readying that coroutine in the idle handler, or by simply placing the idle +coroutine in this variable. -The coroutine to switch to when no other coroutine is running. The default -implementation prints "FATAL: deadlock detected" and exits. +See L or L for examples of using this +technique. + +Please note that if your callback recursively invokes perl (e.g. for event +handlers), then it must be prepared to be called recursively itself. =cut -# should be done using priorities :( -our $idle = new Coro sub { - print STDERR "FATAL: deadlock detected\n"; - exit(51); +$idle = sub { + require Carp; + Carp::croak ("FATAL: deadlock detected"); }; # this coroutine is necessary because a coroutine # cannot destroy itself. -my @destroy; -my $manager = new Coro sub { - while() { - delete ((pop @destroy)->{_coro_state}) while @destroy; +our @destroy; +our $manager; + +$manager = new Coro sub { + while () { + Coro::_cancel shift @destroy + while @destroy; + &schedule; } }; +$manager->{desc} = "[coro manager]"; +$manager->prio (PRIO_MAX); -# we really need priorities... -my @ready; # the ready queue. hehe, rather broken ;) - -# static methods. not really. - -=head2 STATIC METHODS +=back -Static methods are actually functions that operate on the current process only. +=head1 SIMPLE COROUTINE CREATION =over 4 =item async { ... } [@args...] -Create a new asynchronous process and return it's process object -(usually unused). When the sub returns the new process is automatically +Create a new coroutine and return its coroutine object (usually +unused). The coroutine will be put into the ready queue, so +it will start running automatically on the next scheduler run. + +The first argument is a codeblock/closure that should be executed in the +coroutine. When it returns argument returns the coroutine is automatically terminated. - # create a new coroutine that just prints its arguments +The remaining arguments are passed as arguments to the closure. + +See the C constructor for info about the coroutine +environment in which coroutines are executed. + +Calling C in a coroutine will do the same as calling exit outside +the coroutine. Likewise, when the coroutine dies, the program will exit, +just as it would in the main program. + +If you do not want that, you can provide a default C handler, or +simply avoid dieing (by use of C). + +Example: Create a new coroutine that just prints its arguments. + async { print "@_\n"; } 1,2,3,4; -The coderef you submit MUST NOT be a closure that refers to variables -in an outer scope. This does NOT work. Pass arguments into it instead. - =cut sub async(&@) { - my $pid = new Coro @_; - $manager->ready; # this ensures that the stack is cloned from the manager - $pid->ready; - $pid; + my $coro = new Coro @_; + $coro->ready; + $coro } -=item schedule +=item async_pool { ... } [@args...] -Calls the scheduler. Please note that the current process will not be put -into the ready queue, so calling this function usually means you will -never be called again. +Similar to C, but uses a coroutine pool, so you should not call +terminate or join on it (although you are allowed to), and you get a +coroutine that might have executed other code already (which can be good +or bad :). + +On the plus side, this function is about twice as fast as creating (and +destroying) a completely new coroutine, so if you need a lot of generic +coroutines in quick successsion, use C, not C. + +The code block is executed in an C context and a warning will be +issued in case of an exception instead of terminating the program, as +C does. As the coroutine is being reused, stuff like C +will not work in the expected way, unless you call terminate or cancel, +which somehow defeats the purpose of pooling (but is fine in the +exceptional case). + +The priority will be reset to C<0> after each run, tracing will be +disabled, the description will be reset and the default output filehandle +gets restored, so you can change all these. Otherwise the coroutine will +be re-used "as-is": most notably if you change other per-coroutine global +stuff such as C<$/> you I revert that change, which is most +simply done by using local as in: C<< local $/ >>. + +The idle pool size is limited to C<8> idle coroutines (this can be +adjusted by changing $Coro::POOL_SIZE), but there can be as many non-idle +coros as required. + +If you are concerned about pooled coroutines growing a lot because a +single C used a lot of stackspace you can e.g. C once per second or so to slowly replenish the pool. In +addition to that, when the stacks used by a handler grows larger than 32kb +(adjustable via $Coro::POOL_RSS) it will also be destroyed. =cut -my $prev; +our $POOL_SIZE = 8; +our $POOL_RSS = 32 * 1024; +our @async_pool; + +sub pool_handler { + while () { + eval { + &{&_pool_handler} while 1; + }; -sub schedule { - # should be done using priorities :( - ($prev, $current) = ($current, shift @ready || $idle); - Coro::State::transfer($prev, $current); + warn $@ if $@; + } } +=back + +=head1 STATIC METHODS + +Static methods are actually functions that implicitly operate on the +current coroutine. + +=over 4 + +=item schedule + +Calls the scheduler. The scheduler will find the next coroutine that is +to be run from the ready queue and switches to it. The next coroutine +to be run is simply the one with the highest priority that is longest +in its ready queue. If there is no coroutine ready, it will clal the +C<$Coro::idle> hook. + +Please note that the current coroutine will I be put into the ready +queue, so calling this function usually means you will never be called +again unless something else (e.g. an event handler) calls C<< ->ready >>, +thus waking you up. + +This makes C I generic method to use to block the current +coroutine and wait for events: first you remember the current coroutine in +a variable, then arrange for some callback of yours to call C<< ->ready +>> on that once some event happens, and last you call C to put +yourself to sleep. Note that a lot of things can wake your coroutine up, +so you need to check whether the event indeed happened, e.g. by storing the +status in a variable. + +See B, below, for some ways to wait for callbacks. + =item cede -"Cede" to other processes. This function puts the current process into the -ready queue and calls C, which has the effect of giving up the -current "timeslice" to other coroutines of the same or higher priority. +"Cede" to other coroutines. This function puts the current coroutine into +the ready queue and calls C, which has the effect of giving +up the current "timeslice" to other coroutines of the same or higher +priority. Once your coroutine gets its turn again it will automatically be +resumed. -=cut +This function is often called C in other languages. -sub cede { - $current->ready; - &schedule; -} +=item Coro::cede_notself + +Works like cede, but is not exported by default and will cede to I +coroutine, regardless of priority. This is useful sometimes to ensure +progress is made. + +=item terminate [arg...] -=item terminate +Terminates the current coroutine with the given status values (see L). -Terminates the current process. +=item killall -Future versions of this function will allow result arguments. +Kills/terminates/cancels all coroutines except the currently running +one. This is useful after a fork, either in the child or the parent, as +usually only one of them should inherit the running coroutines. + +Note that while this will try to free some of the main programs resources, +you cannot free all of them, so if a coroutine that is not the main +program calls this function, there will be some one-time resource leak. =cut -sub terminate { - push @destroy, $current; - $manager->ready; - &schedule; - # NORETURN +sub killall { + for (Coro::State::list) { + $_->cancel + if $_ != $current && UNIVERSAL::isa $_, "Coro"; + } } =back -# dynamic methods - -=head2 PROCESS METHODS +=head1 COROUTINE OBJECT METHODS -These are the methods you can call on process objects. +These are the methods you can call on coroutine objects (or to create +them). =over 4 =item new Coro \&sub [, @args...] -Create a new process and return it. When the sub returns the process -automatically terminates. To start the process you must first put it into -the ready queue by calling the ready method. +Create a new coroutine and return it. When the sub returns, the coroutine +automatically terminates as if C with the returned values were +called. To make the coroutine run you must first put it into the ready +queue by calling the ready method. -The coderef you submit MUST NOT be a closure that refers to variables -in an outer scope. This does NOT work. Pass arguments into it instead. +See C and C for additional info about the +coroutine environment. =cut -sub _newcoro { +sub _coro_run { terminate &{+shift}; } -sub new { - my $class = shift; - bless { - _coro_state => (new Coro::State $_[0] && \&_newcoro, @_), - }, $class; +=item $success = $coroutine->ready + +Put the given coroutine into the end of its ready queue (there is one +queue for each priority) and return true. If the coroutine is already in +the ready queue, do nothing and return false. + +This ensures that the scheduler will resume this coroutine automatically +once all the coroutines of higher priority and all coroutines of the same +priority that were put into the ready queue earlier have been resumed. + +=item $is_ready = $coroutine->is_ready + +Return whether the coroutine is currently the ready queue or not, + +=item $coroutine->cancel (arg...) + +Terminates the given coroutine and makes it return the given arguments as +status (default: the empty list). Never returns if the coroutine is the +current coroutine. + +=cut + +sub cancel { + my $self = shift; + + if ($current == $self) { + terminate @_; + } else { + $self->{_status} = [@_]; + $self->_cancel; + } +} + +=item $coroutine->schedule_to + +Puts the current coroutine to sleep (like C), but instead +of continuing with the next coro from the ready queue, always switch to +the given coroutine object (regardless of priority etc.). The readyness +state of that coroutine isn't changed. + +This is an advanced method for special cases - I'd love to hear about any +uses for this one. + +=item $coroutine->cede_to + +Like C, but puts the current coroutine into the ready +queue. This has the effect of temporarily switching to the given +coroutine, and continuing some time later. + +This is an advanced method for special cases - I'd love to hear about any +uses for this one. + +=item $coroutine->throw ([$scalar]) + +If C<$throw> is specified and defined, it will be thrown as an exception +inside the coroutine at the next convenient point in time. Otherwise +clears the exception object. + +Coro will check for the exception each time a schedule-like-function +returns, i.e. after each C, C, C<< Coro::Semaphore->down +>>, C<< Coro::Handle->readable >> and so on. Most of these functions +detect this case and return early in case an exception is pending. + +The exception object will be thrown "as is" with the specified scalar in +C<$@>, i.e. if it is a string, no line number or newline will be appended +(unlike with C). + +This can be used as a softer means than C to ask a coroutine to +end itself, although there is no guarantee that the exception will lead to +termination, and if the exception isn't caught it might well end the whole +program. + +You might also think of C as being the moral equivalent of +Cing a coroutine with a signal (in this case, a scalar). + +=item $coroutine->join + +Wait until the coroutine terminates and return any values given to the +C or C functions. C can be called concurrently +from multiple coroutines, and all will be resumed and given the status +return once the C<$coroutine> terminates. + +=cut + +sub join { + my $self = shift; + + unless ($self->{_status}) { + my $current = $current; + + push @{$self->{_on_destroy}}, sub { + $current->ready; + undef $current; + }; + + &schedule while $current; + } + + wantarray ? @{$self->{_status}} : $self->{_status}[0]; } -=item $process->ready +=item $coroutine->on_destroy (\&cb) -Put the current process into the ready queue. +Registers a callback that is called when this coroutine gets destroyed, +but before it is joined. The callback gets passed the terminate arguments, +if any, and I die, under any circumstances. =cut -sub ready { - push @ready, $_[0]; +sub on_destroy { + my ($self, $cb) = @_; + + push @{ $self->{_on_destroy} }, $cb; +} + +=item $oldprio = $coroutine->prio ($newprio) + +Sets (or gets, if the argument is missing) the priority of the +coroutine. Higher priority coroutines get run before lower priority +coroutines. Priorities are small signed integers (currently -4 .. +3), +that you can refer to using PRIO_xxx constants (use the import tag :prio +to get then): + + PRIO_MAX > PRIO_HIGH > PRIO_NORMAL > PRIO_LOW > PRIO_IDLE > PRIO_MIN + 3 > 1 > 0 > -1 > -3 > -4 + + # set priority to HIGH + current->prio(PRIO_HIGH); + +The idle coroutine ($Coro::idle) always has a lower priority than any +existing coroutine. + +Changing the priority of the current coroutine will take effect immediately, +but changing the priority of coroutines in the ready queue (but not +running) will only take effect after the next schedule (of that +coroutine). This is a bug that will be fixed in some future version. + +=item $newprio = $coroutine->nice ($change) + +Similar to C, but subtract the given value from the priority (i.e. +higher values mean lower priority, just as in unix). + +=item $olddesc = $coroutine->desc ($newdesc) + +Sets (or gets in case the argument is missing) the description for this +coroutine. This is just a free-form string you can associate with a +coroutine. + +This method simply sets the C<< $coroutine->{desc} >> member to the given +string. You can modify this member directly if you wish. + +=cut + +sub desc { + my $old = $_[0]{desc}; + $_[0]{desc} = $_[1] if @_ > 1; + $old; +} + +sub transfer { + require Carp; + Carp::croak ("You must not call ->transfer on Coro objects. Use Coro::State objects or the ->schedule_to method. Caught"); } =back +=head1 GLOBAL FUNCTIONS + +=over 4 + +=item Coro::nready + +Returns the number of coroutines that are currently in the ready state, +i.e. that can be switched to by calling C directory or +indirectly. The value C<0> means that the only runnable coroutine is the +currently running one, so C would have no effect, and C +would cause a deadlock unless there is an idle handler that wakes up some +coroutines. + +=item my $guard = Coro::guard { ... } + +This function still exists, but is deprecated. Please use the +C function instead. + +=cut + +BEGIN { *guard = \&Guard::guard } + +=item unblock_sub { ... } + +This utility function takes a BLOCK or code reference and "unblocks" it, +returning a new coderef. Unblocking means that calling the new coderef +will return immediately without blocking, returning nothing, while the +original code ref will be called (with parameters) from within another +coroutine. + +The reason this function exists is that many event libraries (such as the +venerable L module) are not coroutine-safe (a weaker form +of reentrancy). This means you must not block within event callbacks, +otherwise you might suffer from crashes or worse. The only event library +currently known that is safe to use without C is L. + +This function allows your callbacks to block by executing them in another +coroutine where it is safe to block. One example where blocking is handy +is when you use the L functions to save results to +disk, for example. + +In short: simply use C instead of C when +creating event callbacks that want to block. + +If your handler does not plan to block (e.g. simply sends a message to +another coroutine, or puts some other coroutine into the ready queue), +there is no reason to use C. + +Note that you also need to use C for any other callbacks that +are indirectly executed by any C-based event loop. For example, when you +use a module that uses L (and you use L) and it +provides callbacks that are the result of some event callback, then you +must not block either, or use C. + +=cut + +our @unblock_queue; + +# we create a special coro because we want to cede, +# to reduce pressure on the coro pool (because most callbacks +# return immediately and can be reused) and because we cannot cede +# inside an event callback. +our $unblock_scheduler = new Coro sub { + while () { + while (my $cb = pop @unblock_queue) { + &async_pool (@$cb); + + # for short-lived callbacks, this reduces pressure on the coro pool + # as the chance is very high that the async_poll coro will be back + # in the idle state when cede returns + cede; + } + schedule; # sleep well + } +}; +$unblock_scheduler->{desc} = "[unblock_sub scheduler]"; + +sub unblock_sub(&) { + my $cb = shift; + + sub { + unshift @unblock_queue, [$cb, @_]; + $unblock_scheduler->ready; + } +} + +=item $cb = Coro::rouse_cb + +Create and return a "rouse callback". That's a code reference that, +when called, will remember a copy of its arguments and notify the owner +coroutine of the callback. + +See the next function. + +=item @args = Coro::rouse_wait [$cb] + +Wait for the specified rouse callback (or the last one that was created in +this coroutine). + +As soon as the callback is invoked (or when the callback was invoked +before C), it will return the arguments originally passed to +the rouse callback. + +See the section B for an actual usage example. + +=back + =cut 1; +=head1 HOW TO WAIT FOR A CALLBACK + +It is very common for a coroutine to wait for some callback to be +called. This occurs naturally when you use coroutines in an otherwise +event-based program, or when you use event-based libraries. + +These typically register a callback for some event, and call that callback +when the event occured. In a coroutine, however, you typically want to +just wait for the event, simplyifying things. + +For example C<< AnyEvent->child >> registers a callback to be called when +a specific child has exited: + + my $child_watcher = AnyEvent->child (pid => $pid, cb => sub { ... }); + +But from withina coroutine, you often just want to write this: + + my $status = wait_for_child $pid; + +Coro offers two functions specifically designed to make this easy, +C and C. + +The first function, C, generates and returns a callback that, +when invoked, will save its arguments and notify the coroutine that +created the callback. + +The second function, C, waits for the callback to be called +(by calling C to go to sleep) and returns the arguments +originally passed to the callback. + +Using these functions, it becomes easy to write the C +function mentioned above: + + sub wait_for_child($) { + my ($pid) = @_; + + my $watcher = AnyEvent->child (pid => $pid, cb => Coro::rouse_cb); + + my ($rpid, $rstatus) = Coro::rouse_wait; + $rstatus + } + +In the case where C and C are not flexible enough, +you can roll your own, using C: + + sub wait_for_child($) { + my ($pid) = @_; + + # store the current coroutine in $current, + # and provide result variables for the closure passed to ->child + my $current = $Coro::current; + my ($done, $rstatus); + + # pass a closure to ->child + my $watcher = AnyEvent->child (pid => $pid, cb => sub { + $rstatus = $_[1]; # remember rstatus + $done = 1; # mark $rstatus as valud + }); + + # wait until the closure has been called + schedule while !$done; + + $rstatus + } + + =head1 BUGS/LIMITATIONS - - could be faster, especially when the core would introduce special - support for coroutines (like it does for threads). - - there is still a memleak on coroutine termination that I could not - identify. Could be as small as a single SV. - - this module is not well-tested. - - if variables or arguments "disappear" (become undef) or become - corrupted please contact the author so he cen iron out the - remaining bugs. - - this module is not thread-safe. You must only ever use this module from - the same thread (this requirement might be loosened in the future to - allow per-thread schedulers, but Coro::State does not yet allow this). +=over 4 + +=item fork with pthread backend + +When Coro is compiled using the pthread backend (which isn't recommended +but required on many BSDs as their libcs are completely broken), then +coroutines will not survive a fork. There is no known workaround except to +fix your libc and use a saner backend. + +=item perl process emulation ("threads") + +This module is not perl-pseudo-thread-safe. You should only ever use this +module from the first thread (this requirement might be removed in the +future to allow per-thread schedulers, but Coro::State does not yet allow +this). I recommend disabling thread support and using processes, as having +the windows process emulation enabled under unix roughly halves perl +performance, even when not used. + +=item coroutine switching not signal safe + +You must not switch to another coroutine from within a signal handler +(only relevant with %SIG - most event libraries provide safe signals). + +That means you I call any function that might "block" the +current coroutine - C, C C<< Coro::Semaphore->down >> or +anything that calls those. Everything else, including calling C, +works. + +=back + =head1 SEE ALSO -L, L, L, L, -L, L, L, L, -L, L. +Event-Loop integration: L, L, L. + +Debugging: L. + +Support/Utility: L, L. + +Locking and IPC: L, L, L, +L, L. + +I/O and Timers: L, L, L, L. + +Compatibility with other modules: L (but see also L for +a better-working alternative), L, L, +L. + +XS API: L. + +Low level Configuration, Thread Environment, Continuations: L. =head1 AUTHOR - Marc Lehmann - http://www.goof.com/pcg/marc/ + Marc Lehmann + http://home.schmorp.de/ =cut