--- cvsroot/Coro/Coro.pm 2008/12/13 19:18:36 1.243 +++ cvsroot/Coro/Coro.pm 2016/06/26 21:46:03 1.343 @@ -13,12 +13,11 @@ print "4\n"; }; print "1\n"; - cede; # yield to coroutine + cede; # yield to coro print "3\n"; cede; # and again # use locking - use Coro::Semaphore; my $lock = new Coro::Semaphore; my $locked; @@ -31,21 +30,26 @@ For a tutorial-style introduction, please read the L manpage. This manpage mainly contains reference information. -This module collection manages continuations in general, most often -in the form of cooperative threads (also called coroutines in the -documentation). They are similar to kernel threads but don't (in general) -run in parallel at the same time even on SMP machines. The specific flavor -of thread offered by this module also guarantees you that it will not -switch between threads unless necessary, at easily-identified points in -your program, so locking and parallel access are rarely an issue, making -thread programming much safer and easier than using other thread models. +This module collection manages continuations in general, most often in +the form of cooperative threads (also called coros, or simply "coro" +in the documentation). They are similar to kernel threads but don't (in +general) run in parallel at the same time even on SMP machines. The +specific flavor of thread offered by this module also guarantees you that +it will not switch between threads unless necessary, at easily-identified +points in your program, so locking and parallel access are rarely an +issue, making thread programming much safer and easier than using other +thread models. Unlike the so-called "Perl threads" (which are not actually real threads -but only the windows process emulation ported to unix), Coro provides a -full shared address space, which makes communication between threads -very easy. And threads are fast, too: disabling the Windows process -emulation code in your perl and using Coro can easily result in a two to -four times speed increase for your programs. +but only the windows process emulation (see section of same name for +more details) ported to UNIX, and as such act as processes), Coro +provides a full shared address space, which makes communication between +threads very easy. And coro threads are fast, too: disabling the Windows +process emulation code in your perl and using Coro can easily result in +a two to four times speed increase for your programs. A parallel matrix +multiplication benchmark (very communication-intensive) runs over 300 +times faster on a single core than perls pseudo-threads on a quad core +using all four cores. Coro achieves that by supporting multiple running interpreters that share data, which is especially useful to code pseudo-parallel processes and @@ -54,31 +58,319 @@ into an event-based environment. In this module, a thread is defined as "callchain + lexical variables + -@_ + $_ + $@ + $/ + C stack), that is, a thread has its own callchain, +some package variables + C stack), that is, a thread has its own callchain, its own set of lexicals and its own set of perls most important global variables (see L for more configuration and background info). See also the C section at the end of this document - the Coro module family is quite large. +=head1 CORO THREAD LIFE CYCLE + +During the long and exciting (or not) life of a coro thread, it goes +through a number of states: + +=over 4 + +=item 1. Creation + +The first thing in the life of a coro thread is it's creation - +obviously. 
The typical way to create a thread is to call the C function: + + async { + # thread code goes here + }; + +You can also pass arguments, which are put in C<@_>: + + async { + print $_[1]; # prints 2 + } 1, 2, 3; + +This creates a new coro thread and puts it into the ready queue, meaning +it will run as soon as the CPU is free for it. + +C will return a Coro object - you can store this for future +reference or ignore it - a thread that is running, ready to run or waiting +for some event is alive on it's own. + +Another way to create a thread is to call the C constructor with a +code-reference: + + new Coro sub { + # thread code goes here + }, @optional_arguments; + +This is quite similar to calling C, but the important difference is +that the new thread is not put into the ready queue, so the thread will +not run until somebody puts it there. C is, therefore, identical to +this sequence: + + my $coro = new Coro sub { + # thread code goes here + }; + $coro->ready; + return $coro; + +=item 2. Startup + +When a new coro thread is created, only a copy of the code reference +and the arguments are stored, no extra memory for stacks and so on is +allocated, keeping the coro thread in a low-memory state. + +Only when it actually starts executing will all the resources be finally +allocated. + +The optional arguments specified at coro creation are available in C<@_>, +similar to function calls. + +=item 3. Running / Blocking + +A lot can happen after the coro thread has started running. Quite usually, +it will not run to the end in one go (because you could use a function +instead), but it will give up the CPU regularly because it waits for +external events. + +As long as a coro thread runs, its Coro object is available in the global +variable C<$Coro::current>. + +The low-level way to give up the CPU is to call the scheduler, which +selects a new coro thread to run: + + Coro::schedule; + +Since running threads are not in the ready queue, calling the scheduler +without doing anything else will block the coro thread forever - you need +to arrange either for the coro to put woken up (readied) by some other +event or some other thread, or you can put it into the ready queue before +scheduling: + + # this is exactly what Coro::cede does + $Coro::current->ready; + Coro::schedule; + +All the higher-level synchronisation methods (Coro::Semaphore, +Coro::rouse_*...) are actually implemented via C<< ->ready >> and C<< +Coro::schedule >>. + +While the coro thread is running it also might get assigned a C-level +thread, or the C-level thread might be unassigned from it, as the Coro +runtime wishes. A C-level thread needs to be assigned when your perl +thread calls into some C-level function and that function in turn calls +perl and perl then wants to switch coroutines. This happens most often +when you run an event loop and block in the callback, or when perl +itself calls some function such as C or methods via the C +mechanism. + +=item 4. Termination + +Many threads actually terminate after some time. 
There are a number of ways to terminate a coro thread; the simplest is
returning from the top-level code reference:

    async {
        # after returning from here, the coro thread is terminated
    };

    async {
        return if 0.5 < rand; # terminate a little earlier, maybe
        print "got a chance to print this\n";
        # or here
    };

Any values returned from the coro thread can be recovered using
C<< ->join >>:

    my $coro = async {
        "hello, world\n" # return a string
    };

    my $hello_world = $coro->join;

    print $hello_world;

Another way to terminate is to call C<< Coro::terminate >>, which works
from any subroutine call nesting level:

    async {
        Coro::terminate "return value 1", "return value 2";
    };

Yet another way is to C<< ->cancel >> (or C<< ->safe_cancel >>) the coro
thread from another thread:

    my $coro = async {
        exit 1;
    };

    $coro->cancel; # also accepts values for ->join to retrieve

Cancellation I<can> be dangerous - it's a bit like calling C<exit> without
actually exiting, and might leave C libraries and XS modules in a weird
state. Unlike other thread implementations, however, Coro is exceptionally
safe with regard to cancellation, as perl will always be in a consistent
state, and for those cases where you want to do truly marvellous things
with your coro while it is being cancelled - that is, make sure all
cleanup code is executed from the thread being cancelled - there is even a
C<< ->safe_cancel >> method.

So, cancelling a thread that runs in an XS event loop might not be the
best idea, but any other combination that deals with perl only (cancelling
when a thread is in a C<tie> method or an C<AUTOLOAD>, for example) is
safe.

Last but not least, a coro thread object that isn't referenced is C<<
->cancel >>'ed automatically - just like other objects in Perl. This is
not such a common case, however - a running thread is referenced by
C<$Coro::current>, a thread ready to run is referenced by the ready queue,
a thread waiting on a lock or semaphore is referenced by being in some
wait list, and so on. But a thread that isn't in any of those queues gets
cancelled:

    async {
        schedule; # cede to other coros, don't go into the ready queue
    };

    cede;
    # now the async above is destroyed, as it is not referenced by anything.

A slightly embellished example might make it clearer:

    async {
        my $guard = Guard::guard { print "destroyed\n" };
        schedule while 1;
    };

    cede;

Superficially one might not expect any output - since the C<async> block
implements an endless loop, the C<$guard> will not be cleaned up. However,
since the thread object returned by C<async> is not stored anywhere, the
thread is initially referenced only because it is in the ready queue; when
it runs, it is referenced by C<$Coro::current>; but when it calls
C<schedule>, it gets cancelled, causing the guard object to be destroyed
(see the next section) and printing its message.

If this seems a bit drastic, remember that this only happens when nothing
references the thread anymore, which means there is no way to further
execute it, ever. The only options at this point are leaking the thread,
or cleaning it up, which brings us to...

=item 5. Cleanup

Threads will allocate various resources. Most but not all will be returned
when a thread terminates, during clean-up.

Cleanup is quite similar to throwing an uncaught exception: perl will
work its way up through all subroutine calls and blocks. On its way, it
will release all C<my> variables, undo all C<local>s and free any other
resources truly local to the thread.
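For example (a sketch - C<$state> is a made-up package variable that
nothing else touches), a C<local> still in effect when the thread is
cleaned up is undone at that point, no matter how the thread ended:

    our $state = "idle";

    async {
        local $state = "busy"; # undone again during cleanup
        # ... do the actual work ...
    };

    cede; # run the thread; once it has terminated and been
          # cleaned up, $state is "idle" again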
+ +So, a common way to free resources is to keep them referenced only by my +variables: + + async { + my $big_cache = new Cache ...; + }; + +If there are no other references, then the C<$big_cache> object will be +freed when the thread terminates, regardless of how it does so. + +What it does C do is unlock any Coro::Semaphores or similar +resources, but that's where the C methods come in handy: + + my $sem = new Coro::Semaphore; + + async { + my $lock_guard = $sem->guard; + # if we return, or die or get cancelled, here, + # then the semaphore will be "up"ed. + }; + +The C function comes in handy for any custom cleanup you +might want to do (but you cannot switch to other coroutines from those +code blocks): + + async { + my $window = new Gtk2::Window "toplevel"; + # The window will not be cleaned up automatically, even when $window + # gets freed, so use a guard to ensure it's destruction + # in case of an error: + my $window_guard = Guard::guard { $window->destroy }; + + # we are safe here + }; + +Last not least, C can often be handy, too, e.g. when temporarily +replacing the coro thread description: + + sub myfunction { + local $Coro::current->{desc} = "inside myfunction(@_)"; + + # if we return or die here, the description will be restored + } + +=item 6. Viva La Zombie Muerte + +Even after a thread has terminated and cleaned up its resources, the Coro +object still is there and stores the return values of the thread. + +When there are no other references, it will simply be cleaned up and +freed. + +If there areany references, the Coro object will stay around, and you +can call C<< ->join >> as many times as you wish to retrieve the result +values: + + async { + print "hi\n"; + 1 + }; + + # run the async above, and free everything before returning + # from Coro::cede: + Coro::cede; + + { + my $coro = async { + print "hi\n"; + 1 + }; + + # run the async above, and clean up, but do not free the coro + # object: + Coro::cede; + + # optionally retrieve the result values + my @results = $coro->join; + + # now $coro goes out of scope, and presumably gets freed + }; + +=back + =cut package Coro; -use strict qw(vars subs); -no warnings "uninitialized"; +use common::sense; + +use Carp (); + +use Guard (); use Coro::State; use base qw(Coro::State Exporter); our $idle; # idle handler -our $main; # main coroutine -our $current; # current coroutine +our $main; # main coro +our $current; # current coro -our $VERSION = 5.12; +our $VERSION = 6.511; -our @EXPORT = qw(async async_pool cede schedule terminate current unblock_sub); +our @EXPORT = qw(async async_pool cede schedule terminate current unblock_sub rouse_cb rouse_wait); our %EXPORT_TAGS = ( prio => [qw(PRIO_MAX PRIO_HIGH PRIO_NORMAL PRIO_LOW PRIO_IDLE PRIO_MIN)], ); @@ -90,9 +382,9 @@ =item $Coro::main -This variable stores the coroutine object that represents the main -program. While you cna C it and do most other things you can do to -coroutines, it is mainly useful to compare again C<$Coro::current>, to see +This variable stores the Coro object that represents the main +program. While you can C it and do most other things you can do to +coro, it is mainly useful to compare again C<$Coro::current>, to see whether you are running in the main program or not. =cut @@ -101,12 +393,12 @@ =item $Coro::current -The coroutine object representing the current coroutine (the last -coroutine that the Coro scheduler switched to). The initial value is +The Coro object representing the current coro (the last +coro that the Coro scheduler switched to). 
The initial value is C<$Coro::main> (of course). This variable is B I. You can take copies of the -value stored in it and use it as any other coroutine object, but you must +value stored in it and use it as any other Coro object, but you must not otherwise modify the variable itself. =cut @@ -119,48 +411,35 @@ usually better to rely on L or L, as this is pretty low-level functionality. -This variable stores either a coroutine or a callback. +This variable stores a Coro object that is put into the ready queue when +there are no other ready threads (without invoking any ready hooks). -If it is a callback, the it is called whenever the scheduler finds no -ready coroutines to run. The default implementation prints "FATAL: -deadlock detected" and exits, because the program has no other way to -continue. - -If it is a coroutine object, then this object will be readied (without -invoking any ready hooks, however) when the scheduler finds no other ready -coroutines to run. +The default implementation dies with "FATAL: deadlock detected.", followed +by a thread listing, because the program has no other way to continue. This hook is overwritten by modules such as C and -C to wait on an external event that hopefully wake up a -coroutine so the scheduler can run it. - -Note that the callback I, under any circumstances, block -the current coroutine. Normally, this is achieved by having an "idle -coroutine" that calls the event loop and then blocks again, and then -readying that coroutine in the idle handler, or by simply placing the idle -coroutine in this variable. - -See L or L for examples of using this -technique. +C to wait on an external event that hopefully wakes up a +coro so the scheduler can run it. -Please note that if your callback recursively invokes perl (e.g. for event -handlers), then it must be prepared to be called recursively itself. +See L or L for examples of using this technique. =cut -$idle = sub { - require Carp; - Carp::croak ("FATAL: deadlock detected"); +# ||= because other modules could have provided their own by now +$idle ||= new Coro sub { + require Coro::Debug; + die "FATAL: deadlock detected.\n" + . Coro::Debug::ps_listing (); }; -# this coroutine is necessary because a coroutine +# this coro is necessary because a coro # cannot destroy itself. our @destroy; our $manager; $manager = new Coro sub { while () { - Coro::_cancel shift @destroy + _destroy shift @destroy while @destroy; &schedule; @@ -171,76 +450,69 @@ =back -=head1 SIMPLE COROUTINE CREATION +=head1 SIMPLE CORO CREATION =over 4 =item async { ... } [@args...] -Create a new coroutine and return its coroutine object (usually -unused). The coroutine will be put into the ready queue, so +Create a new coro and return its Coro object (usually +unused). The coro will be put into the ready queue, so it will start running automatically on the next scheduler run. The first argument is a codeblock/closure that should be executed in the -coroutine. When it returns argument returns the coroutine is automatically +coro. When it returns argument returns the coro is automatically terminated. The remaining arguments are passed as arguments to the closure. -See the C constructor for info about the coroutine -environment in which coroutines are executed. +See the C constructor for info about the coro +environment in which coro are executed. -Calling C in a coroutine will do the same as calling exit outside -the coroutine. 
Likewise, when the coroutine dies, the program will exit, +Calling C in a coro will do the same as calling exit outside +the coro. Likewise, when the coro dies, the program will exit, just as it would in the main program. If you do not want that, you can provide a default C handler, or simply avoid dieing (by use of C). -Example: Create a new coroutine that just prints its arguments. +Example: Create a new coro that just prints its arguments. async { print "@_\n"; } 1,2,3,4; -=cut - -sub async(&@) { - my $coro = new Coro @_; - $coro->ready; - $coro -} - =item async_pool { ... } [@args...] -Similar to C, but uses a coroutine pool, so you should not call +Similar to C, but uses a coro pool, so you should not call terminate or join on it (although you are allowed to), and you get a -coroutine that might have executed other code already (which can be good +coro that might have executed other code already (which can be good or bad :). On the plus side, this function is about twice as fast as creating (and -destroying) a completely new coroutine, so if you need a lot of generic -coroutines in quick successsion, use C, not C. +destroying) a completely new coro, so if you need a lot of generic +coros in quick successsion, use C, not C. The code block is executed in an C context and a warning will be issued in case of an exception instead of terminating the program, as -C does. As the coroutine is being reused, stuff like C +C does. As the coro is being reused, stuff like C will not work in the expected way, unless you call terminate or cancel, which somehow defeats the purpose of pooling (but is fine in the exceptional case). -The priority will be reset to C<0> after each run, tracing will be -disabled, the description will be reset and the default output filehandle -gets restored, so you can change all these. Otherwise the coroutine will -be re-used "as-is": most notably if you change other per-coroutine global -stuff such as C<$/> you I revert that change, which is most -simply done by using local as in: C<< local $/ >>. +The priority will be reset to C<0> after each run, all C calls +will be undone, tracing will be disabled, the description will be reset +and the default output filehandle gets restored, so you can change all +these. Otherwise the coro will be re-used "as-is": most notably if you +change other per-coro global stuff such as C<$/> you I revert +that change, which is most simply done by using local as in: C<< local $/ +>>. -The idle pool size is limited to C<8> idle coroutines (this can be +The idle pool size is limited to C<8> idle coros (this can be adjusted by changing $Coro::POOL_SIZE), but there can be as many non-idle coros as required. -If you are concerned about pooled coroutines growing a lot because a +If you are concerned about pooled coros growing a lot because a single C used a lot of stackspace you can e.g. C once per second or so to slowly replenish the pool. In addition to that, when the stacks used by a handler grows larger than 32kb @@ -267,28 +539,28 @@ =head1 STATIC METHODS Static methods are actually functions that implicitly operate on the -current coroutine. +current coro. =over 4 =item schedule -Calls the scheduler. The scheduler will find the next coroutine that is -to be run from the ready queue and switches to it. The next coroutine +Calls the scheduler. The scheduler will find the next coro that is +to be run from the ready queue and switches to it. The next coro to be run is simply the one with the highest priority that is longest -in its ready queue. 
If there is no coroutine ready, it will clal the +in its ready queue. If there is no coro ready, it will call the C<$Coro::idle> hook. -Please note that the current coroutine will I be put into the ready +Please note that the current coro will I be put into the ready queue, so calling this function usually means you will never be called again unless something else (e.g. an event handler) calls C<< ->ready >>, thus waking you up. This makes C I generic method to use to block the current -coroutine and wait for events: first you remember the current coroutine in +coro and wait for events: first you remember the current coro in a variable, then arrange for some callback of yours to call C<< ->ready >> on that once some event happens, and last you call C to put -yourself to sleep. Note that a lot of things can wake your coroutine up, +yourself to sleep. Note that a lot of things can wake your coro up, so you need to check whether the event indeed happened, e.g. by storing the status in a variable. @@ -296,10 +568,10 @@ =item cede -"Cede" to other coroutines. This function puts the current coroutine into +"Cede" to other coros. This function puts the current coro into the ready queue and calls C, which has the effect of giving -up the current "timeslice" to other coroutines of the same or higher -priority. Once your coroutine gets its turn again it will automatically be +up the current "timeslice" to other coros of the same or higher +priority. Once your coro gets its turn again it will automatically be resumed. This function is often called C in other languages. @@ -307,22 +579,108 @@ =item Coro::cede_notself Works like cede, but is not exported by default and will cede to I -coroutine, regardless of priority. This is useful sometimes to ensure +coro, regardless of priority. This is useful sometimes to ensure progress is made. =item terminate [arg...] -Terminates the current coroutine with the given status values (see L). +Terminates the current coro with the given status values (see +L). The values will not be copied, but referenced directly. + +=item Coro::on_enter BLOCK, Coro::on_leave BLOCK + +These function install enter and leave winders in the current scope. The +enter block will be executed when on_enter is called and whenever the +current coro is re-entered by the scheduler, while the leave block is +executed whenever the current coro is blocked by the scheduler, and +also when the containing scope is exited (by whatever means, be it exit, +die, last etc.). + +I. That means: do not even think about calling C without an +eval, and do not even think of entering the scheduler in any way. + +Since both BLOCKs are tied to the current scope, they will automatically +be removed when the current scope exits. + +These functions implement the same concept as C in scheme +does, and are useful when you want to localise some resource to a specific +coro. + +They slow down thread switching considerably for coros that use them +(about 40% for a BLOCK with a single assignment, so thread switching is +still reasonably fast if the handlers are fast). + +These functions are best understood by an example: The following function +will change the current timezone to "Antarctica/South_Pole", which +requires a call to C, but by using C and C, +which remember/change the current timezone and restore the previous +value, respectively, the timezone is only changed for the coro that +installed those handlers. 
+ + use POSIX qw(tzset); + + async { + my $old_tz; # store outside TZ value here + + Coro::on_enter { + $old_tz = $ENV{TZ}; # remember the old value + + $ENV{TZ} = "Antarctica/South_Pole"; + tzset; # enable new value + }; + + Coro::on_leave { + $ENV{TZ} = $old_tz; + tzset; # restore old value + }; + + # at this place, the timezone is Antarctica/South_Pole, + # without disturbing the TZ of any other coro. + }; + +This can be used to localise about any resource (locale, uid, current +working directory etc.) to a block, despite the existance of other +coros. + +Another interesting example implements time-sliced multitasking using +interval timers (this could obviously be optimised, but does the job): + + # "timeslice" the given block + sub timeslice(&) { + use Time::HiRes (); + + Coro::on_enter { + # on entering the thread, we set an VTALRM handler to cede + $SIG{VTALRM} = sub { cede }; + # and then start the interval timer + Time::HiRes::setitimer &Time::HiRes::ITIMER_VIRTUAL, 0.01, 0.01; + }; + Coro::on_leave { + # on leaving the thread, we stop the interval timer again + Time::HiRes::setitimer &Time::HiRes::ITIMER_VIRTUAL, 0, 0; + }; + + &{+shift}; + } + + # use like this: + timeslice { + # The following is an endless loop that would normally + # monopolise the process. Since it runs in a timesliced + # environment, it will regularly cede to other threads. + while () { } + }; + =item killall -Kills/terminates/cancels all coroutines except the currently running -one. This is useful after a fork, either in the child or the parent, as -usually only one of them should inherit the running coroutines. - -Note that while this will try to free some of the main programs resources, -you cannot free all of them, so if a coroutine that is not the main -program calls this function, there will be some one-time resource leak. +Kills/terminates/cancels all coros except the currently running one. + +Note that while this will try to free some of the main interpreter +resources if the calling coro isn't the main coro, but one +cannot free all of them, so if a coro that is not the main coro +calls this function, there will be some one-time resource leak. =cut @@ -335,22 +693,22 @@ =back -=head1 COROUTINE OBJECT METHODS +=head1 CORO OBJECT METHODS -These are the methods you can call on coroutine objects (or to create +These are the methods you can call on coro objects (or to create them). =over 4 =item new Coro \&sub [, @args...] -Create a new coroutine and return it. When the sub returns, the coroutine +Create a new coro and return it. When the sub returns, the coro automatically terminates as if C with the returned values were -called. To make the coroutine run you must first put it into the ready +called. To make the coro run you must first put it into the ready queue by calling the ready method. See C and C for additional info about the -coroutine environment. +coro environment. =cut @@ -358,126 +716,219 @@ terminate &{+shift}; } -=item $success = $coroutine->ready +=item $success = $coro->ready -Put the given coroutine into the end of its ready queue (there is one -queue for each priority) and return true. If the coroutine is already in +Put the given coro into the end of its ready queue (there is one +queue for each priority) and return true. If the coro is already in the ready queue, do nothing and return false. 
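For example (a sketch - it assumes L<AnyEvent> with an event loop
integration such as L<Coro::AnyEvent> is in use), the classic way to wait
for an event is to remember the current coro, have the callback
C<< ->ready >> it, and block in C<Coro::schedule>:

    my $current = $Coro::current;
    my $woken;

    my $timer = AnyEvent->timer (after => 1, cb => sub {
        $woken = 1;
        $current->ready; # put the waiting coro into the ready queue
    });

    # block until the callback has run - re-checking the condition
    # guards against other events waking this coro up early
    Coro::schedule until $woken;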
-This ensures that the scheduler will resume this coroutine automatically -once all the coroutines of higher priority and all coroutines of the same +This ensures that the scheduler will resume this coro automatically +once all the coro of higher priority and all coro of the same priority that were put into the ready queue earlier have been resumed. -=item $is_ready = $coroutine->is_ready +=item $coro->suspend -Return whether the coroutine is currently the ready queue or not, +Suspends the specified coro. A suspended coro works just like any other +coro, except that the scheduler will not select a suspended coro for +execution. -=item $coroutine->cancel (arg...) +Suspending a coro can be useful when you want to keep the coro from +running, but you don't want to destroy it, or when you want to temporarily +freeze a coro (e.g. for debugging) to resume it later. -Terminates the given coroutine and makes it return the given arguments as -status (default: the empty list). Never returns if the coroutine is the -current coroutine. +A scenario for the former would be to suspend all (other) coros after a +fork and keep them alive, so their destructors aren't called, but new +coros can be created. -=cut +=item $coro->resume + +If the specified coro was suspended, it will be resumed. Note that when +the coro was in the ready queue when it was suspended, it might have been +unreadied by the scheduler, so an activation might have been lost. + +To avoid this, it is best to put a suspended coro into the ready queue +unconditionally, as every synchronisation mechanism must protect itself +against spurious wakeups, and the one in the Coro family certainly do +that. + +=item $state->is_new + +Returns true iff this Coro object is "new", i.e. has never been run +yet. Those states basically consist of only the code reference to call and +the arguments, but consumes very little other resources. New states will +automatically get assigned a perl interpreter when they are transfered to. + +=item $state->is_zombie + +Returns true iff the Coro object has been cancelled, i.e. +it's resources freed because they were C'ed, C'd, +C'ed or simply went out of scope. + +The name "zombie" stems from UNIX culture, where a process that has +exited and only stores and exit status and no other resources is called a +"zombie". + +=item $is_ready = $coro->is_ready + +Returns true iff the Coro object is in the ready queue. Unless the Coro +object gets destroyed, it will eventually be scheduled by the scheduler. + +=item $is_running = $coro->is_running + +Returns true iff the Coro object is currently running. Only one Coro object +can ever be in the running state (but it currently is possible to have +multiple running Coro::States). + +=item $is_suspended = $coro->is_suspended + +Returns true iff this Coro object has been suspended. Suspended Coros will +not ever be scheduled. + +=item $coro->cancel (arg...) -sub cancel { - my $self = shift; +Terminates the given Coro thread and makes it return the given arguments as +status (default: an empty list). Never returns if the Coro is the +current Coro. - if ($current == $self) { - terminate @_; - } else { - $self->{_status} = [@_]; - $self->_cancel; +This is a rather brutal way to free a coro, with some limitations - if +the thread is inside a C callback that doesn't expect to be canceled, +bad things can happen, or if the cancelled thread insists on running +complicated cleanup handlers that rely on its thread context, things will +not work. + +Any cleanup code being run (e.g. 
from C blocks, destructors and so +on) will be run without a thread context, and is not allowed to switch +to other threads. A common mistake is to call C<< ->cancel >> from a +destructor called by die'ing inside the thread to be cancelled for +example. + +On the plus side, C<< ->cancel >> will always clean up the thread, no +matter what. If your cleanup code is complex or you want to avoid +cancelling a C-thread that doesn't know how to clean up itself, it can be +better to C<< ->throw >> an exception, or use C<< ->safe_cancel >>. + +The arguments to C<< ->cancel >> are not copied, but instead will +be referenced directly (e.g. if you pass C<$var> and after the call +change that variable, then you might change the return values passed to +e.g. C, so don't do that). + +The resources of the Coro are usually freed (or destructed) before this +call returns, but this can be delayed for an indefinite amount of time, as +in some cases the manager thread has to run first to actually destruct the +Coro object. + +=item $coro->safe_cancel ($arg...) + +Works mostly like C<< ->cancel >>, but is inherently "safer", and +consequently, can fail with an exception in cases the thread is not in a +cancellable state. Essentially, C<< ->safe_cancel >> is a C<< ->cancel >> +with extra checks before canceling. + +It works a bit like throwing an exception that cannot be caught - +specifically, it will clean up the thread from within itself, so all +cleanup handlers (e.g. C blocks) are run with full thread +context and can block if they wish. The downside is that there is no +guarantee that the thread can be cancelled when you call this method, and +therefore, it might fail. It is also considerably slower than C or +C. + +A thread is in a safe-cancellable state if it either hasn't been run yet, +or it has no C context attached and is inside an SLF function. + +The latter two basically mean that the thread isn't currently inside a +perl callback called from some C function (usually via some XS modules) +and isn't currently executing inside some C function itself (via Coro's XS +API). + +This call returns true when it could cancel the thread, or croaks with an +error otherwise (i.e. it either returns true or doesn't return at all). + +Why the weird interface? Well, there are two common models on how and +when to cancel things. In the first, you have the expectation that your +coro thread can be cancelled when you want to cancel it - if the thread +isn't cancellable, this would be a bug somewhere, so C<< ->safe_cancel >> +croaks to notify of the bug. + +In the second model you sometimes want to ask nicely to cancel a thread, +but if it's not a good time, well, then don't cancel. This can be done +relatively easy like this: + + if (! eval { $coro->safe_cancel }) { + warn "unable to cancel thread: $@"; } -} -=item $coroutine->schedule_to +However, what you never should do is first try to cancel "safely" and +if that fails, cancel the "hard" way with C<< ->cancel >>. That makes +no sense: either you rely on being able to execute cleanup code in your +thread context, or you don't. If you do, then C<< ->safe_cancel >> is the +only way, and if you don't, then C<< ->cancel >> is always faster and more +direct. + +=item $coro->schedule_to -Puts the current coroutine to sleep (like C), but instead +Puts the current coro to sleep (like C), but instead of continuing with the next coro from the ready queue, always switch to -the given coroutine object (regardless of priority etc.). 
The readyness -state of that coroutine isn't changed. +the given coro object (regardless of priority etc.). The readyness +state of that coro isn't changed. This is an advanced method for special cases - I'd love to hear about any uses for this one. -=item $coroutine->cede_to +=item $coro->cede_to -Like C, but puts the current coroutine into the ready +Like C, but puts the current coro into the ready queue. This has the effect of temporarily switching to the given -coroutine, and continuing some time later. +coro, and continuing some time later. This is an advanced method for special cases - I'd love to hear about any uses for this one. -=item $coroutine->throw ([$scalar]) +=item $coro->throw ([$scalar]) If C<$throw> is specified and defined, it will be thrown as an exception -inside the coroutine at the next convenient point in time. Otherwise +inside the coro at the next convenient point in time. Otherwise clears the exception object. Coro will check for the exception each time a schedule-like-function returns, i.e. after each C, C, C<< Coro::Semaphore->down ->>, C<< Coro::Handle->readable >> and so on. Most of these functions -detect this case and return early in case an exception is pending. +>>, C<< Coro::Handle->readable >> and so on. Most of those functions (all +that are part of Coro itself) detect this case and return early in case an +exception is pending. The exception object will be thrown "as is" with the specified scalar in C<$@>, i.e. if it is a string, no line number or newline will be appended (unlike with C). -This can be used as a softer means than C to ask a coroutine to -end itself, although there is no guarantee that the exception will lead to -termination, and if the exception isn't caught it might well end the whole -program. +This can be used as a softer means than either C or Cto ask a coro to end itself, although there is no guarantee that the +exception will lead to termination, and if the exception isn't caught it +might well end the whole program. You might also think of C as being the moral equivalent of -Cing a coroutine with a signal (in this case, a scalar). +Cing a coro with a signal (in this case, a scalar). -=item $coroutine->join +=item $coro->join -Wait until the coroutine terminates and return any values given to the +Wait until the coro terminates and return any values given to the C or C functions. C can be called concurrently -from multiple coroutines, and all will be resumed and given the status -return once the C<$coroutine> terminates. +from multiple threads, and all will be resumed and given the status +return once the C<$coro> terminates. -=cut +=item $coro->on_destroy (\&cb) -sub join { - my $self = shift; +Registers a callback that is called when this coro thread gets destroyed, +that is, after it's resources have been freed but before it is joined. The +callback gets passed the terminate/cancel arguments, if any, and I die, under any circumstances. - unless ($self->{_status}) { - my $current = $current; +There can be any number of C callbacks per coro, and there is +currently no way to remove a callback once added. - push @{$self->{_on_destroy}}, sub { - $current->ready; - undef $current; - }; - - &schedule while $current; - } - - wantarray ? @{$self->{_status}} : $self->{_status}[0]; -} - -=item $coroutine->on_destroy (\&cb) - -Registers a callback that is called when this coroutine gets destroyed, -but before it is joined. The callback gets passed the terminate arguments, -if any, and I die, under any circumstances. 
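A minimal sketch (the worker body and the message are made up for
illustration):

    my $worker = async {
        # ... do the real work here ...
        "worker result"
    };

    $worker->on_destroy (sub {
        # @_ holds the terminate/cancel arguments, here ("worker result") -
        # keep this short and never die in here
        print "worker is gone: @_\n";
    });

    $worker->join; # the callback has run by the time join returns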
- -=cut - -sub on_destroy { - my ($self, $cb) = @_; - - push @{ $self->{_on_destroy} }, $cb; -} - -=item $oldprio = $coroutine->prio ($newprio) +=item $oldprio = $coro->prio ($newprio) Sets (or gets, if the argument is missing) the priority of the -coroutine. Higher priority coroutines get run before lower priority -coroutines. Priorities are small signed integers (currently -4 .. +3), +coro thread. Higher priority coro get run before lower priority +coros. Priorities are small signed integers (currently -4 .. +3), that you can refer to using PRIO_xxx constants (use the import tag :prio to get then): @@ -485,29 +936,40 @@ 3 > 1 > 0 > -1 > -3 > -4 # set priority to HIGH - current->prio(PRIO_HIGH); + current->prio (PRIO_HIGH); -The idle coroutine ($Coro::idle) always has a lower priority than any -existing coroutine. +The idle coro thread ($Coro::idle) always has a lower priority than any +existing coro. -Changing the priority of the current coroutine will take effect immediately, -but changing the priority of coroutines in the ready queue (but not -running) will only take effect after the next schedule (of that -coroutine). This is a bug that will be fixed in some future version. +Changing the priority of the current coro will take effect immediately, +but changing the priority of a coro in the ready queue (but not running) +will only take effect after the next schedule (of that coro). This is a +bug that will be fixed in some future version. -=item $newprio = $coroutine->nice ($change) +=item $newprio = $coro->nice ($change) Similar to C, but subtract the given value from the priority (i.e. -higher values mean lower priority, just as in unix). +higher values mean lower priority, just as in UNIX's nice command). -=item $olddesc = $coroutine->desc ($newdesc) +=item $olddesc = $coro->desc ($newdesc) Sets (or gets in case the argument is missing) the description for this -coroutine. This is just a free-form string you can associate with a -coroutine. +coro thread. This is just a free-form string you can associate with a +coro. -This method simply sets the C<< $coroutine->{desc} >> member to the given -string. You can modify this member directly if you wish. +This method simply sets the C<< $coro->{desc} >> member to the given +string. You can modify this member directly if you wish, and in fact, this +is often preferred to indicate major processing states that can then be +seen for example in a L session: + + sub my_long_function { + local $Coro::current->{desc} = "now in my_long_function"; + ... + $Coro::current->{desc} = "my_long_function: phase 1"; + ... + $Coro::current->{desc} = "my_long_function: phase 2"; + ... + } =cut @@ -530,12 +992,12 @@ =item Coro::nready -Returns the number of coroutines that are currently in the ready state, +Returns the number of coro that are currently in the ready state, i.e. that can be switched to by calling C directory or -indirectly. The value C<0> means that the only runnable coroutine is the +indirectly. The value C<0> means that the only runnable coro is the currently running one, so C would have no effect, and C would cause a deadlock unless there is an idle handler that wakes up some -coroutines. +coro. =item my $guard = Coro::guard { ... } @@ -552,16 +1014,21 @@ returning a new coderef. Unblocking means that calling the new coderef will return immediately without blocking, returning nothing, while the original code ref will be called (with parameters) from within another -coroutine. +coro. 
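For example (a sketch - the semaphore and the callback body are made up;
any code that might block would do):

    use Coro;
    use Coro::Semaphore;

    my $log_lock = new Coro::Semaphore;

    # the wrapped code may block (here on a semaphore guard), which would
    # be fatal if it ran directly inside a non-thread-safe event loop
    my $on_event = unblock_sub {
        my $guard = $log_lock->guard;
        print "got event: @_\n";
    };

    # $on_event can now be handed to the event library - invoking
    # $on_event->(...) only queues the real callback and returns
    # immediately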
-The reason this function exists is that many event libraries (such as the -venerable L module) are not coroutine-safe (a weaker form +The reason this function exists is that many event libraries (such as +the venerable L module) are not thread-safe (a weaker form of reentrancy). This means you must not block within event callbacks, otherwise you might suffer from crashes or worse. The only event library -currently known that is safe to use without C is L. +currently known that is safe to use without C is L (but +you might still run into deadlocks if all event loops are blocked). + +Coro will try to catch you when you block in the event loop +("FATAL: $Coro::idle blocked itself"), but this is just best effort and +only works when you do not run your own event loop. This function allows your callbacks to block by executing them in another -coroutine where it is safe to block. One example where blocking is handy +coro where it is safe to block. One example where blocking is handy is when you use the L functions to save results to disk, for example. @@ -569,8 +1036,8 @@ creating event callbacks that want to block. If your handler does not plan to block (e.g. simply sends a message to -another coroutine, or puts some other coroutine into the ready queue), -there is no reason to use C. +another coro, or puts some other coro into the ready queue), there is +no reason to use C. Note that you also need to use C for any other callbacks that are indirectly executed by any C-based event loop. For example, when you @@ -610,22 +1077,24 @@ } } -=item $cb = Coro::rouse_cb +=item $cb = rouse_cb Create and return a "rouse callback". That's a code reference that, when called, will remember a copy of its arguments and notify the owner -coroutine of the callback. +coro of the callback. See the next function. -=item @args = Coro::rouse_wait [$cb] +=item @args = rouse_wait [$cb] Wait for the specified rouse callback (or the last one that was created in -this coroutine). +this coro). As soon as the callback is invoked (or when the callback was invoked before C), it will return the arguments originally passed to -the rouse callback. +the rouse callback. In scalar context, that means you get the I +argument, just as if C had a C +statement at the end. See the section B for an actual usage example. @@ -633,16 +1102,30 @@ =cut +for my $module (qw(Channel RWLock Semaphore SemaphoreSet Signal Specific)) { + my $old = defined &{"Coro::$module\::new"} && \&{"Coro::$module\::new"}; + + *{"Coro::$module\::new"} = sub { + require "Coro/$module.pm"; + + # some modules have their new predefined in State.xs, some don't + *{"Coro::$module\::new"} = $old + if $old; + + goto &{"Coro::$module\::new"}; + }; +} + 1; =head1 HOW TO WAIT FOR A CALLBACK -It is very common for a coroutine to wait for some callback to be -called. This occurs naturally when you use coroutines in an otherwise +It is very common for a coro to wait for some callback to be +called. This occurs naturally when you use coro in an otherwise event-based program, or when you use event-based libraries. These typically register a callback for some event, and call that callback -when the event occured. In a coroutine, however, you typically want to +when the event occured. In a coro, however, you typically want to just wait for the event, simplyifying things. For example C<< AnyEvent->child >> registers a callback to be called when @@ -650,15 +1133,15 @@ my $child_watcher = AnyEvent->child (pid => $pid, cb => sub { ... 
}); -But from withina coroutine, you often just want to write this: +But from within a coro, you often just want to write this: my $status = wait_for_child $pid; Coro offers two functions specifically designed to make this easy, -C and C. +C and C. The first function, C, generates and returns a callback that, -when invoked, will save its arguments and notify the coroutine that +when invoked, will save its arguments and notify the coro that created the callback. The second function, C, waits for the callback to be called @@ -671,19 +1154,19 @@ sub wait_for_child($) { my ($pid) = @_; - my $watcher = AnyEvent->child (pid => $pid, cb => Coro::rouse_cb); + my $watcher = AnyEvent->child (pid => $pid, cb => rouse_cb); - my ($rpid, $rstatus) = Coro::rouse_wait; + my ($rpid, $rstatus) = rouse_wait; $rstatus } In the case where C and C are not flexible enough, -you can roll your own, using C: +you can roll your own, using C and C: sub wait_for_child($) { my ($pid) = @_; - # store the current coroutine in $current, + # store the current coro in $current, # and provide result variables for the closure passed to ->child my $current = $Coro::current; my ($done, $rstatus); @@ -691,7 +1174,8 @@ # pass a closure to ->child my $watcher = AnyEvent->child (pid => $pid, cb => sub { $rstatus = $_[1]; # remember rstatus - $done = 1; # mark $rstatus as valud + $done = 1; # mark $rstatus as valid + $current->ready; # wake up the waiting thread }); # wait until the closure has been called @@ -709,7 +1193,7 @@ When Coro is compiled using the pthread backend (which isn't recommended but required on many BSDs as their libcs are completely broken), then -coroutines will not survive a fork. There is no known workaround except to +coro will not survive a fork. There is no known workaround except to fix your libc and use a saner backend. =item perl process emulation ("threads") @@ -721,19 +1205,84 @@ the windows process emulation enabled under unix roughly halves perl performance, even when not used. -=item coroutine switching not signal safe +Attempts to use threads created in another emulated process will crash +("cleanly", with a null pointer exception). + +=item coro switching is not signal safe -You must not switch to another coroutine from within a signal handler -(only relevant with %SIG - most event libraries provide safe signals). +You must not switch to another coro from within a signal handler (only +relevant with %SIG - most event libraries provide safe signals), I +you are sure you are not interrupting a Coro function. That means you I call any function that might "block" the -current coroutine - C, C C<< Coro::Semaphore->down >> or +current coro - C, C C<< Coro::Semaphore->down >> or anything that calls those. Everything else, including calling C, works. =back +=head1 WINDOWS PROCESS EMULATION + +A great many people seem to be confused about ithreads (for example, Chip +Salzenberg called me unintelligent, incapable, stupid and gullible, +while in the same mail making rather confused statements about perl +ithreads (for example, that memory or files would be shared), showing his +lack of understanding of this area - if it is hard to understand for Chip, +it is probably not obvious to everybody). 
+ +What follows is an ultra-condensed version of my talk about threads in +scripting languages given on the perl workshop 2009: + +The so-called "ithreads" were originally implemented for two reasons: +first, to (badly) emulate unix processes on native win32 perls, and +secondly, to replace the older, real thread model ("5.005-threads"). + +It does that by using threads instead of OS processes. The difference +between processes and threads is that threads share memory (and other +state, such as files) between threads within a single process, while +processes do not share anything (at least not semantically). That +means that modifications done by one thread are seen by others, while +modifications by one process are not seen by other processes. + +The "ithreads" work exactly like that: when creating a new ithreads +process, all state is copied (memory is copied physically, files and code +is copied logically). Afterwards, it isolates all modifications. On UNIX, +the same behaviour can be achieved by using operating system processes, +except that UNIX typically uses hardware built into the system to do this +efficiently, while the windows process emulation emulates this hardware in +software (rather efficiently, but of course it is still much slower than +dedicated hardware). + +As mentioned before, loading code, modifying code, modifying data +structures and so on is only visible in the ithreads process doing the +modification, not in other ithread processes within the same OS process. + +This is why "ithreads" do not implement threads for perl at all, only +processes. What makes it so bad is that on non-windows platforms, you can +actually take advantage of custom hardware for this purpose (as evidenced +by the forks module, which gives you the (i-) threads API, just much +faster). + +Sharing data is in the i-threads model is done by transfering data +structures between threads using copying semantics, which is very slow - +shared data simply does not exist. Benchmarks using i-threads which are +communication-intensive show extremely bad behaviour with i-threads (in +fact, so bad that Coro, which cannot take direct advantage of multiple +CPUs, is often orders of magnitude faster because it shares data using +real threads, refer to my talk for details). + +As summary, i-threads *use* threads to implement processes, while +the compatible forks module *uses* processes to emulate, uhm, +processes. I-threads slow down every perl program when enabled, and +outside of windows, serve no (or little) practical purpose, but +disadvantages every single-threaded Perl program. + +This is the reason that I try to avoid the name "ithreads", as it is +misleading as it implies that it implements some kind of thread model for +perl, and prefer the name "windows process emulation", which describes the +actual use and behaviour of it much better. + =head1 SEE ALSO Event-Loop integration: L, L, L. @@ -755,10 +1304,10 @@ Low level Configuration, Thread Environment, Continuations: L. -=head1 AUTHOR +=head1 AUTHOR/SUPPORT/CONTACT - Marc Lehmann - http://home.schmorp.de/ + Marc A. Lehmann + http://software.schmorp.de/pkg/Coro.html =cut