[ViewVC] Diff of: cvs/Coro/README

Comparing Coro/README (file contents):
Revision 1.21 by root, Mon Dec 15 20:52:04 2008 UTC vs.
Revision 1.27 by root, Fri Dec 25 07:17:12 2009 UTC

 NAME
     Coro - the only real threads in perl
 SYNOPSIS
       use Coro;
-  async {
+      async {
          # some asynchronous thread of execution
          print "2\n";
          cede; # yield back to main
          print "4\n";
       };
       print "1\n";
       cede; # yield to coro
       print "3\n";
       cede; # and again
-  # use locking
+      # use locking
       use Coro::Semaphore;
       my $lock = new Coro::Semaphore;
       my $locked;
-  $lock->down;
+      $lock->down;
       $locked = 1;
       $lock->up;
 DESCRIPTION
     For a tutorial-style introduction, please read the Coro::Intro manpage.
     easily-identified points in your program, so locking and parallel access
     are rarely an issue, making thread programming much safer and easier
     than using other thread models.
     Unlike the so-called "Perl threads" (which are not actually real threads
-    but only the windows process emulation ported to unix, and as such act
+    but only the windows process emulation (see section of same name for
-    as processes), Coro provides a full shared address space, which makes
+    more details) ported to unix, and as such act as processes), Coro
-    communication between threads very easy. And Coro's threads are fast,
+    provides a full shared address space, which makes communication between
+    threads very easy. And Coro's threads are fast, too: disabling the
-    too: disabling the Windows process emulation code in your perl and using
+    Windows process emulation code in your perl and using Coro can easily
-    Coro can easily result in a two to four times speed increase for your
+    result in a two to four times speed increase for your programs. A
-    programs. A parallel matrix multiplication benchmark runs over 300 times
+    parallel matrix multiplication benchmark runs over 300 times faster on a
-    faster on a single core than perl's pseudo-threads on a quad core using
+    single core than perl's pseudo-threads on a quad core using all four
-    all four cores.
+    cores.
     Coro achieves that by supporting multiple running interpreters that
     share data, which is especially useful to code pseudo-parallel processes
     and for event-based programming, such as multiple HTTP-GET requests
     running concurrently. See Coro::AnyEvent to learn more on how to
     $Coro::idle
         This variable is mainly useful to integrate Coro into event loops.
         It is usually better to rely on Coro::AnyEvent or Coro::EV, as this
         is pretty low-level functionality.
-        This variable stores either a Coro object or a callback.
+        This variable stores a Coro object that is put into the ready queue
+        when there are no other ready threads (without invoking any ready
+        hooks).
-        If it is a callback, the it is called whenever the scheduler finds
+        The default implementation dies with "FATAL: deadlock detected.",
-        no ready coros to run. The default implementation prints "FATAL:
+        followed by a thread listing, because the program has no other way
-        deadlock detected" and exits, because the program has no other way
         to continue.
-        If it is a coro object, then this object will be readied (without
-        invoking any ready hooks, however) when the scheduler finds no other
-        ready coros to run.
         This hook is overwritten by modules such as "Coro::EV" and
         "Coro::AnyEvent" to wait on an external event that hopefully wakes up
         a coro so the scheduler can run it.
-        Note that the callback *must not*, under any circumstances, block
-        the current coro. Normally, this is achieved by having an "idle
-        coro" that calls the event loop and then blocks again, and then
-        readying that coro in the idle handler, or by simply placing the
-        idle coro in this variable.
-        See Coro::Event or Coro::AnyEvent for examples of using this
+        See Coro::EV or Coro::AnyEvent for examples of using this technique.
-        technique.
-        Please note that if your callback recursively invokes perl (e.g. for
-        event handlers), then it must be prepared to be called recursively
-        itself.
 SIMPLE CORO CREATION
     async { ... } [@args...]
         Create a new coro and return its Coro object (usually unused). The
         coro will be put into the ready queue, so it will start running
     schedule
         Calls the scheduler. The scheduler will find the next coro that is
         to be run from the ready queue and switches to it. The next coro to
         be run is simply the one with the highest priority that is longest
-        in its ready queue. If there is no coro ready, it will clal the
+        in its ready queue. If there is no coro ready, it will call the
         $Coro::idle hook.
         Please note that the current coro will *not* be put into the ready
         queue, so calling this function usually means you will never be
         called again unless something else (e.g. an event handler) calls
         These functions implement the same concept as "dynamic-wind" in
         scheme does, and are useful when you want to localise some resource
         to a specific coro.
-        They slow down coro switching considerably for coros that use them
+        They slow down thread switching considerably for coros that use them
+        (about 40% for a BLOCK with a single assignment, so thread switching
-        (But coro switching is still reasonably fast if the handlers are
+        is still reasonably fast if the handlers are fast).
-        fast).
         These functions are best understood by an example: The following
         function will change the current timezone to
         "Antarctica/South_Pole", which requires a call to "tzset", but by
         using "on_enter" and "on_leave", which remember/change the current
         timezone and restore the previous value, respectively, the timezone
-        is only changes for the coro that installed those handlers.
+        is only changed for the coro that installed those handlers.
            use POSIX qw(tzset);
            async {
               my $old_tz; # store outside TZ value here
            };
         This can be used to localise about any resource (locale, uid,
         current working directory etc.) to a block, despite the existence of
         other coros.
+        Another interesting example implements time-sliced multitasking
+        using interval timers (this could obviously be optimised, but does
+        the job):
+           # "timeslice" the given block
+           sub timeslice(&) {
+              use Time::HiRes ();
+              Coro::on_enter {
+                 # on entering the thread, we set a VTALRM handler to cede
+                 $SIG{VTALRM} = sub { cede };
+                 # and then start the interval timer
+                 Time::HiRes::setitimer &Time::HiRes::ITIMER_VIRTUAL, 0.01, 0.01;
+              };
+              Coro::on_leave {
+                 # on leaving the thread, we stop the interval timer again
+                 Time::HiRes::setitimer &Time::HiRes::ITIMER_VIRTUAL, 0, 0;
+              };
+              &{+shift};
+           }
+           # use like this:
+           timeslice {
+              # The following is an endless loop that would normally
+              # monopolise the process. Since it runs in a timesliced
+              # environment, it will regularly cede to other threads.
+              while () { }
+           };
     killall
         Kills/terminates/cancels all coros except the currently running one.
         Note that while this will try to free some of the main interpreter
         This ensures that the scheduler will resume this coro automatically
         once all coros of higher priority and all coros of the same
         priority that were put into the ready queue earlier have been
         resumed.
+    $coro->suspend
+        Suspends the specified coro. A suspended coro works just like any
+        other coro, except that the scheduler will not select a suspended
+        coro for execution.
+        Suspending a coro can be useful when you want to keep the coro from
+        running, but you don't want to destroy it, or when you want to
+        temporarily freeze a coro (e.g. for debugging) to resume it later.
+        A scenario for the former would be to suspend all (other) coros
+        after a fork and keep them alive, so their destructors aren't
+        called, but new coros can be created.
+    $coro->resume
+        If the specified coro was suspended, it will be resumed. Note that
+        when the coro was in the ready queue when it was suspended, it might
+        have been unreadied by the scheduler, so an activation might have
+        been lost.
+        To avoid this, it is best to put a suspended coro into the ready
+        queue unconditionally, as every synchronisation mechanism must
+        protect itself against spurious wakeups, and the ones in the Coro
+        family certainly do that.
     $is_ready = $coro->is_ready
         Returns true iff the Coro object is in the ready queue. Unless the
         Coro object gets destroyed, it will eventually be scheduled by the
         scheduler.
         reentrancy). This means you must not block within event callbacks,
         otherwise you might suffer from crashes or worse. The only event
         library currently known that is safe to use without "unblock_sub" is
         EV.
+        Coro will try to catch you when you block in the event loop
+        ("FATAL:$Coro::IDLE blocked itself"), but this is just best effort
+        and only works when you do not run your own event loop.
         This function allows your callbacks to block by executing them in
         another coro where it is safe to block. One example where blocking
         is handy is when you use the Coro::AIO functions to save results to
         disk, for example.
         when you use a module that uses AnyEvent (and you use
         Coro::AnyEvent) and it provides callbacks that are the result of
         some event callback, then you must not block either, or use
         "unblock_sub".
-    $cb = Coro::rouse_cb
+    $cb = rouse_cb
         Create and return a "rouse callback". That's a code reference that,
         when called, will remember a copy of its arguments and notify the
         owner coro of the callback.
         See the next function.
-    @args = Coro::rouse_wait [$cb]
+    @args = rouse_wait [$cb]
         Wait for the specified rouse callback (or the last one that was
         created in this coro).
         As soon as the callback is invoked (or when the callback was invoked
         before "rouse_wait"), it will return the arguments originally passed
-        to the rouse callback.
+        to the rouse callback. In scalar context, that means you get the
+        *last* argument, just as if "rouse_wait" had a "return ($a1, $a2,
+        $a3...)" statement at the end.
         See the section HOW TO WAIT FOR A CALLBACK for an actual usage
         example.
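
         The scalar-context rule is plain Perl behaviour and can be seen
         without Coro. The following is only a sketch: "fake_rouse_wait" is a
         hypothetical stand-in that ends in a literal list "return", as the
         description above suggests, and is not part of the Coro API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for rouse_wait (NOT the real Coro function): it ends
# with a literal list in its return statement, as described in the text.
sub fake_rouse_wait {
    my ($a1, $a2, $a3) = ("status", "headers", "body");
    return ($a1, $a2, $a3);
}

my @all     = fake_rouse_wait();   # list context: all three values
my $last    = fake_rouse_wait();   # scalar context: comma operator yields the last value
my ($first) = fake_rouse_wait();   # list assignment: picks the first value

print "all:   @all\n";
print "last:  $last\n";
print "first: $first\n";
```

         So if only the first argument is wanted, assigning to a
         parenthesised list ("my ($first) = ...") is the form to use.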
 HOW TO WAIT FOR A CALLBACK
         unix roughly halves perl performance, even when not used.
     coro switching is not signal safe
         You must not switch to another coro from within a signal handler
         (only relevant with %SIG - most event libraries provide safe
-        signals).
+        signals), *unless* you are sure you are not interrupting a Coro
+        function.
         That means you *MUST NOT* call any function that might "block" the
         current coro - "cede", "schedule", "Coro::Semaphore->down" or
         anything that calls those. Everything else, including calling
         "ready", works.
+WINDOWS PROCESS EMULATION
+    A great many people seem to be confused about ithreads (for example,
+    Chip Salzenberg called me unintelligent, incapable, stupid and gullible,
+    while in the same mail making rather confused statements about perl
+    ithreads (for example, that memory or files would be shared), showing
+    his lack of understanding of this area - if it is hard to understand for
+    Chip, it is probably not obvious to everybody).
+    What follows is an ultra-condensed version of my talk about threads in
+    scripting languages given at the perl workshop 2009:
+    The so-called "ithreads" were originally implemented for two reasons:
+    first, to (badly) emulate unix processes on native win32 perls, and
+    secondly, to replace the older, real thread model ("5.005-threads").
+    It does that by using threads instead of OS processes. The difference
+    between processes and threads is that threads share memory (and other
+    state, such as files) between threads within a single process, while
+    processes do not share anything (at least not semantically). That means
+    that modifications done by one thread are seen by others, while
+    modifications by one process are not seen by other processes.
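
     The process half of that distinction can be observed with plain
     "fork". This is a minimal sketch, assuming a unix-like system; the
     variable names are illustrative only. The child changes its copy of
     $counter and reports the value through a pipe, while the parent's copy
     stays untouched.

```perl
use strict;
use warnings;

my $counter = 1;

# The pipe is the only channel between the two processes.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child process: it works on its own copy of $counter.
    close $reader;
    $counter = 42;
    print {$writer} "$counter\n";
    close $writer;
    exit 0;
}

# Parent process: read what the child reported, then reap it.
close $writer;
chomp(my $child_value = <$reader>);
close $reader;
waitpid($pid, 0);

print "child saw:  $child_value\n";   # the child's modified copy
print "parent has: $counter\n";       # unchanged in the parent
```

     With real threads sharing one address space, the assignment would
     instead be visible to every thread without any explicit transfer.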
+    The "ithreads" work exactly like that: when creating a new ithreads
+    process, all state is copied (memory is copied physically, files and
+    code are copied logically). Afterwards, it isolates all modifications. On
+    UNIX, the same behaviour can be achieved by using operating system
+    processes, except that UNIX typically uses hardware built into the
+    system to do this efficiently, while the windows process emulation
+    emulates this hardware in software (rather efficiently, but of course it
+    is still much slower than dedicated hardware).
+    As mentioned before, loading code, modifying code, modifying data
+    structures and so on is only visible in the ithreads process doing the
+    modification, not in other ithread processes within the same OS process.
+    This is why "ithreads" do not implement threads for perl at all, only
+    processes. What makes it so bad is that on non-windows platforms, you
+    can actually take advantage of custom hardware for this purpose (as
+    evidenced by the forks module, which gives you the (i-) threads API,
+    just much faster).
+    Sharing data in the i-threads model is done by transferring data
+    structures between threads using copying semantics, which is very slow -
+    shared data simply does not exist. Benchmarks using i-threads which are
+    communication-intensive show extremely bad behaviour with i-threads (in
+    fact, so bad that Coro, which cannot take direct advantage of multiple
+    CPUs, is often orders of magnitude faster because it shares data using
+    real threads, refer to my talk for details).
+    In summary, i-threads *use* threads to implement processes, while the
+    compatible forks module *uses* processes to emulate, uhm, processes.
+    I-threads slow down every perl program when enabled, and outside of
+    windows, serve no (or little) practical purpose, but disadvantage every
+    single-threaded Perl program.
+    This is the reason that I try to avoid the name "ithreads": it is
+    misleading, as it implies that it implements some kind of thread model
+    for perl. I prefer the name "windows process emulation", which
+    describes the actual use and behaviour of it much better.
 SEE ALSO
     Event-Loop integration: Coro::AnyEvent, Coro::EV, Coro::Event.
     Debugging: Coro::Debug.
