[ViewVC] Annotation of: cvs/Coro-Multicore/README

NAME
    Coro::Multicore - make coro threads on multiple cores with specially
    supported modules

SYNOPSIS
     # when you DO control the main event loop, e.g. in the main program

     use Coro::Multicore; # enable by default

     Coro::Multicore::scoped_disable;
     AE::cv->recv; # or EV::run, AnyEvent::Loop::run, Event::loop, ...

     # when you DO NOT control the event loop, e.g. in a module on CPAN
     # do nothing (see HOW TO USE IT) or something like this:

     use Coro::Multicore (); # disable by default

     async {
        Coro::Multicore::scoped_enable;

        # blocking is safe in your own threads
        ...
     };

DESCRIPTION
    While Coro threads (unlike ithreads) provide real threads similar to
    pthreads, python threads and so on, they do not run in parallel to each
    other even on machines with multiple CPUs or multiple CPU cores.

    This module lifts this restriction under two very specific but useful
    conditions: firstly, the coro thread executes in XS code and does not
    touch any perl data structures, and secondly, the XS code is specially
    prepared to allow this.

    This means that, when you call an XS function of a module prepared for
    it, this XS function can execute in parallel to any other Coro threads.
    This is useful for both CPU bound tasks (such as cryptography) as well
    as I/O bound tasks (such as loading an image from disk). It can also be
    used to do stuff in parallel via APIs that were not meant for this, such
    as database accesses via DBI.

    The mechanism to support this is easily added to existing modules and is
    independent of Coro or Coro::Multicore, and therefore could be used,
    without changes, with other, similar, modules, or even the perl core,
    should it gain real thread support anytime soon. See
    <http://perlmulticore.schmorp.de/> for more info on how to prepare a
    module to allow parallel execution. Preparing an existing module is
    easy, doesn't add much overhead and no dependencies.

    This module is an AnyEvent user (and also, if not obvious, uses Coro).

HOW TO USE IT
    Quick explanation: decide whether you control the main program/the event
    loop and choose one of the two styles from the SYNOPSIS.

    Longer explanation: There are two major modes this module can be used in -
    supported operations run asynchronously either by default, or only when
    requested. The reason you might not want to enable this module for all
    operations by default is compatibility with existing code:

    This module integrates into an event loop, and you must not normally
    block and wait for something in an event loop callback. Now imagine
    somebody patches your favourite module (e.g. Digest::MD5) to take
    advantage of the Perl Multicore API.

    Then code that runs in an event loop callback and executes
    Digest::MD5::md5 would work fine without "Coro::Multicore" - it would
    simply calculate the MD5 digest and block execution of anything else.
    But with "Coro::Multicore" enabled, the same operation would try to run
    other threads. And when those wait for events, there is no event loop
    anymore, as the event loop thread is busy doing the MD5 calculation,
    leading to a deadlock.

  USE IT IN THE MAIN PROGRAM
    One way to avoid this is to not run perlmulticore enabled functions in
    any callbacks. A simpler way to ensure it works is to disable
    "Coro::Multicore" thread switching in event loop callbacks, and enable
    it everywhere else.

    Therefore, if you control the event loop, as is usually the case when
    you write a *program* and not a *module*, then you can enable
    "Coro::Multicore" by default, and disable it in your event loop thread:

       # example 1, separate thread for event loop

       use EV;
       use Coro;
       use Coro::Multicore;

       async {
          Coro::Multicore::scoped_disable;
          EV::run;
       };

       # do something else

       # example 2, run event loop as main program

       use EV;
       use Coro;
       use Coro::Multicore;

       Coro::Multicore::scoped_disable;

       ... initialisation

       EV::run;

    The latter form is usually better and more idiomatic - the main thread
    is the best place to run the event loop.

    Often you want to do some initialisation before running the event loop.
    The most efficient way to do that is to put your initialisation code (and
    main program) into its own thread and run the event loop in your main
    program:

       use AnyEvent::Loop;
       use Coro::Multicore; # enable by default

       async {
          load_data;
          do_other_init;
          bind_socket;
          ...
       };

       Coro::Multicore::scoped_disable;
       AnyEvent::Loop::run;

    This has the effect of running the event loop first, so the
    initialisation code can block if it wants to.

    If this is too cumbersome but you still want to make sure you can call
    blocking functions before entering the event loop, you can keep
    "Coro::Multicore" disabled till you can run the event loop:

       use AnyEvent::Loop;
       use Coro::Multicore (); # disable by default

       load_data;
       do_other_init;
       bind_socket;
       ...

       Coro::Multicore::scoped_disable; # disable for event loop
       Coro::Multicore::enable 1; # enable for the rest of the program
       AnyEvent::Loop::run;

  USE IT IN A MODULE
    When you *do not* control the event loop, for example, because you want
    to use this from a module you published on CPAN, then the previous
    method doesn't work.

    However, this is not normally a problem in practice - most modules only
    do work at the request of the caller. In that case, you might not care
    whether it blocks other threads or not, as this would be the caller's
    responsibility (or decision), and by extension, a decision for the main
    program.

    So unless you use XS and want your XS functions to run asynchronously,
    you don't have to worry about "Coro::Multicore" at all - if you happen
    to call XS functions that are multicore-enabled and your caller has
    configured things correctly, they will automatically run asynchronously.
    Or in other words: nothing needs to be done at all, which also means
    that this method works fine for existing pure-perl modules, without
    having to change them at all.

    Only if your module runs its own Coro threads could it be an issue -
    maybe your module implements some kind of job pool and relies on certain
    operations to run asynchronously. Then you can still use
    "Coro::Multicore" by not enabling it by default and only enabling it in
    your own threads:

       use Coro;
       use Coro::Multicore (); # note the () to disable by default

       async {
          Coro::Multicore::scoped_enable;

          # do things asynchronously by calling perlmulticore-enabled functions
       };

  EXPORTS
    This module does not (at the moment) export any symbols. It does,
    however, export "behaviour" - if you use the default import, then
    Coro::Multicore will be enabled for all threads and all callers in the
    whole program:

       use Coro::Multicore;

    In a module where you don't control what else might be loaded and run,
    you might want to be more conservative, and not import anything. This
    has the effect of not enabling the functionality by default, so you have
    to enable it per scope:

       use Coro::Multicore ();

       sub myfunc {
          Coro::Multicore::scoped_enable;

          # from here to the end of this function, and in any functions
          # called from this function, tasks will be executed asynchronously.
       }

API FUNCTIONS
    $previous = Coro::Multicore::enable [$enable]
        This function enables (if $enable is true) or disables (if $enable
        is false) the multicore functionality globally. By default, it is
        enabled.

        This can be used to effectively disable this module's functionality
        by default, and enable it only for selected threads or scopes, by
        calling "Coro::Multicore::scoped_enable".

        The function returns the previous value of the enable flag.
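
        Because the previous value is returned, the global flag can be
        switched off for a while and then restored. The following is only a
        minimal sketch of that pattern; the helper name
        "with_multicore_disabled" is hypothetical and not part of this
        module:

```perl
use strict;
use warnings;
use Coro::Multicore (); # empty import list: not enabled by default

# hypothetical helper (not part of Coro::Multicore): run a code block with
# the global flag switched off, then restore whatever was in effect before
sub with_multicore_disabled(&) {
   my ($code) = @_;

   my $previous = Coro::Multicore::enable 0; # returns the old flag value
   my @result   = eval { $code->() };
   my $error    = $@;

   Coro::Multicore::enable $previous;        # restore the old setting

   die $error if $error;
   @result
}

my ($answer) = with_multicore_disabled { 6 * 7 };
```

        Restoring the saved value rather than unconditionally enabling
        keeps the helper neutral for callers that had the module disabled
        to begin with.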

    Coro::Multicore::scoped_enable
        This function instructs Coro::Multicore to handle all requests
        executed in the current coro thread, from the call to the end of the
        current scope.

        Calls to "scoped_enable" and "scoped_disable" don't nest very well
        at the moment, so don't nest them.

    Coro::Multicore::scoped_disable
        The opposite of "Coro::Multicore::scoped_enable": instructs
        Coro::Multicore to *not* handle the next multicore-enabled request.

THREAD SAFETY OF SUPPORTING XS MODULES
    Just because an XS module supports perlmulticore might not immediately
    make it reentrant. For example, while you can (try to) call "execute" on
    the same database handle for the patched "DBD::mysql" (see the registry
    <http://perlmulticore.schmorp.de/registry>), this will almost certainly
    not work, despite "DBD::mysql" and "libmysqlclient" being thread safe
    and reentrant - just not on the same database handle.

    Many modules have limitations such as these - some can only be called
    concurrently from a single thread as they use global variables, some can
    only be called concurrently on different *handles* (e.g. database
    connections for DBD modules, or digest objects for Digest modules), and
    some can be called at any time (such as the "md5" function in
    "Digest::MD5").

    Generally, you only have to be careful with the very few modules that
    use global variables or rely on C libraries that aren't thread-safe,
    which should be documented clearly in the module documentation.

    Most modules are either perfectly reentrant, or at least reentrant as
    long as you give every thread its own *handle* object.
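
    As a sketch of the "one handle per thread" rule, each Coro thread below
    creates and uses its own digest object instead of sharing one. Whether
    the installed "Digest::MD5" is actually perlmulticore-enabled is an
    assumption here; with an unpatched version the code still works, it
    simply does not run in parallel:

```perl
use strict;
use warnings;
use Coro;
use Coro::Multicore (); # disabled by default, enabled per thread below
use Digest::MD5 ();

my @inputs = ("first message", "second message");

my @threads = map {
   my $data = $_;

   async {
      Coro::Multicore::scoped_enable;

      # every thread gets its own digest object, nothing is shared
      my $ctx = Digest::MD5->new;
      $ctx->add ($data);
      $ctx->hexdigest
   }
} @inputs;

# join waits for each thread and returns what its code block returned
my @digests = map { $_->join } @threads;
```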

EXCEPTIONS AND THREAD CANCELLATION
    Coro allows you to cancel threads even when they execute within an XS
    function ("cancel" vs. "safe_cancel" methods). Similarly, Coro allows
    you to send exceptions (e.g. via the "throw" method) to threads
    executing inside an XS function.

    While doing this is questionable and dangerous with normal Coro threads
    already, they are both supported in this module, although with
    potentially unwanted effects. The following describes the current
    implementation and is subject to change. It is described primarily so
    you can understand what went wrong, if things go wrong.

    EXCEPTIONS
        When a thread that has currently released the perl interpreter (e.g.
        because it is executing a perlmulticore enabled XS function)
        receives an exception, it will at first continue normally.

        After acquiring the perl interpreter again, it will throw the
        exception it previously received. More specifically, when a thread
        calls "perlinterp_acquire ()" and has received an exception, then
        "perlinterp_acquire ()" will not return but instead "die".

        Most code that has been updated for perlmulticore support will not
        expect this, and might leave internal state corrupted to some
        extent.

    CANCELLATION
        Unsafe cancellation on a thread that has released the perl
        interpreter frees its resources, but lets the XS code continue at
        first. This should not lead to corruption on the perl level, as the
        code isn't allowed to touch perl data structures until it reacquires
        the interpreter.

        The call to "perlinterp_acquire ()" will then block indefinitely,
        leaking the (OS level) thread.

        Safe cancellation will simply fail in this case, so is still "safe"
        to call.

INTERACTION WITH OTHER SOFTWARE
    This module is very similar to other environments where perl
    interpreters are moved between threads, such as mod_perl2, and the same
    caveats apply.

    I want to spell out the most important ones:

    pthreads usage
        Any creation of pthreads makes it impossible to fork portably from a
        perl program, as forking from within a threaded program will leave
        the program in a state similar to a signal handler. While it might
        work on some platforms (as an extension), this might also result in
        silent data corruption. It also seems to work most of the time, so
        it's hard to test for this.

        I recommend using something like AnyEvent::Fork, which can create
        subprocesses safely (via Proc::FastSpawn).

        Similar issues exist for signal handlers, although this module works
        hard to keep safe perl signals safe.

    module support
        This module moves the same perl interpreter between different
        threads. Some modules might get confused by that (although this can
        usually be considered a bug). This is a rare case though.

    event loop reliance
        To be able to wake up programs waiting for results, this module
        relies on an active event loop (via AnyEvent). This is used to
        notify the perl interpreter when the asynchronous task is done.

        Since event loops typically fail to work properly after a fork, this
        means that some operations that were formerly working will now hang
        after fork.

        A workaround is to call "Coro::Multicore::enable 0" after a fork to
        disable the module.

        Future versions of this module might do this automatically.
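
        The workaround mentioned above could look roughly like this. It is
        only a sketch: the child does no real work here, and the caveats
        about forking from a threaded program described under "pthreads
        usage" still apply:

```perl
use strict;
use warnings;
use POSIX ();
use Coro::Multicore; # enabled by default in the parent

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
   # child: the inherited event loop may no longer work, so switch the
   # module off; multicore-enabled XS calls then simply block as usual
   Coro::Multicore::enable 0;

   # ... child work goes here ...

   POSIX::_exit (0); # leave without running the parent's END blocks
}

waitpid $pid, 0;
my $child_status = $? >> 8;
```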

BUGS
    (OS-) threads are never released
        At the moment, threads that were created once will never be freed.
        They will be reused for asynchronous requests, though, so as long as
        you limit the maximum number of concurrent asynchronous tasks, this
        will also limit the maximum number of threads created.
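
        One possible way to bound the number of concurrent tasks is a
        "Coro::Semaphore". In this sketch, the limit of 4 and the
        "process_job" function are made up for illustration; "process_job"
        stands in for a call into a perlmulticore-enabled XS function:

```perl
use strict;
use warnings;
use Coro;
use Coro::Semaphore;
use Coro::Multicore (); # disabled by default, enabled per thread below

my $max_concurrent = 4; # illustrative limit, not a recommendation
my $slots = Coro::Semaphore->new ($max_concurrent);

# hypothetical stand-in for a perlmulticore-enabled XS function
sub process_job {
   my ($job) = @_;
   $job * 2
}

my @threads = map {
   my $job = $_;

   async {
      Coro::Multicore::scoped_enable;

      # waits until a slot is free, the slot is released at scope exit
      my $guard = $slots->guard;
      process_job ($job)
   }
} 1 .. 10;

my @results = map { $_->join } @threads;
```

        At most four threads hold a slot at any time, so at most four
        asynchronous requests are in flight, regardless of how many jobs
        are queued.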

        The idle threads are not necessarily using a lot of resources: on
        GNU/Linux + glibc, each thread takes about 8KiB of userspace memory
        + whatever the kernel needs (probably less than 8KiB).

        Future versions will likely lift this limitation.

    AnyEvent is initialised at module load time
        AnyEvent is initialised on module load, as opposed to at a later
        time.

        Future versions will likely change this.

AUTHOR
     Marc Lehmann <schmorp@schmorp.de>
     http://software.schmorp.de/pkg/AnyEvent-XSThreadPool.html

    Additional thanks to Zsbán Ambrus, who gave considerable design input
    for this module and the perl multicore specification.

Revision:	1.4
Committed:	Thu Jan 18 16:46:22 2018 UTC (6 years, 3 months ago) by root
Branch:	MAIN
CVS Tags:	rel-1_0, rel-1_02, rel-1_03, rel-1_01, rel-0_03
Changes since 1.3:	+214 -14 lines
Log Message:	0.03