NAME
    Coro::Multicore - make coro threads on multiple cores with specially
    supported modules

SYNOPSIS
        # when you DO control the main event loop, e.g. in the main program

        use Coro::Multicore; # enable by default

        Coro::Multicore::scoped_disable;
        AE::cv->recv; # or EV::run, AnyEvent::Loop::run, Event::loop, ...

        # when you DO NOT control the event loop, e.g. in a module on CPAN
        # do nothing (see HOW TO USE IT) or something like this:

        use Coro::Multicore (); # disable by default

        async {
           Coro::Multicore::scoped_enable;

           # blocking is safe in your own threads
           ...
        };

DESCRIPTION
    While Coro threads (unlike ithreads) provide real threads similar to
    pthreads, python threads and so on, they do not run in parallel to each
    other, even on machines with multiple CPUs or multiple CPU cores.

    This module lifts this restriction under two very specific but useful
    conditions: firstly, the coro thread executes in XS code and does not
    touch any perl data structures, and secondly, the XS code is specially
    prepared to allow this.

    This means that, when you call an XS function of a module prepared for
    it, this XS function can execute in parallel to any other Coro threads.
    This is useful for both CPU-bound tasks (such as cryptography) and
    I/O-bound tasks (such as loading an image from disk). It can also be
    used to do things in parallel via APIs that were not meant for this,
    such as database accesses via DBI.

    The mechanism to support this is easily added to existing modules and
    is independent of Coro or Coro::Multicore, and therefore could be used,
    without changes, with other, similar modules, or even the perl core,
    should it gain real thread support anytime soon. See
    <http://perlmulticore.schmorp.de/> for more info on how to prepare a
    module to allow parallel execution. Preparing an existing module is
    easy, doesn't add much overhead and adds no dependencies.

    This module is an AnyEvent user (and also, if not obvious, uses Coro).

HOW TO USE IT
    Quick explanation: decide whether you control the main program/the
    event loop and choose one of the two styles from the SYNOPSIS.

    Longer explanation: There are two major modes this module can be used
    in - supported operations run asynchronously either by default, or
    only when requested. The reason you might not want to enable this
    module for all operations by default is compatibility with existing
    code:

    This module integrates into an event loop, and you must not normally
    block and wait for something inside event loop callbacks. Now imagine
    somebody patches your favourite module (e.g. Digest::MD5) to take
    advantage of the Perl Multicore API.

    Then code that runs in an event loop callback and executes
    Digest::MD5::md5 would work fine without "Coro::Multicore" - it would
    simply calculate the MD5 digest and block execution of anything else.
    But with "Coro::Multicore" enabled, the same operation would instead
    run other threads. And when those wait for events, there is no event
    loop anymore, as the event loop thread is busy doing the MD5
    calculation, leading to a deadlock.
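    The problematic pattern looks like this (a hypothetical sketch - it
    assumes a Digest::MD5 patched for perlmulticore):

        use AnyEvent;
        use Coro::Multicore; # enabled by default - risky in callbacks
        use Digest::MD5 ();

        my $w = AE::timer 1, 0, sub {
           # this callback runs in the event loop thread; if md5 releases
           # the interpreter here, other threads run, but nothing drives
           # the event loop anymore - a potential deadlock
           my $digest = Digest::MD5::md5 $some_large_data;
        };

    The following sections show how to avoid this.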
  USE IT IN THE MAIN PROGRAM
    One way to avoid this is to not run perlmulticore-enabled functions in
    any callbacks. A simpler way to ensure it works is to disable
    "Coro::Multicore" thread switching in event loop callbacks, and enable
    it everywhere else.

    Therefore, if you control the event loop, as is usually the case when
    you write a *program* and not a *module*, then you can enable
    "Coro::Multicore" by default, and disable it in your event loop
    thread:

        # example 1, separate thread for event loop

        use EV;
        use Coro;
        use Coro::Multicore;

        async {
           Coro::Multicore::scoped_disable;
           EV::run;
        };

        # do something else

        # example 2, run event loop as main program

        use EV;
        use Coro;
        use Coro::Multicore;

        Coro::Multicore::scoped_disable;

        ... initialisation

        EV::run;

    The latter form is usually better and more idiomatic - the main
    thread is the best place to run the event loop.

    Often you want to do some initialisation before running the event
    loop. The most efficient way to do that is to put your initialisation
    code (and main program) into its own thread and run the event loop in
    your main program:

        use AnyEvent::Loop;
        use Coro::Multicore; # enable by default

        async {
           load_data;
           do_other_init;
           bind_socket;
           ...
        };

        Coro::Multicore::scoped_disable;
        AnyEvent::Loop::run;

    This has the effect of running the event loop first, so the
    initialisation code can block if it wants to.

    If this is too cumbersome but you still want to make sure you can
    call blocking functions before entering the event loop, you can keep
    "Coro::Multicore" disabled until you can run the event loop:

        use AnyEvent::Loop;
        use Coro::Multicore (); # disable by default

        load_data;
        do_other_init;
        bind_socket;
        ...

        Coro::Multicore::scoped_disable; # disable for event loop
        Coro::Multicore::enable 1; # enable for the rest of the program
        AnyEvent::Loop::run;

  USE IT IN A MODULE
    When you *do not* control the event loop, for example, because you
    want to use this from a module you published on CPAN, then the
    previous method doesn't work.

    However, this is not normally a problem in practice - most modules
    only do work at the request of the caller. In that case, you might
    not care whether it blocks other threads or not, as this would be the
    caller's responsibility (or decision), and by extension, a decision
    for the main program.

    So unless you use XS and want your XS functions to run
    asynchronously, you don't have to worry about "Coro::Multicore" at
    all - if you happen to call XS functions that are multicore-enabled
    and your caller has configured things correctly, they will
    automatically run asynchronously. Or in other words: nothing needs to
    be done at all, which also means that this method works fine for
    existing pure-perl modules, without having to change them at all.

    Only if your module runs its own Coro threads could it be an issue -
    maybe your module implements some kind of job pool and relies on
    certain operations running asynchronously. Then you can still use
    "Coro::Multicore" by not enabling it by default and only enabling it
    in your own threads:

        use Coro;
        use Coro::Multicore (); # note the () to disable by default

        async {
           Coro::Multicore::scoped_enable;

           # do things asynchronously by calling perlmulticore-enabled functions
        };

EXPORTS
    This module does not (at the moment) export any symbols. It does,
    however, export "behaviour" - if you use the default import, then
    Coro::Multicore will be enabled for all threads and all callers in
    the whole program:

        use Coro::Multicore;

    In a module where you don't control what else might be loaded and
    run, you might want to be more conservative, and not import anything.
    This has the effect of not enabling the functionality by default, so
    you have to enable it per scope:

        use Coro::Multicore ();

        sub myfunc {
           Coro::Multicore::scoped_enable;

           # from here to the end of this function, and in any functions
           # called from this function, tasks will be executed asynchronously.
        }

API FUNCTIONS
    $previous = Coro::Multicore::enable [$enable]
        This function enables (if $enable is true) or disables (if
        $enable is false) the multicore functionality globally. By
        default, it is enabled.

        This can be used to effectively disable this module's
        functionality by default, and enable it only for selected threads
        or scopes, by calling "Coro::Multicore::scoped_enable".

        The function returns the previous value of the enable flag.
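        Since the previous value is returned, "enable" can be used to
        temporarily override the global setting and later restore it
        (a minimal sketch):

            use Coro::Multicore ();

            # globally disable, remembering the previous state
            my $old = Coro::Multicore::enable 0;

            # ... run code that must not release the interpreter ...

            # restore whatever the setting was before
            Coro::Multicore::enable $old;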
    Coro::Multicore::scoped_enable
        This function instructs Coro::Multicore to handle all requests
        executed in the current coro thread, from the call to the end of
        the current scope.

        Calls to "scoped_enable" and "scoped_disable" don't nest very
        well at the moment, so don't nest them.

    Coro::Multicore::scoped_disable
        The opposite of "Coro::Multicore::scoped_enable": instructs
        Coro::Multicore to *not* handle multicore-enabled requests in the
        current coro thread, from the call to the end of the current
        scope.

THREAD SAFETY OF SUPPORTING XS MODULES
    Just because an XS module supports perlmulticore does not immediately
    make it reentrant. For example, while you can (try to) call "execute"
    on the same database handle for the patched "DBD::mysql" (see the
    registry <http://perlmulticore.schmorp.de/registry>), this will
    almost certainly not work, despite "DBD::mysql" and "libmysqlclient"
    being thread-safe and reentrant - just not on the same database
    handle.

    Many modules have limitations such as these - some can only be
    called concurrently from a single thread as they use global
    variables, some can only be called concurrently on different
    *handles* (e.g. database connections for DBD modules, or digest
    objects for Digest modules), and some can be called at any time
    (such as the "md5" function in "Digest::MD5").

    Generally, you only have to be careful with the very few modules
    that use global variables or rely on C libraries that aren't
    thread-safe, which should be documented clearly in the module
    documentation.

    Most modules are either perfectly reentrant, or at least reentrant
    as long as you give every thread its own *handle* object.
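    The safe per-handle pattern can be sketched like this - each thread
    creates its own object instead of sharing one (a hypothetical sketch,
    assuming a perlmulticore-enabled Digest::MD5):

        use Coro;
        use Coro::Multicore;
        use Digest::MD5 ();

        my @threads = map {
           my $data = $_;
           async {
              # each thread gets its own digest object (its own
              # *handle*), so concurrent execution is safe
              my $md5 = Digest::MD5->new;
              $md5->add ($data);
              $md5->hexdigest
           }
        } @chunks;

        my @digests = map $_->join, @threads;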
EXCEPTIONS AND THREAD CANCELLATION
    Coro allows you to cancel threads even when they execute within an
    XS function ("cancel" vs. "safe_cancel" methods). Similarly, Coro
    allows you to send exceptions (e.g. via the "throw" method) to
    threads executing inside an XS function.

    While doing this is questionable and dangerous with normal Coro
    threads already, both are supported in this module, although with
    potentially unwanted effects. The following describes the current
    implementation and is subject to change. It is described primarily
    so you can understand what went wrong, if things go wrong.

    EXCEPTIONS
        When a thread that has currently released the perl interpreter
        (e.g. because it is executing a perlmulticore-enabled XS
        function) receives an exception, it will at first continue
        normally.

        After acquiring the perl interpreter again, it will throw the
        exception it previously received. More specifically, when a
        thread calls "perlinterp_acquire ()" and has received an
        exception, then "perlinterp_acquire ()" will not return but
        instead "die".

        Most code that has been updated for perlmulticore support will
        not expect this, and might leave internal state corrupted to
        some extent.
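        From the perl side, this means an exception sent via "throw"
        surfaces as an ordinary "die" from the XS call, so it can be
        caught with "eval" (a sketch; the XS function name is
        hypothetical):

            use Coro;
            use Coro::Multicore;

            my $thread = async {
               # if another thread calls ->throw on us while we are
               # inside a perlmulticore-enabled XS function, the
               # exception is raised when the XS code reacquires the
               # interpreter - i.e. the XS call dies
               my $result = eval { some_multicore_enabled_xs_function () };
               warn "XS call was interrupted: $@" if $@;
            };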
    CANCELLATION
        Unsafe cancellation on a thread that has released the perl
        interpreter frees its resources, but lets the XS code continue
        at first. This should not lead to corruption on the perl level,
        as the code isn't allowed to touch perl data structures until it
        reacquires the interpreter.

        The call to "perlinterp_acquire ()" will then block
        indefinitely, leaking the (OS level) thread.

        Safe cancellation will simply fail in this case, so it is still
        "safe" to call.

INTERACTION WITH OTHER SOFTWARE
    This module is very similar to other environments where perl
    interpreters are moved between threads, such as mod_perl2, and the
    same caveats apply.

    I want to spell out the most important ones:

    pthreads usage
        Any creation of pthreads makes it impossible to fork portably
        from a perl program, as forking from within a threaded program
        will leave the program in a state similar to a signal handler.
        While it might work on some platforms (as an extension), this
        might also result in silent data corruption. It also seems to
        work most of the time, so it's hard to test for this.

        I recommend using something like AnyEvent::Fork, which can
        create subprocesses safely (via Proc::FastSpawn).

        Similar issues exist for signal handlers, although this module
        works hard to keep perl's safe signals safe.

    module support
        This module moves the same perl interpreter between different
        threads. Some modules might get confused by that (although this
        can usually be considered a bug). This is a rare case though.

    event loop reliance
        To be able to wake up programs waiting for results, this module
        relies on an active event loop (via AnyEvent). This is used to
        notify the perl interpreter when the asynchronous task is done.

        Since event loops typically fail to work properly after a fork,
        this means that some operations that formerly worked will hang
        after a fork.

        A workaround is to call "Coro::Multicore::enable 0" after a
        fork to disable the module.
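        The workaround can be applied right after forking (a minimal
        sketch):

            use Coro::Multicore;

            my $pid = fork;
            defined $pid or die "fork failed: $!";

            unless ($pid) {
               # in the child: the parent's event loop is gone, so
               # disable Coro::Multicore to avoid hanging on
               # multicore-enabled calls
               Coro::Multicore::enable 0;
               # ... child code ...
            }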
        Future versions of this module might do this automatically.

BUGS
    (OS-) threads are never released
        At the moment, threads that were created once will never be
        freed. They will be reused for asynchronous requests, though, so
        as long as you limit the maximum number of concurrent
        asynchronous tasks, this will also limit the maximum number of
        threads created.

        The idle threads do not necessarily use a lot of resources: on
        GNU/Linux + glibc, each thread takes about 8KiB of userspace
        memory + whatever the kernel needs (probably less than 8KiB).

        Future versions will likely lift this limitation.

    AnyEvent is initialised at module load time
        AnyEvent is initialised on module load, as opposed to at a later
        time.

        Future versions will likely change this.

AUTHOR
    Marc Lehmann <schmorp@schmorp.de>
    http://software.schmorp.de/pkg/AnyEvent-XSThreadPool.html

    Additional thanks to Zsbán Ambrus, who gave considerable design
    input for this module and the perl multicore specification.