=head1 NAME

Coro::Multicore - make coro threads on multiple cores with specially supported modules

=head1 SYNOPSIS

   # when you DO control the main event loop, e.g. in the main program

   use Coro::Multicore; # enable by default

   Coro::Multicore::scoped_disable;
   AE::cv->recv; # or EV::run, AnyEvent::Loop::run, Event::loop, ...

   # when you DO NOT control the event loop, e.g. in a module on CPAN
   # do nothing (see HOW TO USE IT) or something like this:

   use Coro::Multicore (); # disable by default

   async {
      Coro::Multicore::scoped_enable;

      # blocking is safe in your own threads
      ...
   };

=head1 DESCRIPTION

While L<Coro> threads (unlike ithreads) provide real threads similar to
pthreads, python threads and so on, they do not run in parallel to each
other, even on machines with multiple CPUs or multiple CPU cores.

This module lifts this restriction under two very specific but useful
conditions: firstly, the coro thread executes in XS code and does not
touch any perl data structures, and secondly, the XS code is specially
prepared to allow this.

This means that, when you call an XS function of a module prepared for
it, this XS function can execute in parallel to any other Coro threads.
This is useful for CPU-bound tasks (such as cryptography) as well as
I/O-bound tasks (such as loading an image from disk). It can also be
used to do things in parallel via APIs that were not meant for this,
such as database accesses via DBI.

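As a sketch of what this enables (an assumption, not an example from this
distribution - it presumes a perlmulticore-enabled C<Digest::SHA>, as
listed in the registry, and two hypothetical data files), two coro
threads could hash large files on separate cores:

```perl
use Coro;
use Coro::Multicore;   # enable by default
use Digest::SHA ();

# "a.dat" and "b.dat" are hypothetical large files
my @coros = map {
   my $file = $_;
   async {
      # the SHA-256 computation runs in XS and can release the perl
      # interpreter, so both threads can use separate CPU cores
      my $digest = Digest::SHA->new (256)->addfile ($file)->hexdigest;
      print "$file: $digest\n";
   }
} "a.dat", "b.dat";

$_->join for @coros;
```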
The mechanism to support this is easily added to existing modules and is
independent of L<Coro> or L<Coro::Multicore>, and could therefore be
used, without changes, with other, similar modules, or even the perl
core, should it gain real thread support anytime soon. See
L<http://perlmulticore.schmorp.de/> for more info on how to prepare a
module to allow parallel execution. Preparing an existing module is
easy, adds little overhead and requires no dependencies.

This module is an L<AnyEvent> user (and also, if not obvious, uses
L<Coro>).

=head1 HOW TO USE IT

Quick explanation: decide whether you control the main program/the event
loop, and choose one of the two styles from the SYNOPSIS.

Longer explanation: There are two major modes this module can be used
in - supported operations run asynchronously either by default, or only
when requested. The reason you might not want to enable this module for
all operations by default is compatibility with existing code:

This module integrates into an event loop, and you must not normally
block and wait for something inside an event loop callback. Now imagine
somebody patches your favourite module (e.g. Digest::MD5) to take
advantage of the Perl Multicore API.

Then code that runs in an event loop callback and executes
Digest::MD5::md5 would work fine without C<Coro::Multicore> - it would
simply calculate the MD5 digest and block execution of anything else.
But with C<Coro::Multicore> enabled, the same operation would try to run
other threads. And when those wait for events, there is no event loop
anymore, as the event loop thread is busy doing the MD5 calculation,
leading to a deadlock.

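The problematic pattern might look like this (a sketch only - it assumes
Digest::MD5 were actually perlmulticore-enabled, which is an assumption
about a hypothetical patched version, not a fact about the stock
module):

```perl
use AnyEvent;
use Coro;
use Coro::Multicore;           # enabled by default - the problem
use Digest::MD5 ();

my $data = "x" x 100_000_000;  # something expensive to hash

my $w = AE::timer 1, 0, sub {
   # inside an event loop callback: md5 releases the interpreter,
   # other coro threads run and then wait for events - but the event
   # loop thread is stuck right here, so those events never arrive
   my $digest = Digest::MD5::md5 ($data);
};

AE::cv->recv;
```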
=head2 USE IT IN THE MAIN PROGRAM

One way to avoid this is to not run perlmulticore-enabled functions in
any callbacks. A simpler way to ensure it works is to disable
C<Coro::Multicore> thread switching in event loop callbacks, and enable
it everywhere else.

Therefore, if you control the event loop, as is usually the case when
you write a I<program> and not a I<module>, then you can enable
C<Coro::Multicore> by default, and disable it in your event loop thread:

   # example 1, separate thread for event loop

   use EV;
   use Coro;
   use Coro::Multicore;

   async {
      Coro::Multicore::scoped_disable;
      EV::run;
   };

   # do something else

   # example 2, run event loop as main program

   use EV;
   use Coro;
   use Coro::Multicore;

   Coro::Multicore::scoped_disable;

   ... initialisation

   EV::run;

The latter form is usually better and more idiomatic - the main thread
is the best place to run the event loop.

Often you want to do some initialisation before running the event loop.
The most efficient way to do that is to put your initialisation code
(and main program) into its own thread and run the event loop in your
main program:

   use AnyEvent::Loop;
   use Coro::Multicore; # enable by default

   async {
      load_data;
      do_other_init;
      bind_socket;
      ...
   };

   Coro::Multicore::scoped_disable;
   AnyEvent::Loop::run;

This has the effect of running the event loop first, so the
initialisation code can block if it wants to.

If this is too cumbersome, but you still want to make sure you can call
blocking functions before entering the event loop, you can keep
C<Coro::Multicore> disabled until you can run the event loop:

   use AnyEvent::Loop;
   use Coro::Multicore (); # disable by default

   load_data;
   do_other_init;
   bind_socket;
   ...

   Coro::Multicore::scoped_disable; # disable for event loop
   Coro::Multicore::enable 1;       # enable for the rest of the program
   AnyEvent::Loop::run;

=head2 USE IT IN A MODULE

When you I<do not> control the event loop, for example because you want
to use this from a module you published on CPAN, then the previous
method doesn't work.

However, this is not normally a problem in practice - most modules only
do work at the request of the caller. In that case, you might not care
whether it blocks other threads or not, as this would be the caller's
responsibility (or decision), and by extension, a decision for the main
program.

So unless you use XS and want your XS functions to run asynchronously,
you don't have to worry about C<Coro::Multicore> at all - if you happen
to call XS functions that are multicore-enabled and your caller has
configured things correctly, they will automatically run asynchronously.
Or in other words: nothing needs to be done at all, which also means
that this method works fine for existing pure-perl modules, without
having to change them at all.

Only if your module runs its own L<Coro> threads could it be an issue -
maybe your module implements some kind of job pool and relies on certain
operations running asynchronously. Then you can still use
C<Coro::Multicore> by not enabling it by default and only enabling it in
your own threads:

   use Coro;
   use Coro::Multicore (); # note the () to disable by default

   async {
      Coro::Multicore::scoped_enable;

      # do things asynchronously by calling perlmulticore-enabled functions
   };

=head2 EXPORTS

This module does not (at the moment) export any symbols. It does,
however, export "behaviour" - if you use the default import, then
Coro::Multicore will be enabled for all threads and all callers in the
whole program:

   use Coro::Multicore;

In a module where you don't control what else might be loaded and run,
you might want to be more conservative and not import anything. This
has the effect of not enabling the functionality by default, so you
have to enable it per scope:

   use Coro::Multicore ();

   sub myfunc {
      Coro::Multicore::scoped_enable;

      # from here to the end of this function, and in any functions
      # called from this function, tasks will be executed asynchronously.
   }

=head1 API FUNCTIONS

=over 4

=item $previous = Coro::Multicore::enable [$enable]

This function enables (if C<$enable> is true) or disables (if
C<$enable> is false) the multicore functionality globally. By default,
it is enabled.

This can be used to effectively disable this module's functionality by
default, and enable it only for selected threads or scopes, by calling
C<Coro::Multicore::scoped_enable>.

Note that this setting only affects the I<global default> - it will not
reflect whether multicore functionality is enabled for the current
thread.

The function returns the previous value of the enable flag.

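The returned previous value makes it easy to change the global default
temporarily and restore it afterwards. In this sketch,
C<do_legacy_work> is a hypothetical function standing in for code that
cannot cope with multicore thread switching:

```perl
use Coro::Multicore;                  # globally enabled by default

my $prev = Coro::Multicore::enable 0; # disable, remember old setting
do_legacy_work ();                    # hypothetical multicore-unsafe code
Coro::Multicore::enable $prev;        # restore the previous default
```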
=item Coro::Multicore::scoped_enable

This function instructs Coro::Multicore to handle all requests executed
in the current coro thread, from the call to the end of the current
scope.

Calls to C<scoped_enable> and C<scoped_disable> don't nest very well at
the moment, so don't nest them.

=item Coro::Multicore::scoped_disable

The opposite of C<Coro::Multicore::scoped_enable>: instructs
Coro::Multicore to I<not> handle multicore-enabled requests executed in
the current coro thread, from the call to the end of the current scope.

=back

=cut

package Coro::Multicore;

use Coro ();
use AnyEvent ();

BEGIN {
   our $VERSION = '1.03';

   use XSLoader;
   XSLoader::load __PACKAGE__, $VERSION;
}

sub import {
   if (@_ > 1) {
      require Carp;
      Carp::croak ("Coro::Multicore does not export any symbols");
   }

   enable 1;
}

# fd and poll are provided by the XS code - the watcher wakes up the
# perl interpreter when an asynchronous task is done
our $WATCHER = AE::io fd, 0, \&poll;

=head1 THREAD SAFETY OF SUPPORTING XS MODULES

Just because an XS module supports perlmulticore doesn't immediately
make it reentrant. For example, while you can (try to) call C<execute>
on the same database handle concurrently with the patched C<DBD::mysql>
(see the L<registry|http://perlmulticore.schmorp.de/registry>), this
will almost certainly not work, despite C<DBD::mysql> and
C<libmysqlclient> being thread-safe and reentrant - just not on the
same database handle.

Many modules have limitations such as these - some can only be called
concurrently from a single thread as they use global variables, some
can only be called concurrently on different I<handles> (e.g. database
connections for DBD modules, or digest objects for Digest modules),
and some can be called at any time (such as the C<md5> function in
C<Digest::MD5>).

Generally, you only have to be careful with the very few modules that
use global variables or rely on C libraries that aren't thread-safe,
which should be documented clearly in the module documentation.

Most modules are either perfectly reentrant, or at least reentrant as
long as you give every thread its own I<handle> object.

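The "one handle per thread" rule can be sketched like this - each coro
thread creates its own database handle instead of sharing one. This
assumes the patched C<DBD::mysql> from the registry and a hypothetical
C<dbi:mysql:test> database:

```perl
use Coro;
use Coro::Multicore;
use DBI;

my @coros = map {
   async {
      # a private handle per thread - calling execute concurrently
      # on one shared $dbh would almost certainly fail
      my $dbh = DBI->connect ("dbi:mysql:test", "user", "pass")
         or die $DBI::errstr;
      my $rows = $dbh->selectall_arrayref ("select id, name from items");
      print scalar @$rows, " rows\n";
   }
} 1 .. 4;

$_->join for @coros;
```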
=head1 EXCEPTIONS AND THREAD CANCELLATION

L<Coro> allows you to cancel threads even when they execute within an
XS function (the C<cancel> vs. C<safe_cancel> methods). Similarly,
L<Coro> allows you to send exceptions (e.g. via the C<throw> method) to
threads executing inside an XS function.

While doing this is questionable and dangerous with normal Coro threads
already, both are supported in this module, although with potentially
unwanted effects. The following describes the current implementation
and is subject to change. It is described primarily so you can
understand what went wrong, if things go wrong.

=over 4

=item EXCEPTIONS

When a thread that has currently released the perl interpreter (e.g.
because it is executing a perlmulticore-enabled XS function) receives
an exception, it will at first continue normally.

After acquiring the perl interpreter again, it will throw the exception
it previously received. More specifically, when a thread has received
an exception and calls C<perlinterp_acquire ()>, then
C<perlinterp_acquire ()> will not return, but instead C<die>.

Most code that has been updated for perlmulticore support will not
expect this, and might leave internal state corrupted to some extent.

=item CANCELLATION

Unsafe cancellation of a thread that has released the perl interpreter
frees its resources, but lets the XS code continue at first. This
should not lead to corruption on the perl level, as the code isn't
allowed to touch perl data structures until it reacquires the
interpreter.

The call to C<perlinterp_acquire ()> will then block indefinitely,
leaking the (OS-level) thread.

Safe cancellation will simply fail in this case, so it is still "safe"
to call.

=back

=head1 INTERACTION WITH OTHER SOFTWARE

This module is very similar to other environments where perl
interpreters are moved between threads, such as mod_perl2, and the same
caveats apply.

I want to spell out the most important ones:

=over 4

=item pthreads usage

Any creation of pthreads makes it impossible to fork portably from a
perl program, as forking from within a threaded program will leave the
program in a state similar to a signal handler. While it might work on
some platforms (as an extension), this might also result in silent data
corruption. It also seems to work most of the time, so it's hard to
test for this.

I recommend using something like L<AnyEvent::Fork>, which can create
subprocesses safely (via L<Proc::FastSpawn>).

Similar issues exist for signal handlers, although this module works
hard to keep safe perl signals safe.

=item module support

This module moves the same perl interpreter between different threads.
Some modules might get confused by that (although this can usually be
considered a bug). This is a rare case, though.

=item event loop reliance

To be able to wake up programs waiting for results, this module relies
on an active event loop (via L<AnyEvent>). This is used to notify the
perl interpreter when the asynchronous task is done.

Since event loops typically fail to work properly after a fork, this
means that some operations that formerly worked will hang after a fork.

A workaround is to call C<Coro::Multicore::enable 0> after a fork to
disable the module.

Future versions of this module might do this automatically.

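The workaround can be sketched like this (a hypothetical program
structure, not code from this distribution):

```perl
use Coro::Multicore;

my $pid = fork;
die "fork failed: $!" unless defined $pid;

unless ($pid) {
   # in the child, the inherited event loop cannot be relied upon, so
   # asynchronous requests could hang - disable the module and fall
   # back to plain blocking behaviour
   Coro::Multicore::enable 0;
   ...;
   exit 0;
}
```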
=back

=head1 BUGS

=over 4

=item (OS-) threads are never released

At the moment, threads that were created once will never be freed. They
will be reused for asynchronous requests, though, so as long as you
limit the maximum number of concurrent asynchronous tasks, this will
also limit the maximum number of threads created.

The idle threads do not necessarily use a lot of resources: on
GNU/Linux + glibc, each thread takes about 8KiB of userspace memory,
plus whatever the kernel needs (probably less than 8KiB).

Future versions will likely lift this limitation.

=item AnyEvent is initialised at module load time

AnyEvent is initialised on module load, as opposed to at a later time.

Future versions will likely change this.

=back

=head1 AUTHOR

   Marc Lehmann <schmorp@schmorp.de>
   http://software.schmorp.de/pkg/AnyEvent-XSThreadPool.html

Additional thanks to Zsbán Ambrus, who gave considerable design input
for this module and the perl multicore specification.

=cut

1
