NAME
    Coro::Multicore - make coro threads on multiple cores with specially
    supported modules

SYNOPSIS
        # when you DO control the main event loop, e.g. in the main program

        use Coro::Multicore; # enable by default

        Coro::Multicore::scoped_disable;
        AE::cv->recv; # or EV::run, AnyEvent::Loop::run, Event::loop, ...

        # when you DO NOT control the event loop, e.g. in a module on CPAN
        # do nothing (see HOW TO USE IT) or something like this:

        use Coro::Multicore (); # disable by default

        async {
           Coro::Multicore::scoped_enable;

           # blocking is safe in your own threads
           ...
        };

DESCRIPTION
    While Coro threads (unlike ithreads) provide real threads similar to
    pthreads, python threads and so on, they do not run in parallel to each
    other, even on machines with multiple CPUs or multiple CPU cores.

    This module lifts this restriction under two very specific but useful
    conditions: firstly, the coro thread executes in XS code and does not
    touch any perl data structures, and secondly, the XS code is specially
    prepared to allow this.

    This means that, when you call an XS function of a module prepared for
    it, this XS function can execute in parallel to any other Coro threads.
    This is useful for both CPU-bound tasks (such as cryptography) and
    I/O-bound tasks (such as loading an image from disk). It can also be
    used to do things in parallel via APIs that were not meant for this,
    such as database accesses via DBI.

    The mechanism to support this is easily added to existing modules and
    is independent of Coro or Coro::Multicore, and therefore could be used,
    without changes, with other, similar modules, or even the perl core,
    should it gain real thread support anytime soon. See
    <http://perlmulticore.schmorp.de/> for more info on how to prepare a
    module to allow parallel execution. Preparing an existing module is
    easy, doesn't add much overhead and adds no dependencies.

    This module is an AnyEvent user (and also, if not obvious, uses Coro).

HOW TO USE IT
    Quick explanation: decide whether you control the main program/the
    event loop and choose one of the two styles from the SYNOPSIS.

    Longer explanation: There are two major modes this module can be used
    in - supported operations run asynchronously either by default, or
    only when requested. The reason you might not want to enable this
    module for all operations by default is compatibility with existing
    code:

    This module integrates into an event loop, and you must not normally
    block and wait for something inside event loop callbacks. Now imagine
    somebody patches your favourite module (e.g. Digest::MD5) to take
    advantage of the Perl Multicore API.

    Then code that runs in an event loop callback and executes
    Digest::MD5::md5 would work fine without "Coro::Multicore" - it would
    simply calculate the MD5 digest and block execution of anything else.
    But with "Coro::Multicore" enabled, the same operation would instead
    run other threads. And when those wait for events, there is no event
    loop anymore, as the event loop thread is busy doing the MD5
    calculation, leading to a deadlock.
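    The problematic pattern looks like this (a hypothetical sketch - it
    assumes a Digest::MD5 patched for perlmulticore):

        use AnyEvent;
        use Coro::Multicore; # enabled by default - risky in callbacks
        use Digest::MD5 ();

        my $w = AE::timer 1, 0, sub {
           # this callback runs in the event loop thread; if md5 releases
           # the interpreter here, other threads run, but nothing drives
           # the event loop anymore - a potential deadlock
           my $digest = Digest::MD5::md5 $some_large_data;
        };

    The following sections show how to avoid this.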
  USE IT IN THE MAIN PROGRAM
    One way to avoid this is to not run perlmulticore-enabled functions in
    any callbacks. A simpler way to ensure it works is to disable
    "Coro::Multicore" thread switching in event loop callbacks, and enable
    it everywhere else.

    Therefore, if you control the event loop, as is usually the case when
    you write a *program* and not a *module*, then you can enable
    "Coro::Multicore" by default, and disable it in your event loop
    thread:

        # example 1, separate thread for event loop

        use EV;
        use Coro;
        use Coro::Multicore;

        async {
           Coro::Multicore::scoped_disable;
           EV::run;
        };

        # do something else

        # example 2, run event loop as main program

        use EV;
        use Coro;
        use Coro::Multicore;

        Coro::Multicore::scoped_disable;

        ... initialisation

        EV::run;

    The latter form is usually better and more idiomatic - the main
    thread is the best place to run the event loop.

    Often you want to do some initialisation before running the event
    loop. The most efficient way to do that is to put your initialisation
    code (and main program) into its own thread and run the event loop in
    your main program:

        use AnyEvent::Loop;
        use Coro::Multicore; # enable by default

        async {
           load_data;
           do_other_init;
           bind_socket;
           ...
        };

        Coro::Multicore::scoped_disable;
        AnyEvent::Loop::run;

    This has the effect of running the event loop first, so the
    initialisation code can block if it wants to.

    If this is too cumbersome but you still want to make sure you can
    call blocking functions before entering the event loop, you can keep
    "Coro::Multicore" disabled until you can run the event loop:

        use AnyEvent::Loop;
        use Coro::Multicore (); # disable by default

        load_data;
        do_other_init;
        bind_socket;
        ...

        Coro::Multicore::scoped_disable; # disable for event loop
        Coro::Multicore::enable 1; # enable for the rest of the program
        AnyEvent::Loop::run;

  USE IT IN A MODULE
    When you *do not* control the event loop, for example, because you
    want to use this from a module you published on CPAN, then the
    previous method doesn't work.

    However, this is not normally a problem in practice - most modules
    only do work at the request of the caller. In that case, you might
    not care whether it blocks other threads or not, as this would be the
    caller's responsibility (or decision), and by extension, a decision
    for the main program.

    So unless you use XS and want your XS functions to run
    asynchronously, you don't have to worry about "Coro::Multicore" at
    all - if you happen to call XS functions that are multicore-enabled
    and your caller has configured things correctly, they will
    automatically run asynchronously. Or in other words: nothing needs to
    be done at all, which also means that this method works fine for
    existing pure-perl modules, without having to change them at all.

    Only if your module runs its own Coro threads could it be an issue -
    maybe your module implements some kind of job pool and relies on
    certain operations running asynchronously. Then you can still use
    "Coro::Multicore" by not enabling it by default and only enabling it
    in your own threads:

        use Coro;
        use Coro::Multicore (); # note the () to disable by default

        async {
           Coro::Multicore::scoped_enable;

           # do things asynchronously by calling perlmulticore-enabled functions
        };

EXPORTS
    This module does not (at the moment) export any symbols. It does,
    however, export "behaviour" - if you use the default import, then
    Coro::Multicore will be enabled for all threads and all callers in
    the whole program:

        use Coro::Multicore;

    In a module where you don't control what else might be loaded and
    run, you might want to be more conservative, and not import anything.
    This has the effect of not enabling the functionality by default, so
    you have to enable it per scope:

        use Coro::Multicore ();

        sub myfunc {
           Coro::Multicore::scoped_enable;

           # from here to the end of this function, and in any functions
           # called from this function, tasks will be executed asynchronously.
        }

API FUNCTIONS
    $previous = Coro::Multicore::enable [$enable]
        This function enables (if $enable is true) or disables (if
        $enable is false) the multicore functionality globally. By
        default, it is enabled.

        This can be used to effectively disable this module's
        functionality by default, and enable it only for selected threads
        or scopes, by calling "Coro::Multicore::scoped_enable".

        The function returns the previous value of the enable flag.
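        Since the previous value is returned, "enable" can be used to
        temporarily override the global setting and later restore it
        (a minimal sketch):

            use Coro::Multicore ();

            # globally disable, remembering the previous state
            my $old = Coro::Multicore::enable 0;

            # ... run code that must not release the interpreter ...

            # restore whatever the setting was before
            Coro::Multicore::enable $old;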
    Coro::Multicore::scoped_enable
        This function instructs Coro::Multicore to handle all requests
        executed in the current coro thread, from the call to the end of
        the current scope.

        Calls to "scoped_enable" and "scoped_disable" don't nest very
        well at the moment, so don't nest them.

    Coro::Multicore::scoped_disable
        The opposite of "Coro::Multicore::scoped_enable": instructs
        Coro::Multicore to *not* handle multicore-enabled requests in the
        current coro thread, from the call to the end of the current
        scope.

THREAD SAFETY OF SUPPORTING XS MODULES
    Just because an XS module supports perlmulticore does not immediately
    make it reentrant. For example, while you can (try to) call "execute"
    on the same database handle for the patched "DBD::mysql" (see the
    registry <http://perlmulticore.schmorp.de/registry>), this will
    almost certainly not work, despite "DBD::mysql" and "libmysqlclient"
    being thread-safe and reentrant - just not on the same database
    handle.

    Many modules have limitations such as these - some can only be
    called concurrently from a single thread as they use global
    variables, some can only be called concurrently on different
    *handles* (e.g. database connections for DBD modules, or digest
    objects for Digest modules), and some can be called at any time
    (such as the "md5" function in "Digest::MD5").

    Generally, you only have to be careful with the very few modules
    that use global variables or rely on C libraries that aren't
    thread-safe, which should be documented clearly in the module
    documentation.

    Most modules are either perfectly reentrant, or at least reentrant
    as long as you give every thread its own *handle* object.
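    The safe per-handle pattern can be sketched like this - each thread
    creates its own object instead of sharing one (a hypothetical sketch,
    assuming a perlmulticore-enabled Digest::MD5):

        use Coro;
        use Coro::Multicore;
        use Digest::MD5 ();

        my @threads = map {
           my $data = $_;
           async {
              # each thread gets its own digest object (its own
              # *handle*), so concurrent execution is safe
              my $md5 = Digest::MD5->new;
              $md5->add ($data);
              $md5->hexdigest
           }
        } @chunks;

        my @digests = map $_->join, @threads;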
EXCEPTIONS AND THREAD CANCELLATION
    Coro allows you to cancel threads even when they execute within an
    XS function ("cancel" vs. "safe_cancel" methods). Similarly, Coro
    allows you to send exceptions (e.g. via the "throw" method) to
    threads executing inside an XS function.

    While doing this is questionable and dangerous with normal Coro
    threads already, both are supported in this module, although with
    potentially unwanted effects. The following describes the current
    implementation and is subject to change. It is described primarily
    so you can understand what went wrong, if things go wrong.

    EXCEPTIONS
        When a thread that has currently released the perl interpreter
        (e.g. because it is executing a perlmulticore-enabled XS
        function) receives an exception, it will at first continue
        normally.

        After acquiring the perl interpreter again, it will throw the
        exception it previously received. More specifically, when a
        thread calls "perlinterp_acquire ()" and has received an
        exception, then "perlinterp_acquire ()" will not return but
        instead "die".

        Most code that has been updated for perlmulticore support will
        not expect this, and might leave internal state corrupted to
        some extent.
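        From the perl side, this means an exception sent via "throw"
        surfaces as an ordinary "die" from the XS call, so it can be
        caught with "eval" (a sketch; the XS function name is
        hypothetical):

            use Coro;
            use Coro::Multicore;

            my $thread = async {
               # if another thread calls ->throw on us while we are
               # inside a perlmulticore-enabled XS function, the
               # exception is raised when the XS code reacquires the
               # interpreter - i.e. the XS call dies
               my $result = eval { some_multicore_enabled_xs_function () };
               warn "XS call was interrupted: $@" if $@;
            };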
    CANCELLATION
        Unsafe cancellation on a thread that has released the perl
        interpreter frees its resources, but lets the XS code continue
        at first. This should not lead to corruption on the perl level,
        as the code isn't allowed to touch perl data structures until it
        reacquires the interpreter.

        The call to "perlinterp_acquire ()" will then block
        indefinitely, leaking the (OS level) thread.

        Safe cancellation will simply fail in this case, so it is still
        "safe" to call.

INTERACTION WITH OTHER SOFTWARE
    This module is very similar to other environments where perl
    interpreters are moved between threads, such as mod_perl2, and the
    same caveats apply.

    I want to spell out the most important ones:

    pthreads usage
        Any creation of pthreads makes it impossible to fork portably
        from a perl program, as forking from within a threaded program
        will leave the program in a state similar to a signal handler.
        While it might work on some platforms (as an extension), this
        might also result in silent data corruption. It also seems to
        work most of the time, so it's hard to test for this.

        I recommend using something like AnyEvent::Fork, which can
        create subprocesses safely (via Proc::FastSpawn).

        Similar issues exist for signal handlers, although this module
        works hard to keep perl's safe signals safe.

    module support
        This module moves the same perl interpreter between different
        threads. Some modules might get confused by that (although this
        can usually be considered a bug). This is a rare case though.

    event loop reliance
        To be able to wake up programs waiting for results, this module
        relies on an active event loop (via AnyEvent). This is used to
        notify the perl interpreter when the asynchronous task is done.

        Since event loops typically fail to work properly after a fork,
        this means that some operations that formerly worked will hang
        after a fork.

        A workaround is to call "Coro::Multicore::enable 0" after a
        fork to disable the module.
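        The workaround can be applied right after forking (a minimal
        sketch):

            use Coro::Multicore;

            my $pid = fork;
            defined $pid or die "fork failed: $!";

            unless ($pid) {
               # in the child: the parent's event loop is gone, so
               # disable Coro::Multicore to avoid hanging on
               # multicore-enabled calls
               Coro::Multicore::enable 0;
               # ... child code ...
            }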
        Future versions of this module might do this automatically.

BUGS
    (OS-) threads are never released
        At the moment, threads that were created once will never be
        freed. They will be reused for asynchronous requests, though, so
        as long as you limit the maximum number of concurrent
        asynchronous tasks, this will also limit the maximum number of
        threads created.

        The idle threads do not necessarily use a lot of resources: on
        GNU/Linux + glibc, each thread takes about 8KiB of userspace
        memory + whatever the kernel needs (probably less than 8KiB).

        Future versions will likely lift this limitation.

    AnyEvent is initialised at module load time
        AnyEvent is initialised on module load, as opposed to at a later
        time.

        Future versions will likely change this.

AUTHOR
    Marc Lehmann <schmorp@schmorp.de>
    http://software.schmorp.de/pkg/AnyEvent-XSThreadPool.html

    Additional thanks to Zsbán Ambrus, who gave considerable design
    input for this module and the perl multicore specification.