NAME
    Coro::Multicore - make coro threads on multiple cores with specially
    supported modules

SYNOPSIS
    # when you DO control the main event loop, e.g. in the main program

    use Coro::Multicore; # enable by default

    Coro::Multicore::scoped_disable;
    AE::cv->recv; # or EV::run, AnyEvent::Loop::run, Event::loop, ...

    # when you DO NOT control the event loop, e.g. in a module on CPAN
    # do nothing (see HOW TO USE IT) or something like this:

    use Coro::Multicore (); # disable by default

    async {
        Coro::Multicore::scoped_enable;

        # blocking is safe in your own threads
        ...
    };

DESCRIPTION
    While Coro threads (unlike ithreads) provide real threads similar to
    pthreads, python threads and so on, they do not run in parallel to each
    other even on machines with multiple CPUs or multiple CPU cores.

    This module lifts this restriction under two very specific but useful
    conditions: firstly, the coro thread executes in XS code and does not
    touch any perl data structures, and secondly, the XS code is specially
    prepared to allow this.

    This means that, when you call an XS function of a module prepared for
    it, this XS function can execute in parallel to any other Coro threads.
    This is useful for both CPU bound tasks (such as cryptography) and I/O
    bound tasks (such as loading an image from disk). It can also be used to
    do things in parallel via APIs that were not meant for this, such as
    database accesses via DBI.

    The mechanism to support this is easily added to existing modules and is
    independent of Coro or Coro::Multicore, and therefore could be used,
    without changes, with other, similar, modules, or even the perl core,
    should it gain real thread support anytime soon. See
    <http://perlmulticore.schmorp.de/> for more info on how to prepare a
    module to allow parallel execution. Preparing an existing module is
    easy, adds little overhead and introduces no dependencies.

    This module is an AnyEvent user (and also, if not obvious, uses Coro).

HOW TO USE IT
    Quick explanation: decide whether you control the main program/the event
    loop and choose one of the two styles from the SYNOPSIS.

    Longer explanation: There are two major modes this module can be used
    in - supported operations run asynchronously either by default, or only
    when requested. The reason you might not want to enable this module for
    all operations by default is compatibility with existing code:

    This module integrates into an event loop, and you must not normally
    block and wait for something in an event loop callback. Now imagine
    somebody patches your favourite module (e.g. Digest::MD5) to take
    advantage of the Perl Multicore API.

    Then code that runs in an event loop callback and executes
    Digest::MD5::md5 would work fine without "Coro::Multicore" - it would
    simply calculate the MD5 digest and block execution of anything else.
    But with "Coro::Multicore" enabled, the same operation would try to run
    other threads. And when those wait for events, there is no event loop
    anymore, as the event loop thread is busy doing the MD5 calculation,
    leading to a deadlock.

  USE IT IN THE MAIN PROGRAM
    One way to avoid this is to not run perlmulticore enabled functions in
    any callbacks. A simpler way to ensure it works is to disable
    "Coro::Multicore" thread switching in event loop callbacks, and enable
    it everywhere else.

    Therefore, if you control the event loop, as is usually the case when
    you write a *program* and not a *module*, then you can enable
    "Coro::Multicore" by default, and disable it in your event loop thread:

        # example 1, separate thread for event loop

        use EV;
        use Coro;
        use Coro::Multicore;

        async {
            Coro::Multicore::scoped_disable;
            EV::run;
        };

        # do something else

        # example 2, run event loop as main program

        use EV;
        use Coro;
        use Coro::Multicore;

        Coro::Multicore::scoped_disable;

        # ... initialisation

        EV::run;

    The latter form is usually better and more idiomatic - the main thread
    is the best place to run the event loop.

    Often you want to do some initialisation before running the event loop.
    The most efficient way to do that is to put your initialisation code
    (and main program) into its own thread and run the event loop in your
    main program:

        use AnyEvent::Loop;
        use Coro;            # provides async
        use Coro::Multicore; # enable by default

        async {
            load_data;
            do_other_init;
            bind_socket;
            ...
        };

        Coro::Multicore::scoped_disable;
        AnyEvent::Loop::run;

    This has the effect of running the event loop first, so the
    initialisation code can block if it wants to.

    If this is too cumbersome but you still want to make sure you can call
    blocking functions before entering the event loop, you can keep
    "Coro::Multicore" disabled until you can run the event loop:

        use AnyEvent::Loop;
        use Coro::Multicore (); # disable by default

        load_data;
        do_other_init;
        bind_socket;
        ...

        Coro::Multicore::scoped_disable; # disable for event loop
        Coro::Multicore::enable 1; # enable for the rest of the program
        AnyEvent::Loop::run;

  USE IT IN A MODULE
    When you *do not* control the event loop, for example, because you want
    to use this from a module you published on CPAN, then the previous
    method doesn't work.

    However, this is not normally a problem in practice - most modules only
    do work at the request of the caller. In that case, you might not care
    whether it blocks other threads or not, as this would be the caller's
    responsibility (or decision), and by extension, a decision for the main
    program.

    So unless you use XS and want your XS functions to run asynchronously,
    you don't have to worry about "Coro::Multicore" at all - if you happen
    to call XS functions that are multicore-enabled and your caller has
    configured things correctly, they will automatically run asynchronously.
    Or in other words: nothing needs to be done at all, which also means
    that this method works fine for existing pure-perl modules, without
    having to change them at all.

    Only if your module runs its own Coro threads could it be an issue -
    maybe your module implements some kind of job pool and relies on certain
    operations to run asynchronously. Then you can still use
    "Coro::Multicore" by not enabling it by default and only enabling it in
    your own threads:

        use Coro;
        use Coro::Multicore (); # note the () to disable by default

        async {
            Coro::Multicore::scoped_enable;

            # do things asynchronously by calling perlmulticore-enabled functions
        };

EXPORTS
    This module does not (at the moment) export any symbols. It does,
    however, export "behaviour" - if you use the default import, then
    Coro::Multicore will be enabled for all threads and all callers in the
    whole program:

        use Coro::Multicore;

    In a module where you don't control what else might be loaded and run,
    you might want to be more conservative, and not import anything. This
    has the effect of not enabling the functionality by default, so you have
    to enable it per scope:

        use Coro::Multicore ();

        sub myfunc {
            Coro::Multicore::scoped_enable;

            # from here to the end of this function, and in any functions
            # called from this function, tasks will be executed asynchronously.
        }

API FUNCTIONS
    $previous = Coro::Multicore::enable [$enable]
        This function enables (if $enable is true) or disables (if $enable
        is false) the multicore functionality globally. By default, it is
        enabled.

        This can be used to effectively disable this module's functionality
        by default, and enable it only for selected threads or scopes, by
        calling "Coro::Multicore::scoped_enable".

        Note that this setting only affects the *global default* - it will
        not reflect whether multicore functionality is enabled for the
        current thread.

        The function returns the previous value of the enable flag.

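        Because the previous value is returned, the global default can be
        switched off for a section of code and restored afterwards. The
        following is a minimal sketch of that pattern, not part of the
        module itself; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Coro::Multicore ();   # load the module without the default import

# Switch the global default off and remember what it was before.
my $previous = Coro::Multicore::enable(0);

# ... code that should not switch threads inside multicore-enabled calls ...

# Restore the old value. The call returns the value that was in effect
# during the section above, which is false because we disabled it.
my $during = Coro::Multicore::enable($previous);
```

        Threads that called "scoped_enable" or "scoped_disable" keep their
        own setting, since only the global default is changed here.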
    Coro::Multicore::scoped_enable
        This function instructs Coro::Multicore to handle all requests
        executed in the current coro thread, from the call to the end of the
        current scope.

        Calls to "scoped_enable" and "scoped_disable" don't nest very well
        at the moment, so don't nest them.

    Coro::Multicore::scoped_disable
        The opposite of "Coro::Multicore::scoped_enable": instructs
        Coro::Multicore to *not* handle the next multicore-enabled request.

THREAD SAFETY OF SUPPORTING XS MODULES
    Just because an XS module supports perlmulticore might not immediately
    make it reentrant. For example, while you can (try to) call "execute" on
    the same database handle for the patched "DBD::mysql" (see the registry
    <http://perlmulticore.schmorp.de/registry>), this will almost certainly
    not work, despite "DBD::mysql" and "libmysqlclient" being thread safe
    and reentrant - just not on the same database handle.

    Many modules have limitations such as these - some can only be called
    concurrently from a single thread as they use global variables, some can
    only be called concurrently on different *handles* (e.g. database
    connections for DBD modules, or digest objects for Digest modules), and
    some can be called at any time (such as the "md5" function in
    "Digest::MD5").

    Generally, you only have to be careful with the very few modules that
    use global variables or rely on C libraries that aren't thread-safe,
    which should be documented clearly in the module documentation.

    Most modules are either perfectly reentrant, or at least reentrant as
    long as you give every thread its own *handle* object.

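    The "one handle per thread" rule can be sketched as follows. This is an
    illustration only: "Digest::MD5" stands in for any module with
    per-handle state, and whether your installed copy is
    perlmulticore-enabled is an assumption (if it is not, the code still
    works, just without parallelism). The input strings are made up.

```perl
use strict;
use warnings;
use Coro;
use Coro::Multicore ();   # disabled by default, enabled per thread below
use Digest::MD5 ();

my @inputs = ("alpha", "beta", "gamma");   # hypothetical payloads

my @workers = map {
    my $data = $_;
    async {
        Coro::Multicore::scoped_enable;

        # Each thread creates and uses its own digest object, so no
        # handle is ever shared between concurrently running calls.
        my $ctx = Digest::MD5->new;
        $ctx->add($data);
        $ctx->hexdigest
    }
} @inputs;

# join waits for each thread and returns the value of its block.
my @digests = map { $_->join } @workers;
```

    Sharing a single "$ctx" between the threads would be the pattern to
    avoid, for the same reason as sharing one database handle.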
EXCEPTIONS AND THREAD CANCELLATION
    Coro allows you to cancel threads even when they execute within an XS
    function ("cancel" vs. "safe_cancel" methods). Similarly, Coro allows
    you to send exceptions (e.g. via the "throw" method) to threads
    executing inside an XS function.

    While doing this is questionable and dangerous with normal Coro threads
    already, they are both supported in this module, although with
    potentially unwanted effects. The following describes the current
    implementation and is subject to change. It is described primarily so
    you can understand what went wrong, if things go wrong.

    EXCEPTIONS
        When a thread that has currently released the perl interpreter (e.g.
        because it is executing a perlmulticore enabled XS function)
        receives an exception, it will at first continue normally.

        After acquiring the perl interpreter again, it will throw the
        exception it previously received. More specifically, when a thread
        calls "perlinterp_acquire ()" and has received an exception, then
        "perlinterp_acquire ()" will not return but instead "die".

        Most code that has been updated for perlmulticore support will not
        expect this, and might leave internal state corrupted to some
        extent.

    CANCELLATION
        Unsafe cancellation on a thread that has released the perl
        interpreter frees its resources, but lets the XS code continue at
        first. This should not lead to corruption on the perl level, as the
        code isn't allowed to touch perl data structures until it reacquires
        the interpreter.

        The call to "perlinterp_acquire ()" will then block indefinitely,
        leaking the (OS level) thread.

        Safe cancellation will simply fail in this case, so is still "safe"
        to call.

INTERACTION WITH OTHER SOFTWARE
    This module is very similar to other environments where perl
    interpreters are moved between threads, such as mod_perl2, and the same
    caveats apply.

    I want to spell out the most important ones:

    pthreads usage
        Any creation of pthreads makes it impossible to fork portably from a
        perl program, as forking from within a threaded program will leave
        the program in a state similar to a signal handler. While it might
        work on some platforms (as an extension), this might also result in
        silent data corruption. It also seems to work most of the time, so
        it's hard to test for this.

        I recommend using something like AnyEvent::Fork, which can create
        subprocesses safely (via Proc::FastSpawn).

        Similar issues exist for signal handlers, although this module works
        hard to keep safe perl signals safe.

    module support
        This module moves the same perl interpreter between different
        threads. Some modules might get confused by that (although this can
        usually be considered a bug). This is a rare case though.

    event loop reliance
        To be able to wake up programs waiting for results, this module
        relies on an active event loop (via AnyEvent). This is used to
        notify the perl interpreter when the asynchronous task is done.

        Since event loops typically fail to work properly after a fork, this
        means that some operations that were formerly working will now hang
        after fork.

        A workaround is to call "Coro::Multicore::enable 0" after a fork to
        disable the module.

        Future versions of this module might do this automatically.

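        The fork workaround could look roughly like this. It is a sketch
        under assumptions: the parent has not yet run any asynchronous
        request (so no OS threads exist when it forks), and the child work
        is only a placeholder. Using "POSIX::_exit" is a choice made for
        this example, not something the module requires.

```perl
use strict;
use warnings;
use POSIX ();
use Coro::Multicore;   # enabled by default in the parent

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the inherited event loop cannot be relied upon, so turn the
    # multicore functionality off and fall back to plain blocking calls.
    Coro::Multicore::enable(0);

    # ... child work goes here ...

    # Leave without running END blocks or destructors from the parent.
    POSIX::_exit(0);
}

# Parent: wait for the child and pick up its exit code.
waitpid $pid, 0;
my $child_status = $? >> 8;
```

        Where possible, AnyEvent::Fork as recommended above avoids forking
        from the threaded process altogether.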
BUGS & LIMITATIONS
    (OS-) threads are never released
        At the moment, threads that were created once will never be freed.
        They will be reused for asynchronous requests, though, so as long as
        you limit the maximum number of concurrent asynchronous tasks, this
        will also limit the maximum number of threads created.

        The idle threads are not necessarily using a lot of resources: on
        GNU/Linux + glibc, each thread takes about 8KiB of userspace memory
        + whatever the kernel needs (probably less than 8KiB).

        Future versions will likely lift this limitation.

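        One way to limit the number of concurrent asynchronous tasks is a
        counting semaphore around the multicore-enabled call. The sketch
        below uses Coro::Semaphore for that; the limit of 4, the job list
        and the use of "Digest::MD5::md5_hex" as a stand-in for a
        perlmulticore-enabled function are all assumptions made for the
        example.

```perl
use strict;
use warnings;
use Coro;
use Coro::Semaphore;
use Coro::Multicore ();   # disabled by default, enabled per worker below
use Digest::MD5 ();

my $max_parallel = 4;                               # hypothetical limit
my $slots        = Coro::Semaphore->new($max_parallel);
my @jobs         = map { "payload $_" } 1 .. 20;    # hypothetical work items

my @workers = map {
    my $job = $_;
    async {
        # Blocks until one of the slots is free; the slot is handed back
        # automatically when $guard goes out of scope.
        my $guard = $slots->guard;

        Coro::Multicore::scoped_enable;

        # At most $max_parallel threads are inside this call at a time.
        Digest::MD5::md5_hex($job)
    }
} @jobs;

my @results = map { $_->join } @workers;
```

        With this in place, no more than "$max_parallel" asynchronous
        requests are in flight at once, which in turn bounds the number of
        OS threads that get created.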
    The enable_times feature of Coro is messed up
        The enable_times feature uses the per-thread timer to measure
        per-thread execution time, but since Coro::Multicore runs threads on
        different pthreads it will get the wrong times. Real times are not
        affected.

    Fork support
        Due to the nature of threads, you are not allowed to use this module
        in a forked child normally, with one exception: If you don't create
        any threads in the parent, then it is safe to start using it in a
        forked child.

AUTHOR
     Marc Lehmann <schmorp@schmorp.de>
     http://software.schmorp.de/pkg/AnyEvent-XSThreadPool.html

    Additional thanks to Zsbán Ambrus, who gave considerable design input
    for this module and the perl multicore specification.
