/cvs/perlmulticore/perlmulticore.pod
Revision: 1.1
Committed: Thu Jul 2 22:39:56 2015 UTC (9 years, 3 months ago) by root
Branch: MAIN

=head1 NAME

The Perl Multicore Specification and Implementation

=head1 SYNOPSIS

   #include "perlmulticore.h"

   // in your XS function:

   perlinterp_release ();
   do_the_C_thing ();
   perlinterp_acquire ();

=head1 DESCRIPTION

This header file implements a simple mechanism for XS modules to allow
re-use of the perl interpreter for other threads while doing some lengthy
operation, such as cryptography, SQL queries, disk I/O and so on.

The design goals for this mechanism were to be simple to use, very
efficient when not needed, low code and data size overhead and broad
applicability.

The newest version of this document can be found at
L<http://perlmulticore.schmorp.de/>.

The newest version of the header file itself can be downloaded from
L<http://perlmulticore.schmorp.de/perlmulticore.h>.

=head1 HOW DO I USE THIS IN MY MODULES?

The usage is very simple - you include this header file in your XS
module. Then, before you do your lengthy operation, you release the perl
interpreter:

   perlinterp_release ();

And when you are done with your computation, you acquire it again:

   perlinterp_acquire ();

And that's it. This doesn't load any modules and consists of only a few
machine instructions when no module that takes advantage of it is loaded.

Here is a simple example, a C<flock> wrapper implemented in XS. Unlike
perl's built-in C<flock>, it allows other threads (for example, those
provided by L<Coro>) to execute, instead of blocking the whole perl
interpreter. For the sake of this example, it requires a file descriptor
instead of a handle.

   #include "perlmulticore.h" // this header file

   // and in the XS portion
   int flock (int fd, int operation)
           CODE:
           perlinterp_release ();
           RETVAL = flock (fd, operation);
           perlinterp_acquire ();
           OUTPUT:
           RETVAL

Another example would be to modify L<DBD::mysql> to allow other
threads to execute while executing SQL queries. One way to do this
is to find all C<mysql_st_internal_execute> and similar calls (such as
C<mysql_st_internal_execute41>), and adorn them with release/acquire
calls:

   {
     perlinterp_release ();
     imp_sth->row_num= mysql_st_internal_execute(sth, ...);
     perlinterp_acquire ();
   }

=head2 HOW ABOUT NOT-SO LONG WORK?

Sometimes you don't know how long your code will take - in a compression
library for example, compressing a few hundred kilobytes of data can take
a while, while 50 bytes will compress so fast that even attempting to do
something else could be more costly than just doing it.

This is a very hard problem to solve. The best you can do at the moment is
to release the perl interpreter only when you think the work to be done
justifies the expense.

As a rule of thumb, if you expect to need more than a few thousand cycles,
you should release the interpreter, else you shouldn't. When in doubt,
release.

For example, in a compression library, you might want to do this:

   if (bytes_to_be_compressed > 2000) perlinterp_release ();
   do_compress (...);
   if (bytes_to_be_compressed > 2000) perlinterp_acquire ();

Make sure the two if conditions are exactly the same and that their
outcome cannot change in between, so you always call acquire when you
release, and vice versa.

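One way to keep the two conditions from drifting apart is to evaluate the
condition once and store the result in a C variable. The following is a
minimal sketch, not code from F<perlmulticore.h>: the two C<perlinterp_*>
functions are stand-ins that only check the pairing, and C<do_compress>,
C<compress_maybe_released> and C<RELEASE_THRESHOLD> are hypothetical names
chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the real macros from perlmulticore.h, so the sketch
   compiles on its own; they only track and check the pairing. */
static int interp_released;
static int release_count;

static void
perlinterp_release (void)
{
  assert (!interp_released);
  interp_released = 1;
  ++release_count;
}

static void
perlinterp_acquire (void)
{
  assert (interp_released);
  interp_released = 0;
}

/* Hypothetical threshold, taken from the 2000-byte figure above. */
#define RELEASE_THRESHOLD 2000

/* Hypothetical worker standing in for a real compressor: a byte sum. */
static unsigned long
do_compress (const unsigned char *data, size_t len)
{
  unsigned long sum = 0;
  size_t i;

  for (i = 0; i < len; ++i)
    sum += data[i];

  return sum;
}

static unsigned long
compress_maybe_released (const unsigned char *data, size_t len)
{
  /* decide once, so release and acquire can never disagree */
  int release = len > RELEASE_THRESHOLD;
  unsigned long res;

  if (release) perlinterp_release ();
  res = do_compress (data, len);
  if (release) perlinterp_acquire ();

  return res;
}
```

In a real XS module you would include F<perlmulticore.h> instead of
defining the stand-ins; only the C<release> flag pattern is the point here.
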
When you don't have a handy indicator, you might still do something
useful. For example, if you do some file locking with C<fcntl> and you
expect the lock to be available immediately in most cases, you could try
with C<F_SETLK> (which doesn't wait), and only release/wait/acquire when
the lock couldn't be set:

   int res = fcntl (fd, F_SETLK, &flock);

   if (res)
     {
       // error, assume lock is held by another process and do it the slow way
       perlinterp_release ();
       res = fcntl (fd, F_SETLKW, &flock);
       perlinterp_acquire ();
     }

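A slightly more careful variant of the same idea might only take the slow
path when C<errno> says the lock is actually held by somebody else
(C<EAGAIN> or C<EACCES>), and report other errors directly. This is a
sketch under assumptions: C<lock_whole_file> is a hypothetical helper, the
C<perlinterp_*> macros are stand-ins that only count calls, and saving
C<errno> around C<perlinterp_acquire> assumes that acquiring the
interpreter may change it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Stand-ins for the macros from perlmulticore.h; they only count calls. */
static int release_calls, acquire_calls;
#define perlinterp_release() ((void)++release_calls)
#define perlinterp_acquire() ((void)++acquire_calls)

/* Hypothetical helper: take a write lock on the whole file.
   Returns 0 on success, or -1 with errno set. */
static int
lock_whole_file (int fd)
{
  struct flock fl;
  int res, saved_errno;

  assert (fd >= 0);

  memset (&fl, 0, sizeof fl);
  fl.l_type   = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start  = 0;
  fl.l_len    = 0; /* 0 means "up to the end of the file" */

  /* fast path: try without waiting, interpreter still held */
  res = fcntl (fd, F_SETLK, &fl);

  if (res == 0 || (errno != EAGAIN && errno != EACCES))
    return res; /* got the lock, or failed for an unrelated reason */

  /* slow path: somebody else holds the lock, so wait for it */
  perlinterp_release ();
  res = fcntl (fd, F_SETLKW, &fl);
  saved_errno = errno; /* assumption: acquire may change errno */
  perlinterp_acquire ();

  errno = saved_errno;
  return res;
}
```

When the lock is free, this never releases the interpreter at all, so the
common case costs one extra system call at most.
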
=head1 THE HARD AND FAST RULES

As with everything, there are a number of rules to follow.

=over 4

=item I<Never> touch any perl data structures after calling C<perlinterp_release>.

Possibly the most important rule of them all: anything perl is
completely off-limits after C<perlinterp_release>, until you call
C<perlinterp_acquire>, after which you can access perl stuff again.

That includes anything in the perl interpreter that you didn't prove to be
safe, and didn't prove to be safe in older and future versions of perl:
global variables, local perl scalars, even if you are sure nobody accesses
them and you only try to "read" their value, and so on.

If you need to access perl things, do it before releasing the
interpreter with C<perlinterp_release>, or after acquiring it again with
C<perlinterp_acquire>.

=item I<Always> call C<perlinterp_release> and C<perlinterp_acquire> in pairs.

For each C<perlinterp_release> call there must be a C<perlinterp_acquire>
call. They don't have to be in the same function, and you can have
multiple calls to them, as long as every C<perlinterp_release> call is
followed by exactly one C<perlinterp_acquire> call.

For example, this would be fine:

   perlinterp_release ();

   if (!function_that_fails_with_0_return_value ())
     {
       perlinterp_acquire ();
       croak ("error");
       // croak doesn't return
     }

   perlinterp_acquire ();
   // do other stuff

=item I<Never> nest calls to C<perlinterp_release> and C<perlinterp_acquire>.

That simply means that after calling C<perlinterp_release>, you must
call C<perlinterp_acquire> before calling C<perlinterp_release>
again. Likewise, after C<perlinterp_acquire>, you can call
C<perlinterp_release> but not another C<perlinterp_acquire>.

=item I<Always> call C<perlinterp_release> first.

Also simple: you I<must not> call C<perlinterp_acquire> without having
called C<perlinterp_release> before.

=item I<Never> underestimate threads.

While it's easy to add parallel execution ability to your XS module, that
doesn't mean it is safe. After you release the perl interpreter, it's
perfectly possible that it will call your XS function in another thread,
even while your original function still executes. In other words: your C
code must be thread-safe, and if you use any library, that library must be
thread-safe, too.

Always assume that the code between C<perlinterp_release> and
C<perlinterp_acquire> is executed in parallel on multiple CPUs at the same
time. If your code can't cope with that, you could consider using a mutex
to only allow one such execution, which is still better than blocking
everybody else from doing anything:

   static pthread_mutex_t my_mutex = PTHREAD_MUTEX_INITIALIZER;

   perlinterp_release ();
   pthread_mutex_lock (&my_mutex);
   do_your_non_thread_safe_thing ();
   pthread_mutex_unlock (&my_mutex);
   perlinterp_acquire ();

=item I<Don't> get confused by having to release first.

In many real world scenarios, you acquire a resource, do something, then
release it again. Don't let this confuse you: here, you already own
the resource (the perl interpreter), so you have to I<release> it first
and I<acquire> it again later, not the other way around.

=back

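The pairing rules above (always in pairs, never nested, release first) can
be checked mechanically while developing a module. The following is a
minimal sketch of such a check and is not part of F<perlmulticore.h>:
C<checked_release>, C<checked_acquire> and C<may_touch_perl> are
hypothetical names, the C<perlinterp_*> macros are stand-ins that only
count calls, and C<_Thread_local> assumes a C11 compiler.

```c
#include <assert.h>

/* Stand-ins for the real macros from perlmulticore.h. */
static int real_release_calls, real_acquire_calls;
#define perlinterp_release() ((void)++real_release_calls)
#define perlinterp_acquire() ((void)++real_acquire_calls)

/* 1 while the current thread has released the interpreter.
   _Thread_local needs C11; this keeps one flag per thread. */
static _Thread_local int interp_released;

/* hypothetical checked wrapper around perlinterp_release */
static void
checked_release (void)
{
  assert (!interp_released); /* rule: never nest release calls */
  interp_released = 1;
  perlinterp_release ();
}

/* hypothetical checked wrapper around perlinterp_acquire */
static void
checked_acquire (void)
{
  assert (interp_released);  /* rule: release first, never acquire twice */
  perlinterp_acquire ();
  interp_released = 0;
}

/* Returns 1 if this thread currently holds the interpreter, that is,
   if it is allowed to touch perl data structures. */
static int
may_touch_perl (void)
{
  return !interp_released;
}
```

A debug build could call the wrappers instead of the macros and put
C<assert (may_touch_perl ())> in front of code that uses perl data.
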
=head1 DESIGN PRINCIPLES

This section discusses how the design goals were reached (you be the
judge), how it is implemented, and what overheads this implies.

=over 4

=item Simple to Use

All you have to do is identify the place in your existing code where you
stop touching perl stuff, do your actual work, and start touching perl
stuff again.

Then slap C<perlinterp_release ()> and C<perlinterp_acquire ()> around the
actual work code.

You have to include F<perlmulticore.h> and distribute it with your XS
code, but all these things border on the trivial.

=item Very Efficient

The definition for C<perlinterp_release> and C<perlinterp_acquire> is very
short:

   #define perlinterp_release() perl_multicore_api->pmapi_release ()
   #define perlinterp_acquire() perl_multicore_api->pmapi_acquire ()

Both are macros that read a pointer from memory (C<perl_multicore_api>),
dereference a function pointer stored at that place, and call the
function, which takes no arguments and returns nothing.

The first call to C<perlinterp_release> will check for the presence
of any supporting module, and if none is loaded, will create a dummy
implementation where both C<pmapi_release> and C<pmapi_acquire> execute
this function:

   static void perl_multicore_nop (void) { }

So in the case of no magical module being loaded, all calls except the
first are two memory accesses and a predictable function call of an empty
function.

Of course, the overhead is much higher when these functions actually
implement anything useful, but you always get what you pay for.

With L<Coro::Multicore>, every release/acquire involves two pthread
switches, two coro thread switches, a bunch of syscalls, and sometimes
interacting with the event loop.

A dedicated thread pool such as the one L<IO::AIO> uses could reduce
these overheads, and would also reduce the dependencies (L<AnyEvent> is a
smaller and more portable dependency than L<Coro>), but it would require a
lot more work on the side of the module author wanting to support it than
this solution.

=item Low Code and Data Size Overhead

On a 64-bit system, F<perlmulticore.h> uses exactly C<8> octets (one
pointer) of your data segment, to store the C<perl_multicore_api>
pointer. In addition it creates a C<16> octet perl string to store the
function pointers in, and stores it in a hash provided by perl for this
purpose.

This is pretty much the equivalent of executing this code:

   $existing_hash{perl_multicore_api} = "1234567812345678";

And that's it, which is, I think, indeed very little.

As for code size, on my amd64 system, every call to C<perlinterp_release>
or C<perlinterp_acquire> results in a variation of the following 9-10
octet sequence:

   150> mov 0x200f23(%rip),%rax # <perl_multicore_api>
   157> callq *0x8(%rax)

The biggest part is the initialisation code, which consists of 11 lines of
typical XS code. On my system, all the code in F<perlmulticore.h> compiles
to less than 160 octets of read-only data.

=item Broad Applicability

While there are alternative ways to achieve the goal of parallel execution
with threads that might be more efficient, this mechanism was chosen
because it is very simple to retrofit existing modules with it.

=back

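The dispatch mechanism described under "Very Efficient" can be modelled in
a few lines of plain C. This is a simplified sketch, not the contents of
F<perlmulticore.h>: the real header looks the API up in a hash provided by
perl, which is replaced here by the hypothetical C<registered_api>
variable, and C<nop_api>, C<init_api> and C<init_calls> are names invented
for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct perl_multicore_api
{
  void (*pmapi_release)(void);
  void (*pmapi_acquire)(void);
};

static void perl_multicore_nop (void) { }

/* the dummy implementation used when no supporting module is present */
static const struct perl_multicore_api nop_api
  = { perl_multicore_nop, perl_multicore_nop };

/* Hypothetical stand-in for the lookup in perl's hash: a supporting
   module would have stored a pointer to its own table here. */
static const struct perl_multicore_api *registered_api;

static void perl_multicore_init (void);

/* Until the first release, "release" points at the initialiser. There is
   no acquire entry, as acquire must never be called before release. */
static const struct perl_multicore_api init_api
  = { perl_multicore_init, NULL };

static const struct perl_multicore_api *perl_multicore_api = &init_api;
static int init_calls; /* only for illustration */

static void
perl_multicore_init (void)
{
  assert (perl_multicore_api == &init_api); /* runs only once */
  ++init_calls;

  perl_multicore_api = registered_api ? registered_api : &nop_api;
  perl_multicore_api->pmapi_release (); /* finish the release that got us here */
}

#define perlinterp_release() perl_multicore_api->pmapi_release ()
#define perlinterp_acquire() perl_multicore_api->pmapi_acquire ()
```

After the first C<perlinterp_release ()>, every further call in this model
is one pointer load plus one indirect call to an empty function, which
matches the cost described above.
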
=head1 DISABLING PERL MULTICORE AT COMPILE TIME

You can disable the complete perl multicore API by defining the
symbol C<PERL_MULTICORE_DISABLE> to C<1> (e.g. by specifying
F<-DPERL_MULTICORE_DISABLE> as compiler argument).

This will leave no traces of the API in the compiled code; suitable
"empty" C<perlinterp_release> and C<perlinterp_acquire> definitions will
be provided.

This could, for example, be added to perl's C<CPPFLAGS> when configuring
perl on platforms that do not support threading at all.

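To illustrate what "empty" definitions could look like, here is a sketch
of the usual C idiom for a macro that is a valid statement but generates no
code. It is an assumption about the shape of such definitions, not a copy
of what F<perlmulticore.h> provides; C<api_calls> and C<add_released> are
hypothetical, and the symbol is defined in the source only so that the
example is self-contained.

```c
#include <assert.h>

/* Defined here only for the demonstration; normally this comes from
   the compiler command line (-DPERL_MULTICORE_DISABLE). */
#define PERL_MULTICORE_DISABLE 1

static int api_calls; /* counts calls that reach the stand-in API */

#if PERL_MULTICORE_DISABLE
  /* empty definitions: valid as statements, but they generate no code */
  #define perlinterp_release() do { } while (0)
  #define perlinterp_acquire() do { } while (0)
#else
  /* stand-ins for the normal definitions, which call through the API */
  #define perlinterp_release() ((void)++api_calls)
  #define perlinterp_acquire() ((void)++api_calls)
#endif

/* hypothetical worker: the same source compiles either way */
static int
add_released (int a, int b)
{
  int sum;

  perlinterp_release ();
  sum = a + b;
  perlinterp_acquire ();

  return sum;
}
```

The calling code does not change; only the macro definitions decide
whether anything happens at the release and acquire points.
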
=head1 AUTHOR

   Marc A. Lehmann <perlmulticore@schmorp.de>
   http://perlmulticore.schmorp.de/

=head1 LICENSE

The F<perlmulticore.h> header file is put into the public
domain. Where this is legally not possible, or at your
option, it can be licensed under the Creative Commons CC0
license: L<https://creativecommons.org/publicdomain/zero/1.0/>.