… | |
… | |
32 | |
32 | |
33 | The design goals for this mechanism were to be simple to use, very |
33 | The design goals for this mechanism were to be simple to use, very |
34 | efficient when not needed, low code and data size overhead and broad |
34 | efficient when not needed, low code and data size overhead and broad |
35 | applicability. |
35 | applicability. |
36 | |
36 | |
|
|
37 | The newest version of this document can be found at |
|
|
38 | L<http://pod.tst.eu/http://cvs.schmorp.de/Coro-Multicore/perlmulticore.h>. |
|
|
39 | |
|
|
40 | The newest version of the header file itself, which |
|
|
41 | includes this documentation, can be downloaded from |
|
|
42 | L<http://cvs.schmorp.de/Coro-Multicore/perlmulticore.h>. |
37 | |
43 | |
38 | =head1 HOW DO I USE THIS IN MY MODULES? |
44 | =head1 HOW DO I USE THIS IN MY MODULES? |
39 | |
45 | |
40 | The suage is very simple - you include this header file in your XS module. Then, before you |
46 | The usage is very simple - you include this header file in your XS module. Then, before you |
41 | do your lengthy operation, you release the perl interpreter: |
47 | do your lengthy operation, you release the perl interpreter: |
42 | |
48 | |
43 | perlinterp_release (); |
49 | perlinterp_release (); |
44 | |
50 | |
45 | And when you are done with your computation, you acquire it again: |
51 | And when you are done with your computation, you acquire it again: |
46 | |
52 | |
47 | perlinterp_acquire (); |
53 | perlinterp_acquire (); |
48 | |
54 | |
49 | And that's it. This doesn't load any modules and consists of only a few |
55 | And that's it. This doesn't load any modules and consists of only a few |
50 | machine instructions when no module tot ake advantage of it is loaded. |
56 | machine instructions when no module to take advantage of it is loaded. |
51 | |
57 | |
52 | Here is a simple example, an C<flock> wrapper implemented in XS. Unlike |
58 | Here is a simple example, an C<flock> wrapper implemented in XS. Unlike |
53 | perl's built-in C<flock>, it allows other threads (for example, those |
59 | perl's built-in C<flock>, it allows other threads (for example, those |
54 | provided by L<Coro>) to execute, instead of blocking the whole perl |
60 | provided by L<Coro>) to execute, instead of blocking the whole perl |
55 | interpreter. For the sake of this example, it requires a file descriptor |
61 | interpreter. For the sake of this example, it requires a file descriptor |
… | |
… | |
80 | |
86 | |
81 | =head2 HOW ABOUT NOT-SO LONG WORK? |
87 | =head2 HOW ABOUT NOT-SO LONG WORK? |
82 | |
88 | |
83 | Sometimes you don't know how long your code will take - in a compression |
89 | Sometimes you don't know how long your code will take - in a compression |
84 | library for example, compressing a few hundred Kilobyte of data can take |
90 | library for example, compressing a few hundred Kilobyte of data can take |
85 | a while, while 50 Bytes will comptess so fast that even attempting to do |
91 | a while, while 50 Bytes will compress so fast that even attempting to do |
86 | something else could be more costly than just doing it. |
92 | something else could be more costly than just doing it. |
87 | |
93 | |
88 | This is a very hard problem to solve. The best you can do at the moment is |
94 | This is a very hard problem to solve. The best you can do at the moment is |
89 | to release the perl interpreter only when you think the work to be done |
95 | to release the perl interpreter only when you think the work to be done |
90 | justifies the expense. |
96 | justifies the expense. |
… | |
… | |
102 | Make sure the if conditions are exactly the same and don't change, so you |
108 | Make sure the if conditions are exactly the same and don't change, so you |
103 | always call acquire when you release, and vice versa. |
109 | always call acquire when you release, and vice versa. |
104 | |
110 | |
105 | When you don't have a handy indicator, you might still do something |
111 | When you don't have a handy indicator, you might still do something |
106 | useful. For example, if you do some file locking with C<fcntl> and you |
112 | useful. For example, if you do some file locking with C<fcntl> and you |
107 | expect the lock to be available immediatelly in most cases, you could try |
113 | expect the lock to be available immediately in most cases, you could try |
108 | with C<F_SETLK> (which doesn't wait), and only release/wait/acquire when |
114 | with C<F_SETLK> (which doesn't wait), and only release/wait/acquire when |
109 | the lock couldn't be set: |
115 | the lock couldn't be set: |
110 | |
116 | |
111 | int res = fcntl (fd, F_SETLK, &flock); |
117 | int res = fcntl (fd, F_SETLK, &flock); |
112 | |
118 | |
… | |
… | |
152 | |
158 | |
153 | if (!function_that_fails_with_0_return_value ()) |
159 | if (!function_that_fails_with_0_return_value ()) |
154 | { |
160 | { |
155 | perlinterp_acquire (); |
161 | perlinterp_acquire (); |
156 | croak ("error"); |
162 | croak ("error"); |
|
|
163 | // croak doesn't return |
157 | } |
164 | } |
158 | |
165 | |
159 | perlinterp_acquire (); |
166 | perlinterp_acquire (); |
160 | // do other stuff |
167 | // do other stuff |
161 | |
168 | |
… | |
… | |
181 | thread-safe, too. |
188 | thread-safe, too. |
182 | |
189 | |
183 | Always assume that the code between C<perlinterp_release> and |
190 | Always assume that the code between C<perlinterp_release> and |
184 | C<perlinterp_acquire> is executed in parallel on multiple CPUs at the same |
191 | C<perlinterp_acquire> is executed in parallel on multiple CPUs at the same |
185 | time. If your code can't cope with that, you could consider using a mutex |
192 | time. If your code can't cope with that, you could consider using a mutex |
186 | to only allow one such execution, which is sitll better than blocking |
193 | to only allow one such execution, which is still better than blocking |
187 | everybody else from doing anything: |
194 | everybody else from doing anything: |
188 | |
195 | |
189 | static pthread_mutex_t my_mutex = PTHREAD_MUTEX_INITIALIZER; |
196 | static pthread_mutex_t my_mutex = PTHREAD_MUTEX_INITIALIZER; |
190 | |
197 | |
191 | perlinterp_release (); |
198 | perlinterp_release (); |
… | |
… | |
216 | =over 4 |
223 | =over 4 |
217 | |
224 | |
218 | =item Simple to Use |
225 | =item Simple to Use |
219 | |
226 | |
220 | All you have to do is identify the place in your existing code where you |
227 | All you have to do is identify the place in your existing code where you |
221 | stop touching perl stuff, do your actual work, and strat touching perl |
228 | stop touching perl stuff, do your actual work, and start touching perl |
222 | stuff again. |
229 | stuff again. |
223 | |
230 | |
224 | Then slap C<perlinterp_release ()> and C<perlinterp_acquire ()> around the |
231 | Then slap C<perlinterp_release ()> and C<perlinterp_acquire ()> around the |
225 | actual work code. |
232 | actual work code. |
226 | |
233 | |
… | |
… | |
271 | or C<perlinterp_acquire> results in a variation of the following 9-10 |
278 | or C<perlinterp_acquire> results in a variation of the following 9-10 |
272 | octet sequence: |
279 | octet sequence: |
273 | |
280 | |
274 | 150> mov 0x200f23(%rip),%rax # <perl_multicore_api> |
281 | 150> mov 0x200f23(%rip),%rax # <perl_multicore_api> |
275 | 157> callq *0x8(%rax) |
282 | 157> callq *0x8(%rax) |
276 | |
|
|
277 | amd64 code sure is bloated. |
|
|
278 | |
283 | |
279 | The biggest part is the initialisation code, which consists of 11 lines of |
284 | The biggest part is the initialisation code, which consists of 11 lines of |
280 | typical XS code. On my system, all the code in F<perlmulticore.h> compiles |
285 | typical XS code. On my system, all the code in F<perlmulticore.h> compiles |
281 | to less than 160 octets of read-only data. |
286 | to less than 160 octets of read-only data. |
282 | |
287 | |