
Comparing perlmulticore/perlmulticore.pod (file contents):
Revision 1.2 by root, Thu Jul 2 22:42:24 2015 UTC vs.
Revision 1.3 by root, Mon Jul 6 04:00:35 2015 UTC

25The newest version of this document can be found at 25The newest version of this document can be found at
26L<http://perlmulticore.schmorp.de/>. 26L<http://perlmulticore.schmorp.de/>.
27 27
28The newest version of the header file that implements this specification 28The newest version of the header file that implements this specification
29can be downloaded from L<http://perlmulticore.schmorp.de/perlmulticore.h>. 29can be downloaded from L<http://perlmulticore.schmorp.de/perlmulticore.h>.
30
31=head2 XS? HOW DO I USE THIS FROM PERL?
32
33This document is only about the XS-level mechanism that defines generic
34callbacks - to make use of this, you need a module that provides an
35implementation for these callbacks, for example
36L<Coro::Multicore|http://pod.tst.eu/http://cvs.schmorp.de/Coro-Multicore/Multicore.pm>.
37
38=head2 WHICH MODULES SUPPORT IT?
39
40You can check L<the perl multicore registry|http://perlmulticore.schmorp.de/registry>
41for a list of modules that support this specification.
30 42
31=head1 HOW DO I USE THIS IN MY MODULES? 43=head1 HOW DO I USE THIS IN MY MODULES?
32 44
33The usage is very simple - you include this header file in your XS module. Then, before you 45The usage is very simple - you include this header file in your XS module. Then, before you
34do your lengthy operation, you release the perl interpreter: 46do your lengthy operation, you release the perl interpreter:
57 RETVAL = flock (fd, operation); 69 RETVAL = flock (fd, operation);
58 perlinterp_acquire (); 70 perlinterp_acquire ();
59 OUTPUT: 71 OUTPUT:
60 RETVAL 72 RETVAL
61 73
62Another example would be to modify L<DBD::mysql> to allow other 74You can find more examples in the L<Case Studies> appendix.
63threads to execute while executing SQL queries. One way to do this
64is find all C<mysql_st_internal_execute> and similar calls (such as
65C<mysql_st_internal_execute41>), and adorn them with release/acquire
66calls:
67
68 {
69 perlinterp_release ();
70 imp_sth->row_num= mysql_st_internal_execute(sth, ...);
71 perlinterp_acquire ();
72 }
73 75
74=head2 HOW ABOUT NOT-SO LONG WORK? 76=head2 HOW ABOUT NOT-SO LONG WORK?
75 77
76Sometimes you don't know how long your code will take - in a compression 78Sometimes you don't know how long your code will take - in a compression
77library for example, compressing a few hundred kilobytes of data can take 79library for example, compressing a few hundred kilobytes of data can take
302 304
303This could be added to perl's C<CPPFLAGS> when configuring perl on 305This could be added to perl's C<CPPFLAGS> when configuring perl on
304platforms that do not support threading at all for example. 306platforms that do not support threading at all for example.
305 307
306 308
309=head1 Appendix: Case StudiesX<Case Studies>
310
311This appendix shows some case studies on how to patch existing
312modules. Unless they are available on CPAN, the patched modules (including
313diffs) can be found at the perl multicore repository (see L<the
314perlmulticore registry|http://perlmulticore.schmorp.de/registry>).
315
316In addition to the patches shown, the
317L<perlmulticore.h|http://perlmulticore.schmorp.de/perlmulticore.h> header
318must be added to the module and included in any XS or C file that uses it.
319
320
321=head2 Case Study: C<Digest::MD5>
322
323The C<Digest::MD5> module presents some unique challenges because it mixes
324Perl-I/O and CPU-based processing.
325
326So first let's identify the easy cases - set up (in C<new>) and
327calculating the final digest are very fast operations and are unlikely
328to profit from running in a separate thread. That leaves the C<add>
329method and the C<md5> (C<md5_hex>, C<md5_base64>) functions.
330
331They are both very easy to update - the C<MD5Update> call
332doesn't access any perl data structures, so you can slap
333C<perlinterp_release>/C<perlinterp_acquire> around it:
334
335 if (len > 8000) perlinterp_release ();
336 MD5Update(context, data, len);
337 if (len > 8000) perlinterp_acquire ();
338
339This works for both C<add> and C<md5> XS functions. The C<8000> is
340somewhat arbitrary.
341
342This leaves C<addfile>, which would normally be the ideal candidate,
343because it is often used on large files and needs to wait both for I/O and
344the CPU. Unfortunately, it is implemented like this (only the inner loop
345is shown):
346
347 unsigned char buffer[4096];
348
349 while ( (n = PerlIO_read(fh, buffer, sizeof(buffer))) > 0) {
350 MD5Update(context, buffer, n);
351 }
352
353That is, it uses a 4KB buffer per C<MD5Update>. Putting
354C<perlinterp_release>/C<perlinterp_acquire> calls around it would be way
355too inefficient. Ideally, you would want to put them around the whole
356loop.
357
358Unfortunately, C<Digest::MD5> uses C<PerlIO> for the actual I/O, and
359C<PerlIO> is not thread-safe. We can't even use a mutex, as we would have
360to protect against all other C<PerlIO> calls.
361
362As a compromise, we can use the C<USE_HEAP_INSTEAD_OF_STACK> option that
363C<Digest::MD5> provides, which puts the buffer onto the heap, and use a
364far larger buffer:
365
366 #define USE_HEAP_INSTEAD_OF_STACK
367
368 New(0, buffer, 1024 * 1024, unsigned char);
369
370 /* buffer is a heap pointer here, so sizeof(buffer) would only be the pointer size */
370 while ( (n = PerlIO_read(fh, buffer, 1024 * 1024)) > 0) {
371 if (n > 8000) perlinterp_release ();
372 MD5Update(context, buffer, n);
373 if (n > 8000) perlinterp_acquire ();
374 }
375
376This will unfortunately still block on I/O, and allocate a large block of
377memory, but it is better than nothing.
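
The effect of the buffer size on the C<n E<gt> 8000> test can be checked with a small stand-alone sketch. Everything in it is an assumption made for illustration: C<stub_release>, C<stub_acquire> and C<stub_update> are counting stand-ins for C<perlinterp_release>, C<perlinterp_acquire> and C<MD5Update>, and an in-memory array replaces the C<PerlIO_read> loop. It is not code from C<Digest::MD5>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counting stand-ins (assumptions): the real perlinterp_release and
   perlinterp_acquire come from perlmulticore.h. */
static int release_calls, acquire_calls;
static void stub_release (void) { ++release_calls; }
static void stub_acquire (void) { ++acquire_calls; }

/* Hypothetical placeholder for MD5Update: it only sums the bytes. */
static unsigned long checksum;
static void stub_update (const unsigned char *data, size_t len)
{
  size_t i;
  for (i = 0; i < len; ++i)
    checksum += data[i];
}

#define RELEASE_THRESHOLD 8000

/* Feed "total" bytes to stub_update in chunks of at most "bufsize" bytes.
   Returns the number of release/acquire pairs, or -1 if allocation fails. */
static int count_release_pairs (size_t total, size_t bufsize)
{
  unsigned char *src = calloc (total ? total : 1, 1);
  size_t off = 0;

  if (!src)
    return -1;

  release_calls = acquire_calls = 0;

  while (off < total)
    {
      size_t n = total - off < bufsize ? total - off : bufsize;

      if (n > RELEASE_THRESHOLD) stub_release ();
      stub_update (src + off, n);
      if (n > RELEASE_THRESHOLD) stub_acquire ();

      off += n;
    }

  free (src);
  assert (release_calls == acquire_calls); /* every release is paired */
  return release_calls;
}
```

Under these assumptions, a 3MB input read through a 4096-byte buffer never crosses the threshold, so the interpreter would never be released, while a 1MB buffer gives three release/acquire pairs.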
378
379
380=head2 Case Study: C<DBD::mysql>
381
382Another example would be to modify C<DBD::mysql> to allow other
383threads to execute while executing SQL queries.
384
385The code that needs to be patched is not in an F<.xs> file, but in
386the F<dbdimp.c> file, which is included in an XS file.
387
388While there are many calls, the most important ones are the statement
389execute calls. There are only two in F<dbdimp.c>, one call in
390C<mysql_st_internal_execute41>, and one in C<dbd_st_execute>, both calling
391the undocumented internal C<mysql_st_internal_execute> function.
392
393The difference is that the former is used with mysql 4.1+ and prepared
394statements.
395
396The call in C<dbd_st_execute> is easy, as it does all the important work
397and doesn't access any perl data structures (I checked C<DBIc_NUM_PARAMS>
398manually to make sure):
399
400 perlinterp_release ();
401 imp_sth->row_num= mysql_st_internal_execute(
402 sth,
403 *statement,
404 NULL,
405 DBIc_NUM_PARAMS(imp_sth),
406 imp_sth->params,
407 &imp_sth->result,
408 imp_dbh->pmysql,
409 imp_sth->use_mysql_use_result
410 );
411 perlinterp_acquire ();
412
413Despite the name, C<mysql_st_internal_execute41> isn't actually from
414F<libmysqlclient>, but a long function in F<dbdimp.c>. Here is an abridged version, with
415C<perlinterp_release>/C<perlinterp_acquire> calls:
416
417 int i;
418 enum enum_field_types enum_type;
419 dTHX;
420 int execute_retval;
421 my_ulonglong rows=0;
422 D_imp_xxh(sth);
423
424 if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
425 PerlIO_printf(DBIc_LOGPIO(imp_xxh),
426 "\t-> mysql_st_internal_execute41\n");
427
428 perlinterp_release ();
429
430 if (num_params > 0 && !(*has_been_bound))
431 {
432 if (mysql_stmt_bind_param(stmt,bind))
433 goto error;
434 }
435
436 if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
437 {
438 perlinterp_acquire (); /* PerlIO needs the interpreter, so re-acquire it */
439 PerlIO_printf(DBIc_LOGPIO(imp_xxh),
440 "\t\tmysql_st_internal_execute41 calling mysql_execute with %d num_params\n",
441 num_params);
442 perlinterp_release (); /* logging done, release it again */
443 }
444
445
446 execute_retval= mysql_stmt_execute(stmt);
447
448 if (execute_retval)
449 goto error;
450
451 /*
452 This statement does not return a result set (INSERT, UPDATE...)
453 */
454 if (!(*result= mysql_stmt_result_metadata(stmt)))
455 {
456 if (mysql_stmt_errno(stmt))
457 goto error;
458
459 rows= mysql_stmt_affected_rows(stmt);
460 }
461 /*
462 This statement returns a result set (SELECT...)
463 */
464 else
465 {
466 for (i = mysql_stmt_field_count(stmt) - 1; i >=0; --i) {
467 enum_type = mysql_to_perl_type(stmt->fields[i].type);
468 if (enum_type != MYSQL_TYPE_DOUBLE && enum_type != MYSQL_TYPE_LONG)
469 {
470 /* mysql_stmt_store_result to update MYSQL_FIELD->max_length */
471 my_bool on = 1;
472 mysql_stmt_attr_set(stmt, STMT_ATTR_UPDATE_MAX_LENGTH, &on);
473 break;
474 }
475 }
476 /* Get the total rows affected and return */
477 if (mysql_stmt_store_result(stmt))
478 goto error;
479 else
480 rows= mysql_stmt_num_rows(stmt);
481 }
482 perlinterp_acquire ();
483 if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
484 PerlIO_printf(DBIc_LOGPIO(imp_xxh),
485 "\t<- mysql_internal_execute_41 returning %d rows\n",
486 (int) rows);
487 return(rows);
488
489 error:
490 if (*result)
491 {
492 mysql_free_result(*result);
493 *result= 0;
494 }
495 perlinterp_acquire ();
496 if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
497 PerlIO_printf(DBIc_LOGPIO(imp_xxh),
498 " errno %d err message %s\n",
499 mysql_stmt_errno(stmt),
500 mysql_stmt_error(stmt));
501
502So C<perlinterp_release> is called after some logging, but before the
503C<mysql_free_result> call.
504
505To make things more interesting, the function has multiple calls to
506C<PerlIO> to log things, none of which are thread-safe, so they need to be
507surrounded with C<perlinterp_acquire> and C<perlinterp_release> calls
508to temporarily re-acquire the interpreter. This is slow, but logging is
509normally off:
510
511 if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
512 {
513 perlinterp_acquire (); /* PerlIO needs the interpreter, so re-acquire it */
514 PerlIO_printf(DBIc_LOGPIO(imp_xxh),
515 "\t\tmysql_st_internal_execute41 calling mysql_execute with %d num_params\n",
516 num_params);
517 perlinterp_release (); /* logging done, release it again */
518 }
519
520The function has a normal exit and a separate error exit, each of which
521needs its own C<perlinterp_acquire> call. First the normal function exit:
522
523 perlinterp_acquire ();
524 if (DBIc_TRACE_LEVEL(imp_xxh) >= 2)
525 PerlIO_printf(DBIc_LOGPIO(imp_xxh),
526 "\t<- mysql_internal_execute_41 returning %d rows\n",
527 (int) rows);
528 return(rows);
529
530And this is the error exit:
531
532 error:
533 if (*result)
534 {
535 mysql_free_result(*result);
536 *result= 0;
537 }
538 perlinterp_acquire ();
539
540This is enough to run DBI's C<execute> calls in separate threads.
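
The bookkeeping described above (release once, re-acquire around logging, acquire again on every exit path) can be checked in isolation with a stand-alone sketch. All names in it are hypothetical: C<stub_release> and C<stub_acquire> stand in for C<perlinterp_release> and C<perlinterp_acquire>, C<stub_log> stands in for the C<PerlIO_printf> calls, and C<execute_sketch> only mirrors the control flow of C<mysql_st_internal_execute41>. It is not taken from F<dbdimp.c>.

```c
#include <assert.h>

/* 1 while this thread "owns" the interpreter. The stand-ins (assumptions)
   assert that release and acquire strictly alternate. */
static int interp_held = 1;
static void stub_release (void) { assert (interp_held);  interp_held = 0; }
static void stub_acquire (void) { assert (!interp_held); interp_held = 1; }

/* Like PerlIO_printf, logging is only allowed while the interpreter is held. */
static int log_lines;
static void stub_log (const char *msg)
{
  (void)msg;
  assert (interp_held);
  ++log_lines;
}

/* Hypothetical function with the same shape as the patched code:
   fail  != 0 takes the error exit,
   trace != 0 enables the log calls. */
static int execute_sketch (int fail, int trace)
{
  int rows = 0;

  if (trace)
    stub_log ("-> enter");

  stub_release ();              /* the long-running part starts here */

  if (trace)
    {
      stub_acquire ();          /* temporarily re-acquire for logging */
      stub_log ("calling execute");
      stub_release ();
    }

  if (fail)
    goto error;

  rows = 42;                    /* placeholder for the real work */

  stub_acquire ();              /* normal exit */
  if (trace)
    stub_log ("<- returning");
  return rows;

error:
  stub_acquire ();              /* error exit needs its own acquire */
  if (trace)
    stub_log ("error");
  return -1;
}
```

Calling C<execute_sketch> with each combination of C<fail> and C<trace> should leave C<interp_held> at 1 on return. If an exit path forgets its acquire, or a log call runs while the interpreter is released, one of the assertions fires.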
541
542=head3 Interlude: the various C<DBD::mysql> async mechanisms
543
544Here is a short discussion of the four principal ways to run
545C<DBD::mysql> SQL queries asynchronously.
546
547=over 4
548
549=item in a separate process
550
551Both C<AnyEvent::DBI> and C<DBD::Gofer> (via
552C<DBD::Gofer::Transport::corostream>) can run C<DBI> calls in a separate
553process, and this is not limited to mysql. The price is more
554complex management, some limitations in what can be done, and an extra
555serialisation/deserialisation step for all data.
556
557=item C<DBD::mysql>'s async support
558
559This lets you execute the SQL query while waiting for the results
560via an event loop or similar mechanism. This is reasonably fast and
561very compatible, but the disadvantages are that C<DBD::mysql> requires
562undocumented internal functions to do this, and more importantly, this
563only covers the actual execution phase, not the data transfer phase:
564for statements with large results, the program blocks until all of it is
565transferred, which can include large amounts of disk I/O.
566
567=item C<Coro::Mysql>
568
569This module actually works quite similarly to perl multicore, but uses
570Coro threads exclusively. It shares the advantages of C<DBD::mysql>'s
571async mode, but not, at least in theory, its disadvantages. In practise,
572the mechanism it uses isn't undocumented, but distributions often don't
573come with the correct header file needed to use it, and oracle's mysql
574has broken this mechanism multiple times (mariadb supports it), so it's
575actually less reliably available than C<DBD::mysql>'s async mode or perl
576multicore.
577
578It also requires C<Coro>.
579
580=item perl multicore
581
582This method has all the advantages of C<Coro::Mysql> without most of its
583disadvantages, except that it incurs higher overhead due to the extra
584thread switching.
585
586=back
587
588Pick your poison.
589
590
307=head1 AUTHOR 591=head1 AUTHOR
308 592
309 Marc A. Lehmann <perlmulticore@schmorp.de> 593 Marc A. Lehmann <perlmulticore@schmorp.de>
310 http://perlmulticore.schmorp.de/ 594 http://perlmulticore.schmorp.de/
311 595
