/cvs/BDB/BDB.pm

Comparing BDB/BDB.pm (file contents):
Revision 1.14 by root, Thu Sep 13 12:29:49 2007 UTC vs.
Revision 1.37 by root, Mon Jul 7 14:28:53 2008 UTC

3BDB - Asynchronous Berkeley DB access 3BDB - Asynchronous Berkeley DB access
4 4
5=head1 SYNOPSIS 5=head1 SYNOPSIS
6 6
7 use BDB; 7 use BDB;
8
9 my $env = db_env_create;
10
11 mkdir "bdtest", 0700;
12 db_env_open
13 $env,
14 "bdtest",
15 BDB::INIT_LOCK | BDB::INIT_LOG | BDB::INIT_MPOOL
16 | BDB::INIT_TXN | BDB::RECOVER | BDB::USE_ENVIRON | BDB::CREATE,
17 0600;
18
19 $env->set_flags (BDB::AUTO_COMMIT | BDB::TXN_NOSYNC, 1);
20
21 my $db = db_create $env;
22 db_open $db, undef, "table", undef, BDB::BTREE, BDB::AUTO_COMMIT | BDB::CREATE
23 | BDB::READ_UNCOMMITTED, 0600;
24 db_put $db, undef, "key", "data", 0, sub {
25 db_del $db, undef, "key";
26 };
27 db_sync $db;
28
29 # when you also use Coro, management is easy:
30 use Coro::BDB;
31
32 # automatic event loop integration with AnyEvent:
33 use AnyEvent::BDB;
34
35 # automatic result processing with EV:
36 my $WATCHER = EV::io BDB::poll_fileno, EV::READ, \&BDB::poll_cb;
37
38 # with Glib:
39 add_watch Glib::IO BDB::poll_fileno,
40 in => sub { BDB::poll_cb; 1 };
41
42 # or simply flush manually
43 BDB::flush;
44
8 45
9=head1 DESCRIPTION 46=head1 DESCRIPTION
10 47
11See the BerkeleyDB documentation (L<http://www.oracle.com/technology/documentation/berkeley-db/db/index.html>). 48See the BerkeleyDB documentation (L<http://www.oracle.com/technology/documentation/berkeley-db/db/index.html>).
12The BDB API is very similar to the C API (the translation has been very faithful). 49The BDB API is very similar to the C API (the translation has been very faithful).
72use strict 'vars'; 109use strict 'vars';
73 110
74use base 'Exporter'; 111use base 'Exporter';
75 112
76BEGIN { 113BEGIN {
77 our $VERSION = '1.0'; 114 our $VERSION = '1.5';
78 115
79 our @BDB_REQ = qw( 116 our @BDB_REQ = qw(
80 db_env_open db_env_close db_env_txn_checkpoint db_env_lock_detect 117 db_env_open db_env_close db_env_txn_checkpoint db_env_lock_detect
81 db_env_memp_sync db_env_memp_trickle 118 db_env_memp_sync db_env_memp_trickle
82 db_open db_close db_compact db_sync db_put db_get db_pget db_del db_key_range 119 db_open db_close db_compact db_sync db_upgrade
120 db_put db_get db_pget db_del db_key_range
83 db_txn_commit db_txn_abort 121 db_txn_commit db_txn_abort db_txn_finish
84 db_c_close db_c_count db_c_put db_c_get db_c_pget db_c_del 122 db_c_close db_c_count db_c_put db_c_get db_c_pget db_c_del
85 db_sequence_open db_sequence_close 123 db_sequence_open db_sequence_close
86 db_sequence_get db_sequence_remove 124 db_sequence_get db_sequence_remove
87 ); 125 );
88 our @EXPORT = (@BDB_REQ, qw(dbreq_pri dbreq_nice db_env_create db_create)); 126 our @EXPORT = (@BDB_REQ, qw(dbreq_pri dbreq_nice db_env_create db_create));
95 133
96 require XSLoader; 134 require XSLoader;
97 XSLoader::load ("BDB", $VERSION); 135 XSLoader::load ("BDB", $VERSION);
98} 136}
99 137
138=head2 WIN32 FILENAMES/DATABASE NAME MESS
139
140Perl on Win32 supports only ASCII filenames (the reason is that it abuses
141an internal flag to store whether a filename is Unicode or ANSI, but that
142flag is used for something else in the perl core, so there is no way to
143detect whether a filename is ANSI or Unicode-encoded). The BDB module
144tries to work around this issue by assuming that the filename is an ANSI
145filename and BDB was built for unicode support.
146
100=head2 BERKELEYDB FUNCTIONS 147=head2 BERKELEYDB FUNCTIONS
101 148
102All of these are functions. The create functions simply return a new 149All of these are functions. The create functions simply return a new
103object and never block. All the remaining functions all take an optional 150object and never block. All the remaining functions take an optional
104callback as last argument. If it is missing, then the fucntion will be 151callback as last argument. If it is missing, then the function will be
105executed synchronously. 152executed synchronously. In both cases, C<$!> will reflect the return value
153of the function.
106 154
107BDB functions that cannot block (mostly functions that manipulate 155BDB functions that cannot block (mostly functions that manipulate
108settings) are method calls on the relevant objects, so the rule of thumb 156settings) are method calls on the relevant objects, so the rule of thumb
109is: if its a method, its not blocking, if its a function, it takes a 157is: if it's a method, it's not blocking, if it's a function, it takes a
110callback as last argument. 158callback as last argument.
111 159
112In the following, C<$int> signifies an integer return value, 160In the following, C<$int> signifies an integer return value,
113C<octetstring> is a "binary string" (i.e. a perl string with no character 161C<octetstring> is a "binary string" (i.e. a perl string with no character
114indices >255), C<U32> is an unsigned 32 bit integer, C<int> is some 162indices >255), C<U32> is an unsigned 32 bit integer, C<int> is some
145 193
146 db_open (DB *db, DB_TXN_ornull *txnid, octetstring file, octetstring database, int type, U32 flags, int mode, SV *callback = &PL_sv_undef) 194 db_open (DB *db, DB_TXN_ornull *txnid, octetstring file, octetstring database, int type, U32 flags, int mode, SV *callback = &PL_sv_undef)
147 flags: AUTO_COMMIT CREATE EXCL MULTIVERSION NOMMAP RDONLY READ_UNCOMMITTED THREAD TRUNCATE 195 flags: AUTO_COMMIT CREATE EXCL MULTIVERSION NOMMAP RDONLY READ_UNCOMMITTED THREAD TRUNCATE
148 db_close (DB *db, U32 flags = 0, SV *callback = &PL_sv_undef) 196 db_close (DB *db, U32 flags = 0, SV *callback = &PL_sv_undef)
149 flags: DB_NOSYNC 197 flags: DB_NOSYNC
198 db_upgrade (DB *db, octetstring file, U32 flags = 0, SV *callback = &PL_sv_undef)
150 db_compact (DB *db, DB_TXN_ornull *txn = 0, SV *start = 0, SV *stop = 0, SV *unused1 = 0, U32 flags = DB_FREE_SPACE, SV *unused2 = 0, SV *callback = &PL_sv_undef) 199 db_compact (DB *db, DB_TXN_ornull *txn = 0, SV *start = 0, SV *stop = 0, SV *unused1 = 0, U32 flags = DB_FREE_SPACE, SV *unused2 = 0, SV *callback = &PL_sv_undef)
151 flags: FREELIST_ONLY FREE_SPACE 200 flags: FREELIST_ONLY FREE_SPACE
152 db_sync (DB *db, U32 flags = 0, SV *callback = &PL_sv_undef) 201 db_sync (DB *db, U32 flags = 0, SV *callback = &PL_sv_undef)
153 db_key_range (DB *db, DB_TXN_ornull *txn, SV *key, SV *key_range, U32 flags = 0, SV *callback = &PL_sv_undef) 202 db_key_range (DB *db, DB_TXN_ornull *txn, SV *key, SV *key_range, U32 flags = 0, SV *callback = &PL_sv_undef)
154 db_put (DB *db, DB_TXN_ornull *txn, SV *key, SV *data, U32 flags = 0, SV *callback = &PL_sv_undef) 203 db_put (DB *db, DB_TXN_ornull *txn, SV *key, SV *data, U32 flags = 0, SV *callback = &PL_sv_undef)
177 db_sequence_get (DB_SEQUENCE *seq, DB_TXN_ornull *txnid, int delta, SV *seq_value, U32 flags = DB_TXN_NOSYNC, SV *callback = &PL_sv_undef) 226 db_sequence_get (DB_SEQUENCE *seq, DB_TXN_ornull *txnid, int delta, SV *seq_value, U32 flags = DB_TXN_NOSYNC, SV *callback = &PL_sv_undef)
178 flags: TXN_NOSYNC 227 flags: TXN_NOSYNC
179 db_sequence_remove (DB_SEQUENCE *seq, DB_TXN_ornull *txnid = 0, U32 flags = 0, SV *callback = &PL_sv_undef) 228 db_sequence_remove (DB_SEQUENCE *seq, DB_TXN_ornull *txnid = 0, U32 flags = 0, SV *callback = &PL_sv_undef)
180 flags: TXN_NOSYNC 229 flags: TXN_NOSYNC
181 230
231=head4 db_txn_finish (DB_TXN *txn, U32 flags = 0, SV *callback = &PL_sv_undef)
232
233This is not actually a Berkeley DB function but a BDB module
234extension. The background for this extension is: It is very annoying to
235have to check every single BDB function for error returns and provide a
236codepath out of your transaction. While the BDB module still makes this
237possible, it contains the following extensions:
238
239When a transaction-protected function returns any operating system
240error (errno > 0), BDB will set the C<TXN_DEADLOCK> flag on the
241transaction. This flag is also set by Berkeley DB functions themselves
242when an operation fails with LOCK_DEADLOCK, and it causes all further
243operations on that transaction (including C<db_txn_commit>) to fail.
244
245The C<db_txn_finish> request will look at this flag, and, if it is set,
246will automatically call C<db_txn_abort> (setting errno to C<LOCK_DEADLOCK>
247if it isn't set to something else yet). If it isn't set, it will call
248C<db_txn_commit> and return the error normally.
249
250How to use this? Easy: just write your transaction normally:
251
252 my $txn = $db_env->txn_begin;
253 db_get $db, $txn, "key", my $data;
254 db_put $db, $txn, "key", $data + 1 unless $! == BDB::NOTFOUND;
255 db_txn_finish $txn;
256 die "transaction failed" if $!;
257
258That is, handle only the expected errors. If something unexpected happens
259(EIO, LOCK_NOTGRANTED or a deadlock in either db_get or db_put), then the remaining
260requests (db_put in this case) will simply be skipped (they will fail with
261LOCK_DEADLOCK) and the transaction will be aborted.
262
263You can use the C<< $txn->failed >> method to check whether a transaction
264has failed in this way and abort further processing (excluding
265C<db_txn_finish>).
266
182=head3 DB_ENV/database environment methods 267=head3 DB_ENV/database environment methods
183 268
184Methods available on DB_ENV/$env handles: 269Methods available on DB_ENV/$env handles:
185 270
186 DESTROY (DB_ENV_ornull *env) 271 DESTROY (DB_ENV_ornull *env)
196 $int = $env->set_flags (U32 flags, int onoff) 281 $int = $env->set_flags (U32 flags, int onoff)
197 $env->set_errfile (FILE *errfile = 0) 282 $env->set_errfile (FILE *errfile = 0)
198 $env->set_msgfile (FILE *msgfile = 0) 283 $env->set_msgfile (FILE *msgfile = 0)
199 $int = $env->set_verbose (U32 which, int onoff = 1) 284 $int = $env->set_verbose (U32 which, int onoff = 1)
200 $int = $env->set_encrypt (const char *password, U32 flags = 0) 285 $int = $env->set_encrypt (const char *password, U32 flags = 0)
201 $int = $env->set_timeout (NV timeout, U32 flags) 286 $int = $env->set_timeout (NV timeout_seconds, U32 flags = SET_TXN_TIMEOUT)
202 $int = $env->set_mp_max_openfd (int maxopenfd); 287 $int = $env->set_mp_max_openfd (int maxopenfd);
203 $int = $env->set_mp_max_write (int maxwrite, int maxwrite_sleep); 288 $int = $env->set_mp_max_write (int maxwrite, int maxwrite_sleep);
204 $int = $env->set_mp_mmapsize (int mmapsize_mb) 289 $int = $env->set_mp_mmapsize (int mmapsize_mb)
205 $int = $env->set_lk_detect (U32 detect = DB_LOCK_DEFAULT) 290 $int = $env->set_lk_detect (U32 detect = DB_LOCK_DEFAULT)
206 $int = $env->set_lk_max_lockers (U32 max) 291 $int = $env->set_lk_max_lockers (U32 max)
207 $int = $env->set_lk_max_locks (U32 max) 292 $int = $env->set_lk_max_locks (U32 max)
208 $int = $env->set_lk_max_objects (U32 max) 293 $int = $env->set_lk_max_objects (U32 max)
209 $int = $env->set_lg_bsize (U32 max) 294 $int = $env->set_lg_bsize (U32 max)
210 $int = $env->set_lg_max (U32 max) 295 $int = $env->set_lg_max (U32 max)
296 $int = $env->mutex_set_increment (U32 increment)
297 $int = $env->mutex_set_tas_spins (U32 tas_spins)
298 $int = $env->mutex_set_max (U32 max)
299 $int = $env->mutex_set_align (U32 align)
211 300
212 $txn = $env->txn_begin (DB_TXN_ornull *parent = 0, U32 flags = 0) 301 $txn = $env->txn_begin (DB_TXN_ornull *parent = 0, U32 flags = 0)
213 flags: READ_COMMITTED READ_UNCOMMITTED TXN_NOSYNC TXN_NOWAIT TXN_SNAPSHOT TXN_SYNC TXN_WAIT TXN_WRITE_NOSYNC 302 flags: READ_COMMITTED READ_UNCOMMITTED TXN_NOSYNC TXN_NOWAIT TXN_SNAPSHOT TXN_SYNC TXN_WAIT TXN_WRITE_NOSYNC
214 303
215=head4 Example: 304=head4 Example:
294 DESTROY (DB_TXN_ornull *txn) 383 DESTROY (DB_TXN_ornull *txn)
295 CODE: 384 CODE:
296 if (txn) 385 if (txn)
297 txn->abort (txn); 386 txn->abort (txn);
298 387
299 $int = $txn->set_timeout (NV timeout, U32 flags) 388 $int = $txn->set_timeout (NV timeout_seconds, U32 flags = SET_TXN_TIMEOUT)
300 flags: SET_LOCK_TIMEOUT SET_TXN_TIMEOUT 389 flags: SET_LOCK_TIMEOUT SET_TXN_TIMEOUT
390
391 $bool = $txn->failed
392 # see db_txn_finish documentation, above
301 393
302 394
303=head3 DBC/cursor methods 395=head3 DBC/cursor methods
304 396
305Methods available on DBC/$dbc handles: 397Methods available on DBC/$dbc handles:
306 398
307 DESTROY (DBC_ornull *dbc) 399 DESTROY (DBC_ornull *dbc)
308 CODE: 400 CODE:
309 if (dbc) 401 if (dbc)
310 dbc->c_close (dbc); 402 dbc->c_close (dbc);
403
404 $int = $cursor->set_priority ($priority = PRIORITY_*)
311 405
312=head4 Example: 406=head4 Example:
313 407
314 my $c = $db->cursor; 408 my $c = $db->cursor;
315 409
349 443
350=head3 EVENT PROCESSING AND EVENT LOOP INTEGRATION 444=head3 EVENT PROCESSING AND EVENT LOOP INTEGRATION
351 445
352=over 4 446=over 4
353 447
448=item $msg = BDB::strerror [$errno]
449
450Returns the string corresponding to the given errno value. If no argument
451is given, use C<$!>.
452
453Note that the BDB module also patches the C<$!> variable directly, so you
454should be able to get a bdb error string by simply stringifying C<$!>.
455
354=item $fileno = BDB::poll_fileno 456=item $fileno = BDB::poll_fileno
355 457
356Return the I<request result pipe file descriptor>. This filehandle must be 458Return the I<request result pipe file descriptor>. This filehandle must be
357polled for reading by some mechanism outside this module (e.g. Event or 459polled for reading by some mechanism outside this module (e.g. Event or
358select, see below or the SYNOPSIS). If the pipe becomes readable you have 460select, see below or the SYNOPSIS). If the pipe becomes readable you have
396interactiveness when perl is not fast enough to process all requests in 498interactiveness when perl is not fast enough to process all requests in
397time. 499time.
398 500
399For interactive programs, values such as C<0.01> to C<0.1> should be fine. 501For interactive programs, values such as C<0.01> to C<0.1> should be fine.
400 502
401Example: Install an Event watcher that automatically calls 503Example: Install an EV watcher that automatically calls
402BDB::poll_cb with low priority, to ensure that other parts of the 504BDB::poll_cb with low priority, to ensure that other parts of the
403program get the CPU sometimes even under high AIO load. 505program get the CPU sometimes even under high load.
404 506
405 # try not to spend much more than 0.1s in poll_cb 507 # try not to spend much more than 0.1s in poll_cb
406 BDB::max_poll_time 0.1; 508 BDB::max_poll_time 0.1;
407 509
408 # use a low priority so other tasks have priority 510 my $bdb_poll = EV::io BDB::poll_fileno, EV::READ, \&BDB::poll_cb;
409 Event->io (fd => BDB::poll_fileno,
410 poll => 'r', nice => 1,
411 cb => &BDB::poll_cb);
412 511
413=item BDB::poll_wait 512=item BDB::poll_wait
414 513
415If there are any outstanding requests and none of them in the result 514If there are any outstanding requests and none of them in the result
416phase, wait till the result filehandle becomes ready for reading (simply 515phase, wait till the result filehandle becomes ready for reading (simply
428 527
429 BDB::poll_wait, BDB::poll_cb 528 BDB::poll_wait, BDB::poll_cb
430 529
431=item BDB::flush 530=item BDB::flush
432 531
433Wait till all outstanding AIO requests have been handled. 532Wait till all outstanding BDB requests have been handled.
434 533
435Strictly equivalent to: 534Strictly equivalent to:
436 535
437 BDB::poll_wait, BDB::poll_cb 536 BDB::poll_wait, BDB::poll_cb
438 while BDB::nreqs; 537 while BDB::nreqs;
443 542
444=over 4 543=over 4
445 544
446=item BDB::min_parallel $nthreads 545=item BDB::min_parallel $nthreads
447 546
448Set the minimum number of AIO threads to C<$nthreads>. The current 547Set the minimum number of BDB threads to C<$nthreads>. The current
449default is C<8>, which means eight asynchronous operations can execute 548default is C<8>, which means eight asynchronous operations can execute
450concurrently at any one time (the number of outstanding requests, 549concurrently at any one time (the number of outstanding requests,
451however, is unlimited). 550however, is unlimited).
452 551
453BDB starts threads only on demand, when an AIO request is queued and 553BDB starts threads only on demand, when a BDB request is queued and
454no free thread exists. Please note that queueing up a hundred requests can 553no free thread exists. Please note that queueing up a hundred requests can
455create demand for a hundred threads, even if it turns out that everything 554create demand for a hundred threads, even if it turns out that everything
456is in the cache and could have been processed faster by a single thread. 555is in the cache and could have been processed faster by a single thread.
457 556
458It is recommended to keep the number of threads relatively low, as some 557It is recommended to keep the number of threads relatively low, as some
463Under most circumstances you don't need to call this function, as the 562Under most circumstances you don't need to call this function, as the
464module selects a default that is suitable for low to moderate load. 563module selects a default that is suitable for low to moderate load.
465 564
466=item BDB::max_parallel $nthreads 565=item BDB::max_parallel $nthreads
467 566
468Sets the maximum number of AIO threads to C<$nthreads>. If more than the 567Sets the maximum number of BDB threads to C<$nthreads>. If more than the
469specified number of threads are currently running, this function kills 568specified number of threads are currently running, this function kills
470them. This function blocks until the limit is reached. 569them. This function blocks until the limit is reached.
471 570
472While C<$nthreads> are zero, aio requests get queued but not executed 571While C<$nthreads> are zero, aio requests get queued but not executed
473until the number of threads has been increased again. 572until the number of threads has been increased again.
512 611
513=item BDB::set_sync_prepare $cb 612=item BDB::set_sync_prepare $cb
514 613
515Sets a callback that is called whenever a request is created without an 614Sets a callback that is called whenever a request is created without an
516explicit callback. It has to return two code references. The first is used 615explicit callback. It has to return two code references. The first is used
517as the request callback, and the second is called to wait until the first 616as the request callback (it should save the return status), and the second
617is called to wait until the first callback has been called (it must set
618C<$!> to the return status).
619
620This mechanism can be used to include BDB into other event mechanisms,
621such as L<AnyEvent::BDB> or L<Coro::BDB>.
622
518callback has been called. The default implementation works like this: 623The default implementation works like this:
519 624
520 sub { 625 sub {
521 my $status; 626 my $status;
522 ( 627 (
523 sub { $status = $! }, 628 sub { $status = $! },
524 sub { BDB::poll while !defined $status; $! = $status }, 629 sub { BDB::poll while !defined $status; $! = $status },
525 ) 630 )
526 } 631 }
632
633It simply blocks the process till the request has finished and then sets
634C<$!> to the return value. This means that if you don't use a callback,
635BDB will simply fall back to synchronous operations.
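
The two-callback contract described above can be tried out without Berkeley DB. The following is a minimal sketch, not the module's own code: C<@pending>, C<fake_poll> and C<sync_prepare> are hypothetical names standing in for the request queue, for the module's poll call and for the code reference one would pass to C<BDB::set_sync_prepare>. The status value 2 is an arbitrary example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the BDB result queue; BDB itself is not loaded here.
my @pending;

# Hypothetical replacement for the module's poll call: completes one request.
sub fake_poll {
    my $job = shift @pending or die "no pending requests\n";
    $job->();
}

# Same shape as the callback expected by BDB::set_sync_prepare:
# it returns (request callback, wait callback).
sub sync_prepare {
    my $status;
    (
        sub { $status = $! + 0 },                 # save the numeric return status
        sub {
            fake_poll () while !defined $status;  # block until the request finished
            $! = $status;                         # hand the status back in $!
        },
    )
}

my ($done, $wait) = sync_prepare ();

# Simulate a request that finishes with errno 2 (an arbitrary example value).
push @pending, sub { $! = 2; $done->() };

$wait->();
my $errno = $! + 0;
print "request finished with status $errno\n";
```

Keeping the status in a variable shared by both closures is what lets the wait callback restore C<$!> after other code has run in between. An event-loop-specific replacement would presumably keep the same pair of closures and only swap the blocking loop for its own wait primitive.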
527 636
528=back 637=back
529 638
530=head3 STATISTICAL INFORMATION 639=head3 STATISTICAL INFORMATION
531 640
576 685
577=head2 FORK BEHAVIOUR 686=head2 FORK BEHAVIOUR
578 687
579This module should do "the right thing" when the process using it forks: 688This module should do "the right thing" when the process using it forks:
580 689
581Before the fork, IO::AIO enters a quiescent state where no requests 690Before the fork, BDB enters a quiescent state where no requests
582can be added in other threads and no results will be processed. After 691can be added in other threads and no results will be processed. After
583the fork the parent simply leaves the quiescent state and continues 692the fork the parent simply leaves the quiescent state and continues
584request/result processing, while the child frees the request/result queue 693request/result processing, while the child frees the request/result queue
585(so that the requests started before the fork will only be handled in the 694(so that the requests started before the fork will only be handled in the
586parent). Threads will be started on demand until the limit set in the 695parent). Threads will be started on demand until the limit set in the
587parent process has been reached again. 696parent process has been reached again.
588 697
589In short: the parent will, after a short pause, continue as if fork had 698In short: the parent will, after a short pause, continue as if fork had
590not been called, while the child will act as if IO::AIO has not been used 699not been called, while the child will act as if BDB has not been used
591yet. 700yet.
701
702Win32 note: there is no fork on win32, and perl's emulation of it is too
703broken to be supported, so do not use BDB in a windows pseudo-fork, better
704yet, switch to a more capable platform.
592 705
593=head2 MEMORY USAGE 706=head2 MEMORY USAGE
594 707
595Per-request usage: 708Per-request usage:
596 709
609temporary buffers, and each thread requires a stack and other data 722temporary buffers, and each thread requires a stack and other data
610structures (usually around 16k-128k, depending on the OS). 723structures (usually around 16k-128k, depending on the OS).
611 724
612=head1 KNOWN BUGS 725=head1 KNOWN BUGS
613 726
614Known bugs will be fixed in the next release. 727Known bugs will be fixed in the next release, except:
728
729 If you use a transaction in any request, and the request returns
730 with an operating system error or DB_LOCK_NOTGRANTED, the internal
731 TXN_DEADLOCK flag will be set on the transaction. See C<db_txn_finish>,
732 above.
615 733
616=head1 SEE ALSO 734=head1 SEE ALSO
617 735
618L<Coro::AIO>. 736L<AnyEvent::BDB> (event loop integration), L<Coro::BDB> (more natural
737syntax), L<IO::AIO> (nice to have).
619 738
620=head1 AUTHOR 739=head1 AUTHOR
621 740
622 Marc Lehmann <schmorp@schmorp.de> 741 Marc Lehmann <schmorp@schmorp.de>
623 http://home.schmorp.de/ 742 http://home.schmorp.de/
