/cvs/BDB/BDB.pm
Revision: 1.3
Committed: Mon Feb 5 22:19:07 2007 UTC by root
Branch: MAIN
Changes since 1.2: +37 -1 lines

=head1 NAME

BDB - Asynchronous Berkeley DB access

=head1 SYNOPSIS

   use BDB;

=head1 DESCRIPTION

=head2 EXAMPLE

=head1 REQUEST ANATOMY AND LIFETIME

Every request method creates a request, which is a C data structure not
directly visible to Perl.

During their existence, BDB requests travel through the following states,
in order:

=over 4

=item ready

Immediately after a request is created it is put into the ready state,
waiting for a thread to execute it.

=item execute

A thread has accepted the request for processing and is currently
executing it (e.g. blocking in read).

=item pending

The request has been executed and is waiting for result processing.

While request submission and execution are fully asynchronous, result
processing is not and relies on the perl interpreter calling C<poll_cb>
(or another function with the same effect).

=item result

The request results are processed synchronously by C<poll_cb>.

The C<poll_cb> function will process all outstanding BDB requests by
calling their callbacks, freeing memory associated with them and managing
any groups they are contained in.

=item done

The request has reached the end of its lifetime and holds no resources
anymore (except possibly for the Perl object, but its connection to the
actual request is severed and calling its methods will either do nothing
or result in a runtime error).

=back
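
For illustration only, here is a hedged sketch of driving a single request
through these states by hand. The request function and its arguments
(C<db_sync $db, 0, sub { }>) are assumptions made for the example; only the
poll functions are documented in this module:

   db_sync $db, 0, sub { print "request done\n" };   # ready -> execute -> pending
   BDB::poll_wait;   # block until the result pipe signals pending results
   BDB::poll_cb;     # result phase: the callback above runs, request is done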

=cut

package BDB;

no warnings;
use strict 'vars';

use base 'Exporter';

BEGIN {
   our $VERSION = '0.1';

   our @BDB_REQ = qw(
      db_env_create db_env_open db_env_close
      db_create db_open db_close db_compact db_sync db_put
   );
   our @EXPORT = (@BDB_REQ, qw(dbreq_pri dbreq_nice));
   our @EXPORT_OK = qw(poll_fileno poll_cb poll_wait flush
                       min_parallel max_parallel max_idle
                       nreqs nready npending nthreads
                       max_poll_time max_poll_reqs);

   require XSLoader;
   XSLoader::load ("BDB", $VERSION);
}

=head2 SUPPORT FUNCTIONS

=head3 EVENT PROCESSING AND EVENT LOOP INTEGRATION

=over 4

=item $fileno = BDB::poll_fileno

Return the I<request result pipe file descriptor>. This filehandle must be
polled for reading by some mechanism outside this module (e.g. Event or
select, see below or the SYNOPSIS). If the pipe becomes readable you have
to call C<poll_cb> to check the results.

See C<poll_cb> for an example.
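
A minimal sketch of polling this file descriptor with plain C<select> (no
event library), using only functions documented in this module; this is
essentially what C<poll_wait> does for you:

   my $rin = '';
   vec ($rin, BDB::poll_fileno, 1) = 1;

   while (BDB::nreqs) {
      select (my $rout = $rin, undef, undef, undef);   # wait for readability
      BDB::poll_cb;                                     # process the results
   }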

=item BDB::poll_cb

Process some outstanding events on the result pipe. You have to call this
regularly. Returns the number of events processed. Returns immediately
when no events are outstanding. The number of events processed depends on
the settings of C<BDB::max_poll_reqs> and C<BDB::max_poll_time>.

If not all requests were processed for whatever reason, the filehandle
will still be ready when C<poll_cb> returns.

Example: Install an Event watcher that automatically calls
BDB::poll_cb with high priority:

   Event->io (fd => BDB::poll_fileno,
              poll => 'r', async => 1,
              cb => \&BDB::poll_cb);

=item BDB::max_poll_reqs $nreqs

=item BDB::max_poll_time $seconds

These set the maximum number of requests (default C<0>, meaning infinity)
that are being processed by C<BDB::poll_cb> in one call, respectively
the maximum amount of time (default C<0>, meaning infinity) spent in
C<BDB::poll_cb> to process requests (more correctly the minimum amount
of time C<poll_cb> is allowed to use).

Setting C<max_poll_time> to a non-zero value creates an overhead of one
syscall per request processed, which is not normally a problem unless your
callbacks are really really fast or your OS is really really slow (I am
not mentioning Solaris here). Using C<max_poll_reqs> incurs no overhead.

Setting these is useful if you want to ensure some level of
interactiveness when perl is not fast enough to process all requests in
time.

For interactive programs, values such as C<0.01> to C<0.1> should be fine.

Example: Install an Event watcher that automatically calls
BDB::poll_cb with low priority, to ensure that other parts of the
program get the CPU sometimes even under high load.

   # try not to spend much more than 0.1s in poll_cb
   BDB::max_poll_time 0.1;

   # use a low priority so other tasks have priority
   Event->io (fd => BDB::poll_fileno,
              poll => 'r', nice => 1,
              cb => \&BDB::poll_cb);

=item BDB::poll_wait

If there are any outstanding requests and none of them is in the result
phase, wait until the result filehandle becomes ready for reading (this
simply does a C<select> on the filehandle). This is useful if you want to
synchronously wait for some requests to finish.

See C<nreqs> for an example.

=item BDB::poll

Waits until some requests have been handled.

Returns the number of requests processed, but is otherwise strictly
equivalent to:

   BDB::poll_wait, BDB::poll_cb

=item BDB::flush

Wait till all outstanding requests have been handled.

Strictly equivalent to:

   BDB::poll_wait, BDB::poll_cb
      while BDB::nreqs;

=back

=head3 CONTROLLING THE NUMBER OF THREADS

=over 4

=item BDB::min_parallel $nthreads

Set the minimum number of AIO threads to C<$nthreads>. The current
default is C<8>, which means eight asynchronous operations can execute
concurrently at any one time (the number of outstanding requests,
however, is unlimited).

BDB starts threads only on demand, when an AIO request is queued and
no free thread exists. Please note that queueing up a hundred requests can
create demand for a hundred threads, even if it turns out that everything
is in the cache and could have been processed faster by a single thread.

It is recommended to keep the number of threads relatively low, as some
Linux kernel versions will scale negatively with the number of threads
(higher parallelism => MUCH higher latency). With current Linux 2.6
versions, 4-32 threads should be fine.

Under most circumstances you don't need to call this function, as the
module selects a default that is suitable for low to moderate load.
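
For example, a program that expects sustained heavy load could raise this
floor explicitly (the value is purely illustrative):

   BDB::min_parallel 16;   # raise the minimum thread count from the default 8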

=item BDB::max_parallel $nthreads

Sets the maximum number of AIO threads to C<$nthreads>. If more than the
specified number of threads are currently running, this function kills
them. This function blocks until the limit is reached.

While C<$nthreads> is zero, requests get queued but not executed
until the number of threads has been increased again.

This module automatically runs C<max_parallel 0> at program end, to ensure
that all threads are killed and that there are no outstanding requests.

Under normal circumstances you don't need to call this function.

=item BDB::max_idle $nthreads

Limit the number of threads (default: 4) that are allowed to idle (i.e.,
threads that did not get a request to process within 10 seconds). That
means if a thread becomes idle while C<$nthreads> other threads are also
idle, it will free its resources and exit.

This is useful when you allow a large number of threads (e.g. 100 or 1000)
to allow for extremely high load situations, but want to free resources
under normal circumstances (1000 threads can easily consume 30MB of RAM).

The default is probably ok in most situations, especially if thread
creation is fast. If thread creation is very slow on your system you might
want to use larger values.
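
Taken together, a long-running program might configure the thread pool
roughly like this (the numbers are illustrative assumptions only, not
recommendations from this module):

   BDB::min_parallel 8;     # always keep a few workers available on demand
   BDB::max_parallel 64;    # never run more than 64 threads at once
   BDB::max_idle 4;         # allow at most 4 threads to sit around idle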

=item $oldmaxreqs = BDB::max_outstanding $maxreqs

This is a very bad function to use in interactive programs because it
blocks, and a bad way to reduce concurrency because it is inexact: Better
use an C<aio_group> together with a feed callback.

Sets the maximum number of outstanding requests to C<$maxreqs>. If you
try to queue up more than this number of requests, the next call to the
C<poll_cb> (and C<poll_some> and other functions calling C<poll_cb>)
function will block until the limit is no longer exceeded.

The default value is very large, so there is no practical limit on the
number of outstanding requests.

You can still queue as many requests as you want. Therefore,
C<max_outstanding> is mainly useful in simple scripts (with low values) or
as a stop gap to shield against fatal memory overflow (with large values).
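
Example (the limit is an arbitrary illustration): cap queue growth in a
simple batch script and remember the previous setting:

   my $old_max = BDB::max_outstanding 1024;   # poll_cb blocks once more than
                                              # 1024 requests are outstanding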

=item BDB::set_sync_prepare $cb

Sets a callback that is called whenever a request is created without an
explicit callback. It has to return two code references. The first is used
as the request callback, and the second is called to wait until the first
callback has been called. The default implementation works like this:

   sub {
      my $status;
      (
         sub { $status = $! },
         sub { BDB::poll while !defined $status; $! = $status },
      )
   }

=back

=head3 STATISTICAL INFORMATION

=over 4

=item BDB::nreqs

Returns the number of requests currently in the ready, execute or pending
states (i.e. for which their callback has not been invoked yet).

Example: wait till there are no outstanding requests anymore:

   BDB::poll_wait, BDB::poll_cb
      while BDB::nreqs;

=item BDB::nready

Returns the number of requests currently in the ready state (not yet
executed).

=item BDB::npending

Returns the number of requests currently in the pending state (executed,
but not yet processed by poll_cb).
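
As a small illustration, the three counters combine into a one-line status
report, e.g. for debugging:

   printf "BDB requests: %d total, %d ready, %d pending\n",
          BDB::nreqs, BDB::nready, BDB::npending;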

=back

=cut

set_sync_prepare {
   my $status;
   (
      sub {
         $status = $!;
      },
      sub {
         BDB::poll while !defined $status;
         $! = $status;
      },
   )
};

min_parallel 8;

END { flush }

1;

=head2 FORK BEHAVIOUR

This module should do "the right thing" when the process using it forks:

Before the fork, BDB enters a quiescent state where no requests
can be added in other threads and no results will be processed. After
the fork the parent simply leaves the quiescent state and continues
request/result processing, while the child frees the request/result queue
(so that the requests started before the fork will only be handled in the
parent). Threads will be started on demand until the limit set in the
parent process has been reached again.

In short: the parent will, after a short pause, continue as if fork had
not been called, while the child will act as if BDB has not been used
yet.

=head2 MEMORY USAGE

Per-request usage:

Each request uses - depending on your architecture - around 100-200
bytes of memory. In addition, stat requests need a stat buffer (possibly
a few hundred bytes), readdir requires a result buffer and so on. Perl
scalars and other data passed into requests will also be locked and
will consume memory till the request has entered the done state.

This is not awfully much, so queuing lots of requests is not usually a
problem.

Per-thread usage:

In the execution phase, some requests require more memory for
temporary buffers, and each thread requires a stack and other data
structures (usually around 16k-128k, depending on the OS).

=head1 KNOWN BUGS

Known bugs will be fixed in the next release.

=head1 SEE ALSO

L<Coro::AIO>.

=head1 AUTHOR

 Marc Lehmann <schmorp@schmorp.de>
 http://home.schmorp.de/

=cut