/cvs/libeio/eio.pod

Comparing libeio/eio.pod (file contents):
Revision 1.10 by root, Sun Jun 5 23:22:04 2011 UTC vs.
Revision 1.21 by root, Thu Jul 7 22:36:18 2011 UTC

45Unlike the name component C<stamp> might indicate, it is also used for 45Unlike the name component C<stamp> might indicate, it is also used for
46time differences throughout libeio. 46time differences throughout libeio.
47 47
48=head2 FORK SUPPORT 48=head2 FORK SUPPORT
49 49
50Calling C<fork ()> is fully supported by this module. It is implemented in these steps: 50Calling C<fork ()> is fully supported by this module - but you must not
51rely on this. It is currently implemented in these steps:
51 52
52 1. wait till all requests in "execute" state have been handled 53 1. wait till all requests in "execute" state have been handled
53 (basically requests that are already handed over to the kernel). 54 (basically requests that are already handed over to the kernel).
54 2. fork 55 2. fork
55 3. in the parent, continue business as usual, done 56 3. in the parent, continue business as usual, done
56 4. in the child, destroy all ready and pending requests and free the 57 4. in the child, destroy all ready and pending requests and free the
57 memory used by the worker threads. This gives you a fully empty 58 memory used by the worker threads. This gives you a fully empty
58 libeio queue. 59 libeio queue.
59 60
60Note, however, since libeio does use threads, thr above guarantee doesn't 61Note, however, since libeio does use threads, the above guarantee doesn't
61cover your libc, for example, malloc and other libc functions are not 62cover your libc, for example, malloc and other libc functions are not
62fork-safe, so there is very little you can do after a fork, and in fatc, 63fork-safe, so there is very little you can do after a fork, and in fact,
63the above might crash, and thus change. 64the above might crash, and thus change.
64 65
65=head1 INITIALISATION/INTEGRATION 66=head1 INITIALISATION/INTEGRATION
66 67
67Before you can call any eio functions you first have to initialise the 68Before you can call any eio functions you first have to initialise the
124=back 125=back
125 126
126For libev, you would typically use an C<ev_async> watcher: the 127For libev, you would typically use an C<ev_async> watcher: the
127C<want_poll> callback would invoke C<ev_async_send> to wake up the event 128C<want_poll> callback would invoke C<ev_async_send> to wake up the event
128loop. Inside the callback set for the watcher, one would call C<eio_poll 129loop. Inside the callback set for the watcher, one would call C<eio_poll
129()> (followed by C<ev_async_send> again if C<eio_poll> indicates that not 130()>.
130all requests have been handled yet). The race is taken care of because 131
131libev resets/rearms the async watcher before calling your callback, 132If C<eio_poll ()> is configured to not handle all results in one go
132and therefore, before calling C<eio_poll>. This might result in (some) 133(i.e. it returns C<-1>) then you should start an idle watcher that calls
133spurious wake-ups, but is generally harmless. 134C<eio_poll> until it returns something C<!= -1>.
135
136A full-featured connector between libeio and libev would look as follows
137(if C<eio_poll> is handling all requests, it can of course be simplified a
138lot by removing the idle watcher logic):
139
140 static struct ev_loop *loop;
141 static ev_idle repeat_watcher;
142 static ev_async ready_watcher;
143
144 /* idle watcher callback, only used when eio_poll */
145 /* didn't handle all results in one call */
146 static void
147 repeat (EV_P_ ev_idle *w, int revents)
148 {
149 if (eio_poll () != -1)
150 ev_idle_stop (EV_A_ w);
151 }
152
153 /* eio has some results, process them */
154 static void
155 ready (EV_P_ ev_async *w, int revents)
156 {
157 if (eio_poll () == -1)
158 ev_idle_start (EV_A_ &repeat_watcher);
159 }
160
161 /* wake up the event loop */
162 static void
163 want_poll (void)
164 {
165 ev_async_send (loop, &ready_watcher);
166 }
167
168 void
169 my_init_eio ()
170 {
171 loop = EV_DEFAULT;
172
173 ev_idle_init (&repeat_watcher, repeat);
174 ev_async_init (&ready_watcher, ready);
175 ev_async_start (loop, &ready_watcher);
176
177 eio_init (want_poll, 0);
178 }
134 179
135For most other event loops, you would typically use a pipe - the event 180For most other event loops, you would typically use a pipe - the event
136loop should be told to wait for read readiness on the read end. In 181loop should be told to wait for read readiness on the read end. In
137C<want_poll> you would write a single byte, in C<done_poll> you would try 182C<want_poll> you would write a single byte, in C<done_poll> you would try
138to read that byte, and in the callback for the read end, you would call 183to read that byte, and in the callback for the read end, you would call
139C<eio_poll>. The race is avoided here because the event loop should invoke 184C<eio_poll>.
140your callback again and again until the byte has been read (as the pipe 185
141read callback does not read it, only C<done_poll>). 186You don't have to take special care in the case C<eio_poll> doesn't handle
187all requests, as the done callback will not be invoked, so the event loop
188will still signal readiness for the pipe until I<all> results have been
189processed.
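
The pipe technique described above can be sketched in isolation, without libeio, as follows. This is only a minimal sketch under stated assumptions: the names C<wake_pipe>, C<set_nonblock>, C<wake_pipe_init> and C<wake_pipe_readable> are hypothetical, C<done_poll> is assumed to have the same C<void (void)> shape as C<want_poll>, and a non-blocking C<poll(2)> call stands in for whatever readiness check the event loop performs.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int wake_pipe [2] = { -1, -1 }; /* [0] is the read end, [1] the write end */

/* hypothetical helper: switch a descriptor to non-blocking mode */
static int
set_nonblock (int fd)
{
  int flags = fcntl (fd, F_GETFL);

  return (flags < 0 || fcntl (fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/* create the pipe; both ends are non-blocking so no callback can stall */
static int
wake_pipe_init (void)
{
  if (pipe (wake_pipe))
    return -1;

  if (set_nonblock (wake_pipe [0]) || set_nonblock (wake_pipe [1]))
    return -1;

  return 0;
}

/* would be passed as want_poll: write one byte to make the read end readable */
static void
want_poll (void)
{
  char dummy = 0;

  assert (wake_pipe [1] >= 0); /* wake_pipe_init must have succeeded */

  if (write (wake_pipe [1], &dummy, 1) < 0)
    {
      /* pipe full or interrupted: ignored in this sketch */
    }
}

/* would be passed as done_poll: try to read the byte back */
static void
done_poll (void)
{
  char dummy;

  if (read (wake_pipe [0], &dummy, 1) < 0)
    {
      /* nothing to read: ignored in this sketch */
    }
}

/* stand-in for the event loop: is the read end readable right now? */
static int
wake_pipe_readable (void)
{
  struct pollfd pfd;

  pfd.fd      = wake_pipe [0];
  pfd.events  = POLLIN;
  pfd.revents = 0;

  return poll (&pfd, 1, 0) == 1 && (pfd.revents & POLLIN);
}
```

In a real program, the read end would be registered with the event loop, and the read callback would call C<eio_poll> without reading from the pipe itself, so readiness persists until C<done_poll> runs.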
142 190
143 191
144=head1 HIGH LEVEL REQUEST API 192=head1 HIGH LEVEL REQUEST API
145 193
146Libeio has both a high-level API, which consists of calling a request 194Libeio has both a high-level API, which consists of calling a request
153 201
154You submit a request by calling the relevant C<eio_TYPE> function with the 202You submit a request by calling the relevant C<eio_TYPE> function with the
155required parameters, a callback of type C<int (*eio_cb)(eio_req *req)> 203required parameters, a callback of type C<int (*eio_cb)(eio_req *req)>
156(called C<eio_cb> below) and a freely usable C<void *data> argument. 204(called C<eio_cb> below) and a freely usable C<void *data> argument.
157 205
158The return value will either be 0 206The return value will either be 0, in case something went really wrong
207(which can basically only happen on very fatal errors, such as C<malloc>
208returning 0, which is rather unlikely), or a pointer to the newly-created
209and submitted C<eio_req *>.
159 210
160The callback will be called with an C<eio_req *> which contains the 211The callback will be called with an C<eio_req *> which contains the
161results of the request. The members you can access inside that structure 212results of the request. The members you can access inside that structure
162vary from request to request, except for: 213vary from request to request, except for:
163 214
210 } 261 }
211 262
212 /* the first three arguments are passed to open(2) */ 263 /* the first three arguments are passed to open(2) */
213 /* the remaining are priority, callback and data */ 264 /* the remaining are priority, callback and data */
214 if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0)) 265 if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
215 abort (); /* something ent wrong, we will all die!!! */ 266 abort (); /* something went wrong, we will all die!!! */
216 267
217Note that you additionally need to call C<eio_poll> when the C<want_cb> 268Note that you additionally need to call C<eio_poll> when the C<want_cb>
218indicates that requests are ready to be processed. 269indicates that requests are ready to be processed.
270
271=head2 CANCELLING REQUESTS
272
273Sometimes the need for a request goes away before the request is
274finished. In that case, one can cancel the request by a call to
275C<eio_cancel>:
276
277=over 4
278
279=item eio_cancel (eio_req *req)
280
281Cancel the request (and all its subrequests). If the request is currently
282executing it might still continue to execute, and in other cases it might
283still take a while till the request is cancelled.
284
285Even if cancelled, the finish callback will still be invoked - the
286callbacks of all cancellable requests need to check whether the request
287has been cancelled by calling C<EIO_CANCELLED (req)>:
288
289 static int
290 my_eio_cb (eio_req *req)
291 {
292 if (EIO_CANCELLED (req))
293 return 0;
294 }
295
296In addition, cancelled requests will I<either> have C<< req->result >>
297set to C<-1> and C<errno> to C<ECANCELED>, or I<otherwise> they were
298successfully executed, despite being cancelled (e.g. when they have
299already been executed at the time they were cancelled).
300
301C<EIO_CANCELLED> is still true for requests that have successfully
302executed, as long as C<eio_cancel> was called on them at some point.
303
304=back
219 305
220=head2 AVAILABLE REQUESTS 306=head2 AVAILABLE REQUESTS
221 307
222The following request functions are available. I<All> of them return the 308The following request functions are available. I<All> of them return the
223C<eio_req *> on success and C<0> on failure, and I<all> of them have the 309C<eio_req *> on success and C<0> on failure, and I<all> of them have the
226custom data value as C<data>. 312custom data value as C<data>.
227 313
228=head3 POSIX API WRAPPERS 314=head3 POSIX API WRAPPERS
229 315
230These requests simply wrap the POSIX call of the same name, with the same 316These requests simply wrap the POSIX call of the same name, with the same
231arguments. If a function is not implemented by the OS and cnanot be emulated 317arguments. If a function is not implemented by the OS and cannot be emulated
232in some way, then all of these return C<-1> and set C<errno> to C<ENOSYS>. 318in some way, then all of these return C<-1> and set C<errno> to C<ENOSYS>.
233 319
234=over 4 320=over 4
235 321
236=item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data) 322=item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data)
317 char *target = strndup ((char *)req->ptr2, req->result); 403 char *target = strndup ((char *)req->ptr2, req->result);
318 404
319 free (target); 405 free (target);
320 } 406 }
321 407
408=item eio_realpath (const char *path, int pri, eio_cb cb, void *data)
409
410Similar to the realpath libc function, but unlike that one, result is
411C<-1> on failure and the length of the returned path in C<ptr2> (which is
412not 0-terminated) - this is similar to readlink.
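
Because the returned path is not 0-terminated, a callback usually has to copy it before treating it as a C string. The following is a minimal sketch of such a copy, assuming the caller passes in what would be C<< req->ptr2 >> and C<< req->result >>; the helper name C<copy_result_path> is hypothetical and not part of libeio.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* hypothetical helper: copy the first len bytes of an unterminated result */
/* buffer into a freshly allocated, 0-terminated string (caller frees it) */
static char *
copy_result_path (const void *buf, ssize_t len)
{
  char *path;

  if (len < 0)          /* a result of -1 signals failure, nothing to copy */
    return 0;

  assert (buf != 0);    /* a non-negative length implies a valid buffer */

  path = malloc ((size_t)len + 1);
  if (!path)
    return 0;

  memcpy (path, buf, (size_t)len);
  path [len] = 0;       /* add the terminator the request does not provide */

  return path;
}
```

Inside a callback, this would be called roughly as C<copy_result_path (req-E<gt>ptr2, req-E<gt>result)>, which has the same effect as the C<strndup> call in the readlink example.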
413
322=item eio_stat (const char *path, int pri, eio_cb cb, void *data) 414=item eio_stat (const char *path, int pri, eio_cb cb, void *data)
323 415
324=item eio_lstat (const char *path, int pri, eio_cb cb, void *data) 416=item eio_lstat (const char *path, int pri, eio_cb cb, void *data)
325 417
326=item eio_fstat (int fd, int pri, eio_cb cb, void *data) 418=item eio_fstat (int fd, int pri, eio_cb cb, void *data)
327 419
328Stats a file - if C<< req->result >> indicates success, then you can 420Stats a file - if C<< req->result >> indicates success, then you can
329access the C<struct stat>-like structure via C<< req->ptr2 >>: 421access the C<struct stat>-like structure via C<< req->ptr2 >>:
330 422
331 EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2; 423 EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;
332 424
333=item eio_statvfs (const char *path, int pri, eio_cb cb, void *data) 425=item eio_statvfs (const char *path, int pri, eio_cb cb, void *data)
334 426
335=item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data) 427=item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data)
336 428
337Stats a filesystem - if C<< req->result >> indicates success, then you can 429Stats a filesystem - if C<< req->result >> indicates success, then you can
338access the C<struct statvfs>-like structure via C<< req->ptr2 >>: 430access the C<struct statvfs>-like structure via C<< req->ptr2 >>:
339 431
340 EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2; 432 EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;
341 433
342=back 434=back
343 435
344=head3 READING DIRECTORIES 436=head3 READING DIRECTORIES
345 437
346Reading directories sounds simple, but can be rather demanding, especially 438Reading directories sounds simple, but can be rather demanding, especially
347if you want to do stuff such as traversing a diretcory hierarchy or 439if you want to do stuff such as traversing a directory hierarchy or
348processing all files in a directory. Libeio can assist thess complex tasks 440processing all files in a directory. Libeio can assist these complex tasks
349with its C<eio_readdir> call. 441with its C<eio_readdir> call.
350 442
351=over 4 443=over 4
352 444
353=item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data) 445=item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data)
385 477
386If this flag is specified, then, in addition to the names in C<ptr2>, 478If this flag is specified, then, in addition to the names in C<ptr2>,
387also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct 479also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct
388eio_dirent> looks like this: 480eio_dirent> looks like this:
389 481
390 struct eio_dirent 482 struct eio_dirent
391 { 483 {
392 int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */ 484 int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */
393 unsigned short namelen; /* size of filename without trailing 0 */ 485 unsigned short namelen; /* size of filename without trailing 0 */
394 unsigned char type; /* one of EIO_DT_* */ 486 unsigned char type; /* one of EIO_DT_* */
395 signed char score; /* internal use */ 487 signed char score; /* internal use */
396 ino_t inode; /* the inode number, if available, otherwise unspecified */ 488 ino_t inode; /* the inode number, if available, otherwise unspecified */
397 }; 489 };
398 490
399The only members you normally would access are C<nameofs>, which is the 491The only members you normally would access are C<nameofs>, which is the
400byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>. 492byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>.
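
How C<nameofs> and C<namelen> relate to the names buffer can be sketched as follows. This is an illustration only: C<struct sketch_dirent> is a local copy of the structure shown above so that the example compiles without C<eio.h>, the helper names are hypothetical, and the number of entries is assumed to be supplied by the caller.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* local mirror of the struct shown above, so the sketch compiles without eio.h */
struct sketch_dirent
{
  int nameofs;            /* offset of the name in the names buffer */
  unsigned short namelen; /* length of the name without trailing 0 */
  unsigned char type;
  signed char score;
  ino_t inode;
};

/* return the name of entry idx; names is what req->ptr2 would point to */
static const char *
sketch_entry_name (const struct sketch_dirent *ents, const char *names, int idx)
{
  const struct sketch_dirent *ent = ents + idx;

  assert (ent->nameofs >= 0);
  return names + ent->nameofs;
}

/* find the entry called name, comparing lengths first; returns its index or -1 */
static int
sketch_find (const struct sketch_dirent *ents, int count,
             const char *names, const char *name)
{
  size_t len = strlen (name);
  int i;

  for (i = 0; i < count; ++i)
    if (ents [i].namelen == len
        && !memcmp (names + ents [i].nameofs, name, len))
      return i;

  return -1;
}
```

With the real structure, C<ents> would be C<< req->ptr1 >> and C<names> would be C<< req->ptr2 >>, each cast to the appropriate pointer type.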
401 493
402C<type> can be one of: 494C<type> can be one of:
445When this flag is specified, then the names will be returned in an order 537When this flag is specified, then the names will be returned in an order
446suitable for stat()'ing each one. That is, when you plan to stat() 538suitable for stat()'ing each one. That is, when you plan to stat()
447all files in the given directory, then the returned order will likely 539all files in the given directory, then the returned order will likely
448be fastest. 540be fastest.
449 541
450If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then 542If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then the
451the likely dirs come first, resulting in a less optimal stat order. 543likely directories come first, resulting in a less optimal stat order.
452 544
453=item EIO_READDIR_FOUND_UNKNOWN 545=item EIO_READDIR_FOUND_UNKNOWN
454 546
455This flag should not be specified when calling C<eio_readdir>. Instead, 547This flag should not be specified when calling C<eio_readdir>. Instead,
456it is being set by C<eio_readdir> (you can access the C<flags> via C<< 548it is being set by C<eio_readdir> (you can access the C<flags> via C<<
457req->int1 >>), when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The 549req->int1 >>), when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The
458absense of this flag therefore indicates that all C<type>'s are known, 550absence of this flag therefore indicates that all C<type>'s are known,
459which can be used to speed up some algorithms. 551which can be used to speed up some algorithms.
460 552
461A typical use case would be to identify all subdirectories within a 553A typical use case would be to identify all subdirectories within a
462directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If 554directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If
463then this flag is I<NOT> set, then all the entries at the beginning of the 555then this flag is I<NOT> set, then all the entries at the beginning of the
501as calling C<fdatasync>. 593as calling C<fdatasync>.
502 594
503Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>, 595Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>,
504C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>. 596C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>.
505 597
598=item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data)
599
600Calls C<fallocate> (note: I<NOT> C<posix_fallocate>!). If the syscall is
601missing, then it returns failure and sets C<errno> to C<ENOSYS>.
602
603The C<mode> argument can be C<0> (for behaviour similar to
604C<posix_fallocate>), or C<EIO_FALLOC_FL_KEEP_SIZE>, which keeps the size
605of the file unchanged (but still preallocates space beyond end of file).
606
506=back 607=back
507 608
508=head3 LIBEIO-SPECIFIC REQUESTS 609=head3 LIBEIO-SPECIFIC REQUESTS
509 610
510These requests are specific to libeio and do not correspond to any OS call. 611These requests are specific to libeio and do not correspond to any OS call.
551 652
552 eio_custom (my_open, 0, my_open_done, "/etc/passwd"); 653 eio_custom (my_open, 0, my_open_done, "/etc/passwd");
553 654
554=item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data) 655=item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)
555 656
556This is a a request that takes C<delay> seconds to execute, but otherwise 657This is a request that takes C<delay> seconds to execute, but otherwise
557does nothing - it simply puts one of the worker threads to sleep for this 658does nothing - it simply puts one of the worker threads to sleep for this
558long. 659long.
559 660
560This request can be used to artificially increase load, e.g. for debugging 661This request can be used to artificially increase load, e.g. for debugging
561or benchmarking reasons. 662or benchmarking reasons.
568 669
569=back 670=back
570 671
571=head3 GROUPING AND LIMITING REQUESTS 672=head3 GROUPING AND LIMITING REQUESTS
572 673
674There is one more rather special request, C<eio_grp>. It is a very special
675aio request: Instead of doing something, it is a container for other eio
676requests.
677
678There are two primary use cases for this: a) bundle many requests into a
679single, composite, request with a definite callback and the ability to
680cancel the whole request with its subrequests and b) limiting the number
681of "active" requests.
682
683Further below you will find more discussion of these topics - first
684follows the reference section detailing the request generator and other
685methods.
686
687=over 4
688
689=item eio_req *grp = eio_grp (eio_cb cb, void *data)
690
691Creates, submits and returns a group request.
692
693=item eio_grp_add (eio_req *grp, eio_req *req)
694
695Adds a request to the request group.
696
697=item eio_grp_cancel (eio_req *grp)
698
699Cancels all requests I<in> the group, but I<not> the group request
700itself. You can cancel the group request via a normal C<eio_cancel> call.
701
702
703
704=back
705
706
707
573#TODO 708#TODO
574 709
575/*****************************************************************************/ 710/*****************************************************************************/
576/* groups */ 711/* groups */
577 712
578eio_req *eio_grp (eio_cb cb, void *data); 713eio_req *eio_grp (eio_cb cb, void *data);
579void eio_grp_feed (eio_req *grp, void (*feed)(eio_req *req), int limit); 714void eio_grp_feed (eio_req *grp, void (*feed)(eio_req *req), int limit);
580void eio_grp_limit (eio_req *grp, int limit); 715void eio_grp_limit (eio_req *grp, int limit);
581void eio_grp_add (eio_req *grp, eio_req *req);
582void eio_grp_cancel (eio_req *grp); /* cancels all sub requests but not the group */ 716void eio_grp_cancel (eio_req *grp); /* cancels all sub requests but not the group */
583 717
584 718
585=back 719=back
586 720
593=head1 ANATOMY AND LIFETIME OF AN EIO REQUEST 727=head1 ANATOMY AND LIFETIME OF AN EIO REQUEST
594 728
595A request is represented by a structure of type C<eio_req>. To initialise 729A request is represented by a structure of type C<eio_req>. To initialise
596it, clear it to all zero bytes: 730it, clear it to all zero bytes:
597 731
598 eio_req req; 732 eio_req req;
599 733
600 memset (&req, 0, sizeof (req)); 734 memset (&req, 0, sizeof (req));
601 735
602A more common way to initialise a new C<eio_req> is to use C<calloc>: 736A more common way to initialise a new C<eio_req> is to use C<calloc>:
603 737
604 eio_req *req = calloc (1, sizeof (*req)); 738 eio_req *req = calloc (1, sizeof (*req));
605 739
606In either case, libeio neither allocates, initialises nor frees the 740In either case, libeio neither allocates, initialises nor frees the
607C<eio_req> structure for you - it merely uses it. 741C<eio_req> structure for you - it merely uses it.
608 742
609zero 743zero
627for example, in interactive programs, you might want to limit this time to 761for example, in interactive programs, you might want to limit this time to
628C<0.01> seconds or so. 762C<0.01> seconds or so.
629 763
630Note that: 764Note that:
631 765
766=over 4
767
632a) libeio doesn't know how long your request callbacks take, so the time 768=item a) libeio doesn't know how long your request callbacks take, so the
633spent in C<eio_poll> is up to one callback invocation longer than this 769time spent in C<eio_poll> is up to one callback invocation longer than
634interval. 770this interval.
635 771
636b) this is implemented by calling C<gettimeofday> after each request, 772=item b) this is implemented by calling C<gettimeofday> after each
637which can be costly. 773request, which can be costly.
638 774
639c) at least one request will be handled. 775=item c) at least one request will be handled.
776
777=back
640 778
641=item eio_set_max_poll_reqs (unsigned int nreqs) 779=item eio_set_max_poll_reqs (unsigned int nreqs)
642 780
643When C<nreqs> is non-zero, then C<eio_poll> will not handle more than 781When C<nreqs> is non-zero, then C<eio_poll> will not handle more than
644C<nreqs> requests per invocation. This is a less costly way to limit the 782C<nreqs> requests per invocation. This is a less costly way to limit the
