
Comparing libeio/eio.pod (file contents):
Revision 1.14 by root, Tue Jul 5 14:02:15 2011 UTC vs.
Revision 1.26 by root, Mon Jul 18 02:59:58 2011 UTC

Unlike the name component C<stamp> might indicate, it is also used for
time differences throughout libeio.

=head2 FORK SUPPORT

Usage of pthreads in a program changes the semantics of fork
considerably. Specifically, only async-safe functions can be called after
fork. Libeio uses pthreads, so this applies, and makes using fork hard for
anything but relatively simple fork + exec uses.

This library only works in the process that initialised it: Forking is
fully supported, but using libeio in any other process than the one that
called C<eio_init> is not.

You might get around this by not I<using> libeio before (or after)
forking in the parent, and using it in the child afterwards. You could
also try to call the L<eio_init> function again in the child, which will
brutally reinitialise all data structures, which isn't POSIX conformant,
but typically works.

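A minimal sketch of that "reinitialise in the child" approach (the
C<want_poll>/C<done_poll> names stand for your own callbacks, and the use
of C<pthread_atfork> is just one way to arrange it - as said above, this
is not POSIX conformant):

   #include <pthread.h>

   static void
   reinit_eio_in_child (void)
   {
     /* brutally reinitialise libeio in the freshly forked child */
     eio_init (want_poll, done_poll);
   }

   /* somewhere during start-up, after the first eio_init: */
   pthread_atfork (0, 0, reinit_eio_in_child);
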
Otherwise, the only recommendation you should follow is: treat fork code
the same way you treat signal handlers, and only ever call C<eio_init> in
the process that uses it, and only once ever.

=head1 INITIALISATION/INTEGRATION

Before you can call any eio functions you first have to initialise the
library. The library integrates into any event loop, but can also be used

This function initialises the library. On success it returns C<0>, on
failure it returns C<-1> and sets C<errno> appropriately.

It accepts two function pointers specifying callbacks as argument, both of
which can be C<0>, in which case the callback isn't called.

There is currently no way to change these callbacks later, or to
"uninitialise" the library again.

=item want_poll callback

The C<want_poll> callback is invoked whenever libeio wants attention (i.e.
it wants to be polled by calling C<eio_poll>). It is "edge-triggered",

=back

For libev, you would typically use an C<ev_async> watcher: the
C<want_poll> callback would invoke C<ev_async_send> to wake up the event
loop. Inside the callback set for the watcher, one would call C<eio_poll
()>.

If C<eio_poll ()> is configured to not handle all results in one go
(i.e. it returns C<-1>) then you should start an idle watcher that calls
C<eio_poll> until it returns something C<!= -1>.

A full-featured connector between libeio and libev would look as follows
(if C<eio_poll> is handling all requests, it can of course be simplified a
lot by removing the idle watcher logic):

   static struct ev_loop *loop;
   static ev_idle repeat_watcher;
   static ev_async ready_watcher;

   /* idle watcher callback, only used when eio_poll */
   /* didn't handle all results in one call */
   static void
   repeat (EV_P_ ev_idle *w, int revents)
   {
     if (eio_poll () != -1)
       ev_idle_stop (EV_A_ w);
   }

   /* eio has some results, process them */
   static void
   ready (EV_P_ ev_async *w, int revents)
   {
     if (eio_poll () == -1)
       ev_idle_start (EV_A_ &repeat_watcher);
   }

   /* wake up the event loop */
   static void
   want_poll (void)
   {
     ev_async_send (loop, &ready_watcher);
   }

   void
   my_init_eio ()
   {
     loop = EV_DEFAULT;

     ev_idle_init (&repeat_watcher, repeat);
     ev_async_init (&ready_watcher, ready);
     ev_async_start (loop, &ready_watcher);

     eio_init (want_poll, 0);
   }

For most other event loops, you would typically use a pipe - the event
loop should be told to wait for read readiness on the read end. In
C<want_poll> you would write a single byte, in C<done_poll> you would try
to read that byte, and in the callback for the read end, you would call
C<eio_poll>.

You don't have to take special care in the case C<eio_poll> doesn't handle
all requests, as the done callback will not be invoked, so the event loop
will still signal readiness for the pipe until I<all> results have been
processed.

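A minimal sketch of such a pipe-based integration (the C<my_pipe> array
and the C<my_init_eio_pipe> name are illustrative only; C<pipe>, C<read>
and C<write> need C<< <unistd.h> >>):

   static int my_pipe [2]; /* [0] = read end, [1] = write end */

   static void
   want_poll (void)
   {
     char c = 0;
     write (my_pipe [1], &c, 1); /* wake up the event loop */
   }

   static void
   done_poll (void)
   {
     char c;
     read (my_pipe [0], &c, 1); /* all results handled, clear readiness */
   }

   void
   my_init_eio_pipe (void)
   {
     pipe (my_pipe);
     eio_init (want_poll, done_poll);

     /* now tell the event loop to watch my_pipe [0] for readability */
     /* and to call eio_poll () whenever it becomes readable */
   }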

=head1 HIGH LEVEL REQUEST API

Libeio has both a high-level API, which consists of calling a request

   }

   /* the first three arguments are passed to open(2) */
   /* the remaining are priority, callback and data */
   if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
     abort (); /* something went wrong, we will all die!!! */

Note that you additionally need to call C<eio_poll> when the C<want_poll>
callback indicates that requests are ready to be processed.

=head2 CANCELLING REQUESTS

Sometimes the need for a request goes away before the request is
finished. In that case, one can cancel the request by a call to
C<eio_cancel>:

=over 4

=item eio_cancel (eio_req *req)

Cancel the request (and all its subrequests). If the request is currently
executing it might still continue to execute, and in other cases it might
still take a while till the request is cancelled.
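
For example (a sketch - C<file_open_done> is a made-up callback name, and
the C<eio_req *> must still be outstanding, i.e. its callback must not
have been invoked yet):

   eio_req *req = eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0);

   /* ... later, the result is no longer needed ... */
   if (req)
     eio_cancel (req);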

Even if cancelled, the finish callback will still be invoked - the
callbacks of all cancellable requests need to check whether the request
has been cancelled by calling C<EIO_CANCELLED (req)>:

   static int
   my_eio_cb (eio_req *req)
   {
     if (EIO_CANCELLED (req))
       return 0;

     /* not cancelled - process the result as usual */
     return 0;
   }

In addition, cancelled requests will I<either> have C<< req->result >>
set to C<-1> and C<errno> to C<ECANCELED>, or I<otherwise> they were
successfully executed, despite being cancelled (e.g. when they have
already been executed at the time they were cancelled).

C<EIO_CANCELLED> is still true for requests that have successfully
executed, as long as C<eio_cancel> was called on them at some point.

=back

=head2 AVAILABLE REQUESTS

The following request functions are available. I<All> of them return the
C<eio_req *> on success and C<0> on failure, and I<all> of them have the

     free (target);
   }

=item eio_realpath (const char *path, int pri, eio_cb cb, void *data)

Similar to the realpath libc function, but unlike that one, C<<
req->result >> is C<-1> on failure. On success, the result is the length
of the returned path in C<ptr2> (which is I<NOT> 0-terminated) - this is
similar to readlink.

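For example, a callback could print the resolved path like this (a
sketch - C<realpath_done> is a made-up name, and remember that the path
in C<ptr2> is not 0-terminated):

   static int
   realpath_done (eio_req *req)
   {
     if (req->result >= 0)
       printf ("resolved to %.*s\n", (int)req->result, (char *)req->ptr2);

     return 0;
   }
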
=item eio_stat (const char *path, int pri, eio_cb cb, void *data)

=item eio_lstat (const char *path, int pri, eio_cb cb, void *data)

=item eio_fstat (int fd, int pri, eio_cb cb, void *data)

Stats a file - if C<< req->result >> indicates success, then you can
access the C<struct stat>-like structure via C<< req->ptr2 >>:

   EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;

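A complete callback could look like this (a sketch - C<stat_done> is a
made-up name, and error reporting is reduced to a single comment):

   static int
   stat_done (eio_req *req)
   {
     if (req->result < 0)
       return 0; /* stat failed, the error code is in req->errorno */

     EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;
     printf ("size is %lld bytes\n", (long long)statdata->st_size);

     return 0;
   }

   /* elsewhere: */
   eio_stat ("/etc/passwd", 0, stat_done, 0);
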
=item eio_statvfs (const char *path, int pri, eio_cb cb, void *data)

=item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data)

Stats a filesystem - if C<< req->result >> indicates success, then you can
access the C<struct statvfs>-like structure via C<< req->ptr2 >>:

   EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;

=back

=head3 READING DIRECTORIES

Reading directories sounds simple, but can be rather demanding, especially
if you want to do stuff such as traversing a directory hierarchy or
processing all files in a directory. Libeio can assist these complex tasks
with its C<eio_readdir> call.

=over 4

=item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data)

If this flag is specified, then, in addition to the names in C<ptr2>,
also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct
eio_dirent> looks like this:

   struct eio_dirent
   {
     int nameofs;            /* offset of null-terminated name string in (char *)req->ptr2 */
     unsigned short namelen; /* size of filename without trailing 0 */
     unsigned char type;     /* one of EIO_DT_* */
     signed char score;      /* internal use */
     ino_t inode;            /* the inode number, if available, otherwise unspecified */
   };

The only members you normally would access are C<nameofs>, which is the
byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>.

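Iterating over the returned entries could then look like this (a sketch -
C<readdir_done> is a made-up name, and it assumes that on success C<<
req->result >> holds the number of entries returned):

   static int
   readdir_done (eio_req *req)
   {
     if (req->result < 0)
       return 0;

     struct eio_dirent *ents = (struct eio_dirent *)req->ptr1;
     char *names = (char *)req->ptr2;
     int i;

     for (i = 0; i < req->result; ++i)
       printf ("%s (type %d)\n", names + ents [i].nameofs, ents [i].type);

     return 0;
   }
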
C<type> can be one of:

When this flag is specified, then the names will be returned in an order
suitable for stat()'ing each one. That is, when you plan to stat()
all files in the given directory, then the returned order will likely
be fastest.

If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then the
likely directories come first, resulting in a less optimal stat order.

=item EIO_READDIR_FOUND_UNKNOWN

This flag should not be specified when calling C<eio_readdir>. Instead,
it is being set by C<eio_readdir> (you can access the C<flags> via C<<
req->int1 >>), when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The
absence of this flag therefore indicates that all C<type>'s are known,
which can be used to speed up some algorithms.

A typical use case would be to identify all subdirectories within a
directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If
then this flag is I<NOT> set, then all the entries at the beginning of the

as calling C<fdatasync>.

Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>,
C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>.

=item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data)

Calls C<fallocate> (note: I<NOT> C<posix_fallocate>!). If the syscall is
missing, then it returns failure and sets C<errno> to C<ENOSYS>.

The C<mode> argument can be C<0> (for behaviour similar to
C<posix_fallocate>), or C<EIO_FALLOC_FL_KEEP_SIZE>, which keeps the size
of the file unchanged (but still preallocates space beyond end of file).

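For example, to preallocate 16 MiB at the start of a file without
changing its visible size (a sketch - C<fallocate_done> is a made-up
callback name):

   eio_fallocate (fd, EIO_FALLOC_FL_KEEP_SIZE, 0, 16 * 1024 * 1024,
                  0, fallocate_done, 0);
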
=back

=head3 LIBEIO-SPECIFIC REQUESTS

These requests are specific to libeio and do not correspond to any OS call.

   eio_custom (my_open, 0, my_open_done, "/etc/passwd");

=item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)

This is a request that takes C<delay> seconds to execute, but otherwise
does nothing - it simply puts one of the worker threads to sleep for this
long.

This request can be used to artificially increase load, e.g. for debugging
or benchmarking reasons.

There are two primary use cases for this: a) bundle many requests into a
single, composite, request with a definite callback and the ability to
cancel the whole request with its subrequests and b) limiting the number
of "active" requests.

Further below you will find more discussion of these topics - first
follows the reference section detailing the request generator and other
methods.

=over 4

=item eio_req *grp = eio_grp (eio_cb cb, void *data)

Creates, submits and returns a group request. Note that it doesn't have a
priority, unlike all other requests.

=item eio_grp_add (eio_req *grp, eio_req *req)

Adds a request to the request group.

=item eio_grp_cancel (eio_req *grp)

Cancels all requests I<in> the group, but I<not> the group request
itself. You can cancel the group request I<and> all subrequests via a
normal C<eio_cancel> call.

=back

=head4 GROUP REQUEST LIFETIME

Left alone, a group request will instantly move to the pending state and
will be finished at the next call of C<eio_poll>.

The usefulness stems from the fact that, if a subrequest is added to a
group I<before> a call to C<eio_poll>, via C<eio_grp_add>, then the group
will not finish until all the subrequests have finished.

So the usage cycle of a group request is like this: after it is created,
you normally instantly add a subrequest. If none is added, the group
request will finish on its own. As long as subrequests are added before
the group request is finished it will be kept from finishing; that is,
the callbacks of any subrequests can, in turn, add more requests to the
group, and as long as any requests are active, the group request itself
will not finish.

=head4 CREATING COMPOSITE REQUESTS

Imagine you wanted to create an C<eio_load> request that opens a file,
reads it and closes it. This means it has to execute at least three eio
requests, but for various reasons it might be nice if that request looked
like any other eio request.

This can be done with groups (a code sketch follows after this list):

=over 4

=item 1) create the request object

Create a group that contains all further requests. This is the request you
can return as "the load request".

=item 2) open the file, maybe

Next, open the file with C<eio_open> and add the request to the group
request and you are finished setting up the request.

If, for some reason, you cannot C<eio_open> (path is a null ptr?) you
can set C<< grp->result >> to C<-1> to signal an error and let the group
request finish on its own.

=item 3) open callback adds more requests

In the open callback, if the open was not successful, copy C<<
req->errorno >> to C<< grp->errorno >> and set C<< grp->result >> to
C<-1> to signal an error.

Otherwise, malloc some memory or so and issue a read request, adding the
read request to the group.

=item 4) continue issuing requests till finished

In the read callback, check for errors and possibly continue with
C<eio_close> or any other eio request in the same way.

As soon as no new requests are added the group request will finish. Make
sure you I<always> set C<< grp->result >> to some sensible value.

=back
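
Putting the steps together, a stripped-down C<eio_load> could look like
the following sketch. All names (C<eio_load>, C<load_state>, C<MAX_SIZE>)
are made up for this example, the buffer handling is simplistic, and
error checking of the request-creation calls is omitted:

   #define MAX_SIZE 65536 /* read at most this many bytes */

   struct load_state
   {
     eio_req *grp; /* the group request handed back to the caller */
     char *buf;    /* malloc'ed buffer receiving the file contents */
     int fd;
   };

   static int
   load_read_done (eio_req *req)
   {
     struct load_state *st = (struct load_state *)req->data;

     if (req->result < 0)
       {
         st->grp->errorno = req->errorno;
         st->grp->result = -1;
       }
     else
       st->grp->result = req->result; /* number of bytes read */

     /* close the fd as the last subrequest of the group */
     eio_grp_add (st->grp, eio_close (st->fd, 0, 0, st));

     return 0;
   }

   static int
   load_open_done (eio_req *req)
   {
     struct load_state *st = (struct load_state *)req->data;

     if (req->result < 0)
       {
         st->grp->errorno = req->errorno;
         st->grp->result = -1;
         return 0; /* no subrequest added - the group finishes on its own */
       }

     st->fd = req->result;
     st->buf = malloc (MAX_SIZE);

     eio_grp_add (st->grp,
                  eio_read (st->fd, st->buf, MAX_SIZE, 0, 0, load_read_done, st));

     return 0;
   }

   /* the composite request: open + read + close bundled into one group */
   eio_req *
   eio_load (const char *path, struct load_state *st, eio_cb cb, void *data)
   {
     st->grp = eio_grp (cb, data);

     eio_grp_add (st->grp,
                  eio_open (path, O_RDONLY, 0, 0, load_open_done, st));

     return st->grp;
   }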

=head4 REQUEST LIMITING

#TODO

   void eio_grp_limit (eio_req *grp, int limit);
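
This part of the API is not documented here yet. As a rough illustration
of the intended use (treat the behaviour described here as an assumption,
and the C<feed_stats>, C<stat_done>, C<all_stats_done> and
C<next_path_to_stat> names as made up): together with C<eio_grp_feed
(eio_req *grp, void (*feed)(eio_req *req), int limit)>, a group can keep
a bounded number of subrequests in flight - the feeder callback is
invoked to add more subrequests whenever the group runs low:

   static void
   feed_stats (eio_req *grp)
   {
     const char *path = next_path_to_stat (); /* made-up source of paths */

     if (path)
       eio_grp_add (grp, eio_stat (path, 0, stat_done, 0));
   }

   eio_req *grp = eio_grp (all_stats_done, 0);
   eio_grp_feed (grp, feed_stats, 4); /* keep at most ~4 stats in flight */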

=back

=head1 ANATOMY AND LIFETIME OF AN EIO REQUEST

A request is represented by a structure of type C<eio_req>. To initialise
it, clear it to all zero bytes:

   eio_req req;

   memset (&req, 0, sizeof (req));

A more common way to initialise a new C<eio_req> is to use C<calloc>:

   eio_req *req = calloc (1, sizeof (*req));

In either case, libeio neither allocates, initialises or frees the
C<eio_req> structure for you - it merely uses it.

for example, in interactive programs, you might want to limit this time to
C<0.01> seconds or so.

Note that:

=over 4

=item a) libeio doesn't know how long your request callbacks take, so the
time spent in C<eio_poll> is up to one callback invocation longer than
this interval.

=item b) this is implemented by calling C<gettimeofday> after each
request, which can be costly.

=item c) at least one request will be handled.

=back
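
A typical call could look like this (a sketch; C<0.01> is the value
suggested above):

   eio_set_max_poll_time (0.01); /* spend at most ~10ms per eio_poll call */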

=item eio_set_max_poll_reqs (unsigned int nreqs)

When C<nreqs> is non-zero, then C<eio_poll> will not handle more than
C<nreqs> requests per invocation. This is a less costly way to limit the

This symbol governs the stack size for each eio thread. Libeio itself
was written to use very little stackspace, but when using C<EIO_CUSTOM>
requests, you might want to increase this.

If this symbol is undefined (the default) then libeio will use its default
stack size (C<sizeof (void *) * 4096> currently). If it is defined, but
C<0>, then the default operating system stack size will be used. In all
other cases, the value must be an expression that evaluates to the desired
stack size.

=back
