/cvs/libeio/eio.pod
Revision: 1.35
Committed: Mon Aug 18 08:11:54 2014 UTC (9 years, 9 months ago) by root
Branch: MAIN
CVS Tags: rel-4_32, rel-4_33
Changes since 1.34: +1 -15 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     libeio - truly asynchronous POSIX I/O
4    
5     =head1 SYNOPSIS
6    
7     #include <eio.h>
8    
9     =head1 DESCRIPTION
10    
11     The newest version of this document is also available as an html-formatted
12     web page you might find easier to navigate when reading it for the first
13     time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>.
14    
15     Note that this library is a by-product of the C<IO::AIO> perl
16 sf-exg 1.6 module, and many of the subtler points regarding request lifetime
17 root 1.1 and so on are only documented in its documentation at the
18     moment: L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>.
19    
20     =head2 FEATURES
21    
22     This library provides fully asynchronous versions of most POSIX functions
23 sf-exg 1.6 dealing with I/O. Unlike most asynchronous libraries, this not only
24 root 1.1 includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and
25     similar functions, as well as less commonly used ones such as C<mknod>, C<futime>
26     or C<readlink>.
27    
28     It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and
29     FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with
30 root 1.33 emulation elsewhere).
31 root 1.1
32 root 1.5 The goal is to enable you to write fully non-blocking programs. For
33 root 1.1 example, in a game server, you would not want to freeze for a few seconds
34     just because the server is running a backup and you happen to call
35     C<readdir>.
36    
37     =head2 TIME REPRESENTATION
38    
39     Libeio represents time as a single floating point number, representing the
40     (fractional) number of seconds since the (POSIX) epoch (somewhere near
41     the beginning of 1970, details are complicated, don't ask). This type is
42 sf-exg 1.6 called C<eio_tstamp>, but it is guaranteed to be of type C<double> (or
43 root 1.1 better), so you can freely use C<double> yourself.
44    
45     Unlike the name component C<stamp> might indicate, it is also used for
46     time differences throughout libeio.
47    
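As a rough illustration of this representation, the following sketch
converts between C<struct timespec> and fractional seconds stored in a
C<double>. The C<tstamp> typedef and both helper names are stand-ins
invented for this example so that it compiles without libeio; with the real
header you would use C<eio_tstamp> instead. It assumes non-negative
timestamps.

```c
#include <time.h>

/* stand-in for eio_tstamp, which the text above says is at least a double */
typedef double tstamp;

/* seconds + nanoseconds -> fractional seconds since the epoch */
static tstamp
tstamp_from_timespec (struct timespec ts)
{
  return (tstamp)ts.tv_sec + (tstamp)ts.tv_nsec / 1e9;
}

/* fractional seconds -> seconds + nanoseconds (assumes t >= 0) */
static struct timespec
timespec_from_tstamp (tstamp t)
{
  struct timespec ts;

  ts.tv_sec  = (time_t)t;                              /* truncates the fraction */
  ts.tv_nsec = (long)((t - (tstamp)ts.tv_sec) * 1e9);  /* remaining fraction */

  return ts;
}
```

Because the same type is used for time differences, a delay of half a
second is simply the value C<0.5>.
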
48     =head2 FORK SUPPORT
49    
50 root 1.26 Usage of pthreads in a program changes the semantics of fork
51     considerably. Specifically, only async-safe functions can be called after
52     fork. Libeio uses pthreads, so this applies, and makes using fork hard for
53     anything but relatively simple fork + exec uses.
54    
55     This library only works in the process that initialised it: Forking is
56     fully supported, but using libeio in any other process than the one that
57     called C<eio_init> is not.
58    
59     You might get around this by not I<using> libeio before (or after) forking in
60     the parent, and using it in the child afterwards. You could also try to
61     call the C<eio_init> function again in the child, which will brutally
62     reinitialise all data structures, which isn't POSIX conformant, but
63     typically works.
64    
65     Otherwise, the only recommendation you should follow is: treat fork code
66     the same way you treat signal handlers, and only ever call C<eio_init> in
67     the process that uses it, and only once ever.
68 root 1.7
69 root 1.1 =head1 INITIALISATION/INTEGRATION
70    
71     Before you can call any eio functions you first have to initialise the
72     library. The library integrates into any event loop, but can also be used
73     without one, including in polling mode.
74    
75     You have to provide the necessary glue yourself, however.
76    
77     =over 4
78    
79     =item int eio_init (void (*want_poll)(void), void (*done_poll)(void))
80    
81     This function initialises the library. On success it returns C<0>, on
82     failure it returns C<-1> and sets C<errno> appropriately.
83    
84     It accepts two function pointers specifying callbacks as argument, both of
85     which can be C<0>, in which case the callback isn't called.
86    
87 root 1.26 There is currently no way to change these callbacks later, or to
88     "uninitialise" the library again.
89    
90 root 1.1 =item want_poll callback
91    
92     The C<want_poll> callback is invoked whenever libeio wants attention (i.e.
93     it wants to be polled by calling C<eio_poll>). It is "edge-triggered",
94     that is, it will only be called once when eio wants attention, until all
95     pending requests have been handled.
96    
97     This callback is called while locks are being held, so I<you must
98     not call any libeio functions inside this callback>. That includes
99     C<eio_poll>. What you should do is notify some other thread, or wake up
100     your event loop, and then call C<eio_poll>.
101    
102     =item done_poll callback
103    
104     This callback is invoked when libeio detects that all pending requests
105     have been handled. It is "edge-triggered", that is, it will only be
106     called once after C<want_poll>. To put it differently, C<want_poll> and
107     C<done_poll> are invoked in pairs: after C<want_poll> you have to call
108     C<eio_poll ()> until either C<eio_poll> indicates that everything has been
109     handled or C<done_poll> has been called, which signals the same.
110    
111     Note that C<eio_poll> might return after C<done_poll> and C<want_poll>
112     have been called again, so watch out for races in your code.
113    
114 sf-exg 1.6 As with C<want_poll>, this callback is called while locks are being held,
115 root 1.1 so you I<must not call any libeio functions from within this callback>.
116    
117     =item int eio_poll ()
118    
119     This function has to be called whenever there are pending requests that
120     need finishing. You usually call this after C<want_poll> has indicated
121     that you should do so, but you can also call this function regularly to
122     poll for new results.
123    
124     If any request invocation returns a non-zero value, then C<eio_poll ()>
125     immediately returns with that value as return value.
126    
127     Otherwise, if all requests could be handled, it returns C<0>. If for some
128     reason not all requests have been handled, i.e. some are still pending, it
129     returns C<-1>.
130    
131     =back
132    
133     For libev, you would typically use an C<ev_async> watcher: the
134     C<want_poll> callback would invoke C<ev_async_send> to wake up the event
135     loop. Inside the callback set for the watcher, one would call C<eio_poll
136 root 1.15 ()>.
137    
138     If C<eio_poll ()> is configured to not handle all results in one go
139     (i.e. it returns C<-1>) then you should start an idle watcher that calls
140     C<eio_poll> until it returns something C<!= -1>.
141    
142 sf-exg 1.20 A full-featured connector between libeio and libev would look as follows
143 root 1.16 (if C<eio_poll> is handling all requests, it can of course be simplified a
144     lot by removing the idle watcher logic):
145 root 1.15
   static struct ev_loop *loop;
   static ev_idle repeat_watcher;
   static ev_async ready_watcher;

   /* idle watcher callback, only used when eio_poll */
   /* didn't handle all results in one call */
   static void
   repeat (EV_P_ ev_idle *w, int revents)
   {
     if (eio_poll () != -1)
       ev_idle_stop (EV_A_ w);
   }

   /* eio has some results, process them */
   static void
   ready (EV_P_ ev_async *w, int revents)
   {
     if (eio_poll () == -1)
       ev_idle_start (EV_A_ &repeat_watcher);
   }

   /* wake up the event loop */
   static void
   want_poll (void)
   {
     ev_async_send (loop, &ready_watcher);
   }

   void
   my_init_eio ()
   {
     loop = EV_DEFAULT;

     ev_idle_init (&repeat_watcher, repeat);
     ev_async_init (&ready_watcher, ready);
     ev_async_start (loop, &ready_watcher);

     eio_init (want_poll, 0);
   }
185 root 1.1
186     For most other event loops, you would typically use a pipe - the event
187 sf-exg 1.6 loop should be told to wait for read readiness on the read end. In
188 root 1.1 C<want_poll> you would write a single byte, in C<done_poll> you would try
189     to read that byte, and in the callback for the read end, you would call
190 root 1.16 C<eio_poll>.
191    
192     You don't have to take special care in the case C<eio_poll> doesn't handle
193     all requests, as the done callback will not be invoked, so the event loop
194 root 1.18 will still signal readiness for the pipe until I<all> results have been
195 root 1.16 processed.
196 root 1.1
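The pipe approach described above can be sketched with plain POSIX calls.
This is only an illustration under stated assumptions: C<wake_fd>,
C<wake_pipe_init> and C<wake_pipe_readable> are names made up here, and the
libeio calls themselves are left out so that the sketch compiles on its
own. Both callbacks only touch the pipe and never call into libeio, as
required for code that runs while libeio holds its locks.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int wake_fd[2] = { -1, -1 }; /* [0] = read end, [1] = write end */

/* create the pipe and make both ends non-blocking; returns 0 or -1 */
static int
wake_pipe_init (void)
{
  int i;

  if (pipe (wake_fd) < 0)
    return -1;

  for (i = 0; i < 2; ++i)
    {
      int fl = fcntl (wake_fd[i], F_GETFL);

      if (fl < 0 || fcntl (wake_fd[i], F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    }

  return 0;
}

/* would be passed as want_poll: make the read end readable */
static void
want_poll (void)
{
  char dummy = 0;

  if (write (wake_fd[1], &dummy, 1) < 0)
    {
      /* a full pipe already means a wake-up is pending */
    }
}

/* would be passed as done_poll: consume the byte again */
static void
done_poll (void)
{
  char dummy;

  if (read (wake_fd[0], &dummy, 1) < 0)
    {
      /* nothing left to drain */
    }
}

/* stands in for the event loop: is the read end readable right now? */
static int
wake_pipe_readable (void)
{
  struct pollfd pfd;

  pfd.fd = wake_fd[0];
  pfd.events = POLLIN;
  pfd.revents = 0;

  return poll (&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) != 0;
}
```

In a real program you would call C<wake_pipe_init> first, then C<eio_init
(want_poll, done_poll)>, register the read end with your event loop, and
call C<eio_poll> from the read-readiness handler.
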
197    
198 root 1.7 =head1 HIGH LEVEL REQUEST API
199    
200     Libeio has both a high-level API, which consists of calling a request
201     function with a callback to be called on completion, and a low-level API
202     where you fill out request structures and submit them.
203    
204     This section describes the high-level API.
205    
206     =head2 REQUEST SUBMISSION AND RESULT PROCESSING
207    
208     You submit a request by calling the relevant C<eio_TYPE> function with the
209     required parameters, a callback of type C<int (*eio_cb)(eio_req *req)>
210     (called C<eio_cb> below) and a freely usable C<void *data> argument.
211    
212 root 1.12 The return value will either be 0, in case something went really wrong
213     (which can basically only happen on very fatal errors, such as C<malloc>
214     returning 0, which is rather unlikely), or a pointer to the newly-created
215     and submitted C<eio_req *>.
216 root 1.7
217     The callback will be called with an C<eio_req *> which contains the
218     results of the request. The members you can access inside that structure
219     vary from request to request, except for:
220    
221     =over 4
222    
223     =item C<ssize_t result>
224    
225     This contains the result value from the call (usually the same as the
226     syscall of the same name).
227    
228     =item C<int errorno>
229    
230     This contains the value of C<errno> after the call.
231    
232     =item C<void *data>
233    
234     The C<void *data> member simply stores the value of the C<data> argument.
235    
236     =back
237    
238 sf-exg 1.29 Members not explicitly described as accessible must not be
239     accessed. Specifically, there is no guarantee that any members will still
240 root 1.28 have the value they had when the request was submitted.
241    
242 root 1.7 The return value of the callback is normally C<0>, which tells libeio to
243     continue normally. If a callback returns a nonzero value, libeio will
244     stop processing results (in C<eio_poll>) and will return the value to its
245     caller.
246    
247 root 1.28 Memory areas passed to libeio wrappers must stay valid as long as a
248     request executes, with the exception of paths, which are being copied
249 root 1.7 internally. Any memory libeio itself allocates will be freed after the
250     finish callback has been called. If you want to manage all memory passed
251     to libeio yourself you can use the low-level API.
252    
253     For example, to open a file, you could do this:
254    
   static int
   file_open_done (eio_req *req)
   {
     if (req->result < 0)
       {
         /* open() returned -1 */
         errno = req->errorno;
         perror ("open");
       }
     else
       {
         int fd = req->result;
         /* now we have the new fd in fd */
       }

     return 0;
   }

   /* the first three arguments are passed to open(2) */
   /* the remaining are priority, callback and data */
   if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
     abort (); /* something went wrong, we will all die!!! */
277 root 1.7
278     Note that you additionally need to call C<eio_poll> when the C<want_poll>
279     callback indicates that requests are ready to be processed.
280    
281 root 1.17 =head2 CANCELLING REQUESTS
282    
283     Sometimes the need for a request goes away before the request is
284 root 1.18 finished. In that case, one can cancel the request by a call to
285 root 1.17 C<eio_cancel>:
286    
287     =over 4
288    
289     =item eio_cancel (eio_req *req)
290    
291 root 1.19 Cancel the request (and all its subrequests). If the request is currently
292 root 1.18 executing it might still continue to execute, and in other cases it might
293     still take a while till the request is cancelled.
294 root 1.17
295 root 1.35 When cancelled, the finish callback will not be invoked.
296 root 1.18
297     C<EIO_CANCELLED> is still true for requests that have successfully
298     executed, as long as C<eio_cancel> was called on them at some point.
299 root 1.17
300     =back
301    
302 root 1.7 =head2 AVAILABLE REQUESTS
303    
304     The following request functions are available. I<All> of them return the
305     C<eio_req *> on success and C<0> on failure, and I<all> of them have the
306     same three trailing arguments: C<pri>, C<cb> and C<data>. The C<cb> is
307     mandatory, but in most cases, you pass in C<0> as C<pri> and C<0> or some
308     custom data value as C<data>.
309    
310     =head3 POSIX API WRAPPERS
311    
312     These requests simply wrap the POSIX call of the same name, with the same
313 root 1.11 arguments. If a function is not implemented by the OS and cannot be emulated
314 root 1.10 in some way, then all of these return C<-1> and set C<errorno> to C<ENOSYS>.
315 root 1.7
316     =over 4
317    
318     =item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data)
319    
320     =item eio_truncate (const char *path, off_t offset, int pri, eio_cb cb, void *data)
321    
322     =item eio_chown (const char *path, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)
323    
324     =item eio_chmod (const char *path, mode_t mode, int pri, eio_cb cb, void *data)
325    
326     =item eio_mkdir (const char *path, mode_t mode, int pri, eio_cb cb, void *data)
327    
328     =item eio_rmdir (const char *path, int pri, eio_cb cb, void *data)
329    
330     =item eio_unlink (const char *path, int pri, eio_cb cb, void *data)
331    
332 root 1.10 =item eio_utime (const char *path, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)
333 root 1.7
334     =item eio_mknod (const char *path, mode_t mode, dev_t dev, int pri, eio_cb cb, void *data)
335    
336     =item eio_link (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
337    
338     =item eio_symlink (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
339    
340     =item eio_rename (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
341    
342     =item eio_mlock (void *addr, size_t length, int pri, eio_cb cb, void *data)
343    
344     =item eio_close (int fd, int pri, eio_cb cb, void *data)
345    
346     =item eio_sync (int pri, eio_cb cb, void *data)
347    
348     =item eio_fsync (int fd, int pri, eio_cb cb, void *data)
349    
350     =item eio_fdatasync (int fd, int pri, eio_cb cb, void *data)
351    
352     =item eio_futime (int fd, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)
353    
354     =item eio_ftruncate (int fd, off_t offset, int pri, eio_cb cb, void *data)
355    
356     =item eio_fchmod (int fd, mode_t mode, int pri, eio_cb cb, void *data)
357    
358     =item eio_fchown (int fd, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)
359    
360     =item eio_dup2 (int fd, int fd2, int pri, eio_cb cb, void *data)
361    
362     These have the same semantics as the syscall of the same name, their
363     return value is available as C<< req->result >> later.
364    
365     =item eio_read (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)
366    
367     =item eio_write (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)
368    
369     These two requests are called C<read> and C<write>, but actually wrap
370     C<pread> and C<pwrite>. On systems that lack these calls (such as cygwin),
371     libeio uses lseek/read_or_write/lseek and a mutex to serialise the
372     requests, so all these requests run serially and do not disturb each
373     other. However, they still disturb the file offset while they run, so it's
374     not safe to call these functions concurrently with non-libeio functions on
375     the same fd on these systems.
376    
377     Not surprisingly, pread and pwrite are not thread-safe on Darwin (OS/X),
378     so it is advised not to submit multiple requests on the same fd on this
379     horrible pile of garbage.
380    
381 root 1.10 =item eio_mlockall (int flags, int pri, eio_cb cb, void *data)
382    
383     Like C<mlockall>, but the flag value constants are called
384     C<EIO_MCL_CURRENT> and C<EIO_MCL_FUTURE>.
385    
386     =item eio_msync (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)
387    
388     Just like msync, except that the flag values are called C<EIO_MS_ASYNC>,
389     C<EIO_MS_INVALIDATE> and C<EIO_MS_SYNC>.
390    
391     =item eio_readlink (const char *path, int pri, eio_cb cb, void *data)
392    
393     If successful, the path read by C<readlink(2)> can be accessed via C<<
394     req->ptr2 >> and is I<NOT> null-terminated, with the length specified as
395     C<< req->result >>.
396    
   if (req->result >= 0)
     {
       char *target = strndup ((char *)req->ptr2, req->result);

       free (target);
     }
403    
404 root 1.13 =item eio_realpath (const char *path, int pri, eio_cb cb, void *data)
405    
406 root 1.22 Similar to the realpath libc function, but unlike that one, C<<
407     req->result >> is C<-1> on failure. On success, the result is the length
408     of the returned path in C<ptr2> (which is I<NOT> 0-terminated) - this is
409     similar to readlink.
410 root 1.13
411 root 1.10 =item eio_stat (const char *path, int pri, eio_cb cb, void *data)
412    
413     =item eio_lstat (const char *path, int pri, eio_cb cb, void *data)
414    
415 root 1.7 =item eio_fstat (int fd, int pri, eio_cb cb, void *data)
416    
417     Stats a file - if C<< req->result >> indicates success, then you can
418     access the C<struct stat>-like structure via C<< req->ptr2 >>:
419    
   EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;
421 root 1.7
422 root 1.10 =item eio_statvfs (const char *path, int pri, eio_cb cb, void *data)
423    
424     =item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data)
425 root 1.7
426     Stats a filesystem - if C<< req->result >> indicates success, then you can
427     access the C<struct statvfs>-like structure via C<< req->ptr2 >>:
428    
   EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;
430 root 1.7
431     =back
432    
433     =head3 READING DIRECTORIES
434    
435     Reading directories sounds simple, but can be rather demanding, especially
436 root 1.18 if you want to do stuff such as traversing a directory hierarchy or
437     processing all files in a directory. Libeio can assist these complex tasks
438 root 1.7 with its C<eio_readdir> call.
439    
440     =over 4
441    
442     =item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data)
443    
444     This is a very complex call. It basically reads through a whole directory
445     (via the C<opendir>, C<readdir> and C<closedir> calls) and returns either
446     the names or an array of C<struct eio_dirent>, depending on the C<flags>
447     argument.
448    
449     The C<< req->result >> indicates either the number of files found, or
450 root 1.10 C<-1> on error. On success, null-terminated names can be found via C<< req->ptr2 >>,
451 root 1.7 and C<struct eio_dirent>s, if requested by C<flags>, can be found via C<<
452     req->ptr1 >>.
453    
454     Here is an example that prints all the names:
455    
   int i;
   char *names = (char *)req->ptr2;

   for (i = 0; i < req->result; ++i)
     {
       printf ("name #%d: %s\n", i, names);

       /* move to next name */
       names += strlen (names) + 1;
     }
466    
467     Pseudo-entries such as F<.> and F<..> are never returned by C<eio_readdir>.
468    
469     C<flags> can be any combination of:
470    
471     =over 4
472    
473     =item EIO_READDIR_DENTS
474    
475     If this flag is specified, then, in addition to the names in C<ptr2>,
476     also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct
477     eio_dirent> looks like this:
478    
   struct eio_dirent
   {
     int nameofs;            /* offset of null-terminated name string in (char *)req->ptr2 */
     unsigned short namelen; /* size of filename without trailing 0 */
     unsigned char type;     /* one of EIO_DT_* */
     signed char score;      /* internal use */
     ino_t inode;            /* the inode number, if available, otherwise unspecified */
   };
487 root 1.7
488     The only members you normally would access are C<nameofs>, which is the
489     byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>.
490    
491     C<type> can be one of:
492    
493     C<EIO_DT_UNKNOWN> - if the type is not known (very common) and you have to C<stat>
494     the name yourself if you need to know,
495     one of the "standard" POSIX file types (C<EIO_DT_REG>, C<EIO_DT_DIR>, C<EIO_DT_LNK>,
496     C<EIO_DT_FIFO>, C<EIO_DT_SOCK>, C<EIO_DT_CHR>, C<EIO_DT_BLK>)
497     or some OS-specific type (currently
498     C<EIO_DT_MPC> - multiplexed char device (v7+coherent),
499     C<EIO_DT_NAM> - xenix special named file,
500     C<EIO_DT_MPB> - multiplexed block device (v7+coherent),
501     C<EIO_DT_NWK> - HP-UX network special,
502     C<EIO_DT_CMP> - VxFS compressed,
503     C<EIO_DT_DOOR> - solaris door, or
504     C<EIO_DT_WHT>).
505    
506     This example prints all names and their type:
507    
   int i;
   struct eio_dirent *ents = (struct eio_dirent *)req->ptr1;
   char *names = (char *)req->ptr2;

   for (i = 0; i < req->result; ++i)
     {
       struct eio_dirent *ent = ents + i;
       char *name = names + ent->nameofs;

       printf ("name #%d: %s (type %d)\n", i, name, ent->type);
     }
519    
520     =item EIO_READDIR_DIRS_FIRST
521    
522     When this flag is specified, then the names will be returned in an order
523     where likely directories come first, in optimal C<stat> order. This is
524     useful when you need to quickly find directories, or you want to find all
525     directories while avoiding the need to stat() each entry.
526    
527     If the system returns type information in readdir, then this is used
528     to find directories directly. Otherwise, likely directories are names
529     beginning with ".", or otherwise names with no dots, of which names with
530     short names are tried first.
531    
532     =item EIO_READDIR_STAT_ORDER
533    
534     When this flag is specified, then the names will be returned in an order
535     suitable for stat()'ing each one. That is, when you plan to stat()
536     all files in the given directory, then the returned order will likely
537     be fastest.
538    
539 root 1.18 If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then the
540     likely directories come first, resulting in a less optimal stat order.
541 root 1.7
542     =item EIO_READDIR_FOUND_UNKNOWN
543    
544     This flag should not be specified when calling C<eio_readdir>. Instead,
545     it is being set by C<eio_readdir> (you can access the C<flags> via C<<
546     req->int1 >>) when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The
547 root 1.18 absence of this flag therefore indicates that all C<type>'s are known,
548 root 1.7 which can be used to speed up some algorithms.
549    
550     A typical use case would be to identify all subdirectories within a
551     directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If
552     then this flag is I<NOT> set, then all the entries at the beginning of the
553     returned array of type C<EIO_DT_DIR> are the directories. Otherwise, you
554     should start C<stat()>'ing the entries starting at the beginning of the
555     array, stopping as soon as you found all directories (the count can be
556     deduced by the link count of the directory).
557    
558     =back
559    
560     =back
561    
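The use case described under C<EIO_READDIR_FOUND_UNKNOWN> can be sketched
as follows. To stay compilable without libeio, the sketch uses a stand-in
struct and stand-in constants (C<my_dirent>, C<MY_DT_UNKNOWN>, C<MY_DT_DIR>
and C<MY_DT_REG>, all invented here) that mirror only the C<nameofs> and
C<type> members shown above; real code would use C<struct eio_dirent> and
the C<EIO_DT_*> values. The link count helper assumes the traditional Unix
convention that a directory has two links plus one per subdirectory, which
not every filesystem follows.

```c
#include <stddef.h>

/* stand-ins for the EIO_DT_* constants, the values are arbitrary */
enum { MY_DT_UNKNOWN, MY_DT_DIR, MY_DT_REG };

/* stand-in for the parts of struct eio_dirent used here */
struct my_dirent
{
  int nameofs;
  unsigned char type;
};

/* number of entries at the start of ents[] that are known directories */
static size_t
leading_dirs (const struct my_dirent *ents, size_t count)
{
  size_t n = 0;

  while (n < count && ents[n].type == MY_DT_DIR)
    ++n;

  return n;
}

/* 1 if any entry has an unknown type, which is what the flag reports */
static int
any_unknown (const struct my_dirent *ents, size_t count)
{
  size_t i;

  for (i = 0; i < count; ++i)
    if (ents[i].type == MY_DT_UNKNOWN)
      return 1;

  return 0;
}

/* expected subdirectory count from st_nlink, or -1 if it cannot be deduced */
static long
subdirs_from_nlink (unsigned long nlink)
{
  return nlink >= 2 ? (long)(nlink - 2) : -1;
}
```

If no entry is unknown, C<leading_dirs> already gives the complete set of
directories in a C<EIO_READDIR_DIRS_FIRST> result. Otherwise you would
C<stat> entries from the front until the count from C<subdirs_from_nlink>
has been reached.
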
562     =head3 OS-SPECIFIC CALL WRAPPERS
563    
564     These wrap OS-specific calls (usually Linux ones), and might or might not
565     be emulated on other operating systems. Calls that are not emulated will
566     return C<-1> and set C<errno> to C<ENOSYS>.
567    
568     =over 4
569    
570     =item eio_sendfile (int out_fd, int in_fd, off_t in_offset, size_t length, int pri, eio_cb cb, void *data)
571    
572     Wraps the C<sendfile> syscall. The arguments follow the Linux version, but
573     libeio supports and will use similar calls on FreeBSD, HP-UX, Solaris and
574     Darwin.
575    
576     If the OS doesn't support some sendfile-like call, or the call fails,
577     indicating missing support for the given file descriptor type (for example,
578     Linux's sendfile might not support file to file copies), then libeio will
579     emulate the call in userspace, so there are almost no limitations on its
580     use.
581    
582     =item eio_readahead (int fd, off_t offset, size_t length, int pri, eio_cb cb, void *data)
583    
584     Calls C<readahead(2)>. If the syscall is missing, then the call is
585     emulated by simply reading the data (currently in 64kiB chunks).
586    
587 root 1.27 =item eio_syncfs (int fd, int pri, eio_cb cb, void *data)
588    
589     Calls Linux' C<syncfs> syscall, if available. If the call is missing, it
590     returns C<-1> and sets C<errno> to C<ENOSYS>, I<but still calls sync()>
591     if the C<fd> is C<< >= 0 >>, so you can probe for the availability of the
592     syscall by passing a negative C<fd> argument and checking for C<-1/ENOSYS>.
593    
594 root 1.7 =item eio_sync_file_range (int fd, off_t offset, size_t nbytes, unsigned int flags, int pri, eio_cb cb, void *data)
595    
596     Calls C<sync_file_range>. If the syscall is missing, then this is the same
597     as calling C<fdatasync>.
598    
599 root 1.10 Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>,
600     C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>.
601    
602 root 1.21 =item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data)
603    
604     Calls C<fallocate> (note: I<NOT> C<posix_fallocate>!). If the syscall is
605     missing, then it returns failure and sets C<errno> to C<ENOSYS>.
606    
607     The C<mode> argument can be C<0> (for behaviour similar to
608     C<posix_fallocate>), or C<EIO_FALLOC_FL_KEEP_SIZE>, which keeps the size
609     of the file unchanged (but still preallocates space beyond end of file).
610    
611 root 1.7 =back
612    
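The emulation strategy mentioned for C<eio_readahead> (reading the range
and discarding the data) could look roughly like the sketch below. This is
not libeio's actual code: C<emulate_readahead> and C<CHUNK> are names made
up for illustration, and only the 64kiB chunk size is taken from the text
above.

```c
#define _XOPEN_SOURCE 700 /* for pread */
#include <sys/types.h>
#include <unistd.h>

#define CHUNK (64 * 1024)

/* read [offset, offset + length) in chunks and throw the data away, */
/* so the kernel pulls it into its cache; returns 0 or -1 (errno set) */
static int
emulate_readahead (int fd, off_t offset, size_t length)
{
  char buf[CHUNK];

  while (length > 0)
    {
      size_t want = length < CHUNK ? length : CHUNK;
      ssize_t got = pread (fd, buf, want, offset);

      if (got < 0)
        return -1; /* pread failed */

      if (got == 0)
        break;     /* end of file reached */

      offset += got;
      length -= (size_t)got;
    }

  return 0;
}
```

Using C<pread> leaves the file offset of C<fd> untouched, so the sketch
does not interfere with other users of the same descriptor.
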
613     =head3 LIBEIO-SPECIFIC REQUESTS
614    
615     These requests are specific to libeio and do not correspond to any OS call.
616    
617     =over 4
618    
619 root 1.9 =item eio_mtouch (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)
620 root 1.7
621 root 1.31 Reads (C<flags == 0>) or modifies (C<flags == EIO_MT_MODIFY>) the given
622 root 1.9 memory area, page-wise, that is, it reads (or reads and writes back) the
623     first octet of every page that spans the memory area.
624    
625     This can be used to page in some mmapped file, or dirty some pages. Note
626     that dirtying is an unlocked read-write access, so races can ensue when
627     some other thread modifies the data stored in that memory area.
628    
629     =item eio_custom (void (*execute)(eio_req *), int pri, eio_cb cb, void *data)
630 root 1.7
631     Executes a custom request, i.e., a user-specified callback.
632    
633     The callback gets the C<eio_req *> as parameter and is expected to read
634     and modify any request-specific members. Specifically, it should set C<<
635     req->result >> to the result value, just like other requests.
636    
637     Here is an example that simply calls C<open>, like C<eio_open>, but it
638     uses the C<data> member as filename and uses a hardcoded C<O_RDONLY>. If
639     you want to pass more/other parameters, you either need to pass some
640     struct or so via C<data> or provide your own wrapper using the low-level
641     API.
642    
   static int
   my_open_done (eio_req *req)
   {
     int fd = req->result;

     return 0;
   }

   static void
   my_open (eio_req *req)
   {
     req->result = open (req->data, O_RDONLY);
   }

   eio_custom (my_open, 0, my_open_done, "/etc/passwd");
658    
659 root 1.9 =item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)
660 root 1.7
661 root 1.18 This is a request that takes C<delay> seconds to execute, but otherwise
662 root 1.7 does nothing - it simply puts one of the worker threads to sleep for this
663     long.
664    
665     This request can be used to artificially increase load, e.g. for debugging
666     or benchmarking reasons.
667    
668 root 1.9 =item eio_nop (int pri, eio_cb cb, void *data)
669 root 1.7
670     This request does nothing, except go through the whole request cycle. This
671     can be used to measure latency or in some cases to simplify code, but is
672     not really of much use.
673    
674     =back
675    
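Passing several parameters through C<data>, as suggested for C<eio_custom>
above, might look like the following sketch. So that it compiles without
libeio, C<my_req> is a stand-in invented here that mirrors only the
C<result> and C<data> members of C<eio_req>; C<struct open_args> and
C<my_open_ex> are likewise hypothetical names.

```c
#include <fcntl.h>
#include <sys/types.h>

/* stand-in for the two eio_req members this sketch touches */
struct my_req
{
  ssize_t result;
  void *data;
};

/* everything the execute callback needs, bundled into one struct */
struct open_args
{
  const char *path;
  int flags;
  mode_t mode;
};

/* execute callback: unpack the arguments and store the result */
static void
my_open_ex (struct my_req *req)
{
  const struct open_args *args = req->data;

  req->result = open (args->path, args->flags, args->mode);
}
```

With the real library the callback would take an C<eio_req *> and be
submitted along the lines of C<eio_custom (my_open_ex, 0, my_open_done,
&args)>. The C<args> struct is memory you pass in yourself, so it has to
stay valid until the request has finished.
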
676     =head3 GROUPING AND LIMITING REQUESTS
677 root 1.1
678 root 1.12 There is one more rather special request, C<eio_grp>. Instead of doing
679     something itself, it acts as a container for other eio
680     requests.
681    
682     There are two primary use cases for this: a) bundling many requests into a
683     single, composite, request with a definite callback and the ability to
684     cancel the whole request with its subrequests and b) limiting the number
685     of "active" requests.
686    
687 root 1.18 Further below you will find more discussion of these topics - first
688     follows the reference section detailing the request generator and other
689     methods.
690 root 1.12
691     =over 4
692    
693 root 1.17 =item eio_req *grp = eio_grp (eio_cb cb, void *data)
694    
695 root 1.23 Creates, submits and returns a group request. Note that it doesn't have a
696     priority, unlike all other requests.
697 root 1.17
698     =item eio_grp_add (eio_req *grp, eio_req *req)
699    
700     Adds a request to the request group.
701    
702     =item eio_grp_cancel (eio_req *grp)
703    
704     Cancels all requests I<in> the group, but I<not> the group request
705 root 1.23 itself. You can cancel the group request I<and> all subrequests via a
706     normal C<eio_cancel> call.
707 root 1.17
708 root 1.23 =back
709    
710     =head4 GROUP REQUEST LIFETIME
711    
712     Left alone, a group request will instantly move to the pending state and
713     will be finished at the next call of C<eio_poll>.
714    
715 sf-exg 1.24 The usefulness stems from the fact that, if a subrequest is added to a
716 root 1.23 group I<before> a call to C<eio_poll>, via C<eio_grp_add>, then the group
717     will not finish until all the subrequests have finished.
718    
719     So the usage cycle of a group request is like this: after it is created,
720     you normally instantly add a subrequest. If none is added, the group
721     request will finish on its own. As long as subrequests are added before
722     the group request is finished, it will be kept from finishing; that is, the
723     callbacks of any subrequests can, in turn, add more requests to the group,
724     and as long as any requests are active, the group request itself will not
725     finish.
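
The bookkeeping described above can be pictured with a small toy model. This
is only a sketch: the C<toy_*> names are hypothetical and are not part of
libeio, and where the real library expects you not to add requests to a
finished group, the model simply refuses.

```c
#include <stdbool.h>

/* Toy model only: these types and functions are hypothetical and are
   not part of libeio. They mimic the bookkeeping described above. */
typedef struct
{
  int  active;    /* subrequests added but not yet finished */
  bool finished;  /* set once the group callback would have run */
} toy_grp;

/* models eio_grp_add: only possible while the group is unfinished */
static bool
toy_grp_add (toy_grp *grp)
{
  if (grp->finished)
    return false;

  ++grp->active;
  return true;
}

/* models a subrequest finishing; its callback may add a follow-up
   request to the group before the subrequest itself counts as done */
static void
toy_sub_finish (toy_grp *grp, bool add_followup)
{
  if (add_followup)
    toy_grp_add (grp);

  --grp->active;
}

/* models the group's turn in eio_poll: it finishes only when empty */
static bool
toy_grp_poll (toy_grp *grp)
{
  if (!grp->finished && grp->active == 0)
    grp->finished = true;

  return grp->finished;
}
```

A group that never receives a subrequest finishes at the first
C<toy_grp_poll>, while a group whose subrequest callbacks keep adding
follow-up requests stays unfinished until the last of them is done.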
726    
727     =head4 CREATING COMPOSITE REQUESTS
728    
729     Imagine you wanted to create an C<eio_load> request that opens a file,
730     reads it and closes it. This means it has to execute at least three eio
731     requests, but for various reasons it might be nice if that request looked
732     like any other eio request.
733    
734     This can be done with groups:
735    
736     =over 4
737    
738     =item 1) create the request object
739    
740     Create a group that contains all further requests. This is the request you
741     can return as "the load request".
742 root 1.12
743 root 1.23 =item 2) open the file, maybe
744    
745     Next, open the file with C<eio_open> and add the request to the group
746 sf-exg 1.24 request and you are finished setting up the request.
747 root 1.23
748     If, for some reason, you cannot C<eio_open> (say, the path is a null pointer) you
749 sf-exg 1.24 can set C<< grp->result >> to C<-1> to signal an error and let the group
750 root 1.23 request finish on its own.
751    
752     =item 3) open callback adds more requests
753    
754     In the open callback, if the open was not successful, copy C<<
755 root 1.30 req->errorno >> to C<< grp->errorno >> and set C<< grp->result >> to
756 root 1.23 C<-1> to signal an error.
757    
758     Otherwise, malloc some memory or so and issue a read request, adding the
759     read request to the group.
760    
761 sf-exg 1.24 =item 4) continue issuing requests till finished
762 root 1.23
763 root 1.30 In the read callback, check for errors and possibly continue with
764 root 1.23 C<eio_close> or any other eio request in the same way.
765    
766 root 1.30 As soon as no new requests are added, the group request will finish. Make
767 root 1.23 sure you I<always> set C<< grp->result >> to some sensible value.
768 root 1.12
769     =back
770    
771 root 1.23 =head4 REQUEST LIMITING
772 root 1.12
773    
774 root 1.1 #TODO
775    
776 root 1.7 void eio_grp_limit (eio_req *grp, int limit);
777 root 1.1
778    
779    
780     =head1 LOW LEVEL REQUEST API
781    
782     #TODO
783    
784 root 1.7
785     =head1 ANATOMY AND LIFETIME OF AN EIO REQUEST
786    
787     A request is represented by a structure of type C<eio_req>. To initialise
788     it, clear it to all zero bytes:
789    
790 root 1.17 eio_req req;
791 root 1.7
792 root 1.17 memset (&req, 0, sizeof (req));
793 root 1.7
794     A more common way to initialise a new C<eio_req> is to use C<calloc>:
795    
796 root 1.17 eio_req *req = calloc (1, sizeof (*req));
797 root 1.7
798     In either case, libeio neither allocates, initialises nor frees the
799     C<eio_req> structure for you - it merely uses it.
800    
801     zero
802    
803     #TODO
804    
805 root 1.8 =head2 CONFIGURATION
806    
807     The functions in this section can sometimes be useful, but the default
808     configuration will do in most cases, so you should skip this section on
809     first reading.
810    
811     =over 4
812    
813     =item eio_set_max_poll_time (eio_tstamp nseconds)
814    
815     This causes C<eio_poll ()> to return after it has detected that it was
816     running for C<nseconds> seconds or longer (this number can be fractional).
817    
818     This can be used to limit the amount of time spent handling eio requests.
819     For example, in interactive programs, you might want to limit this time to
820     C<0.01> seconds or so.
821    
822     Note that:
823    
824 root 1.18 =over 4
825    
826     =item a) libeio doesn't know how long your request callbacks take, so the
827     time spent in C<eio_poll> is up to one callback invocation longer than
828     this interval.
829 root 1.8
830 root 1.18 =item b) this is implemented by calling C<gettimeofday> after each
831     request, which can be costly.
832 root 1.8
833 root 1.18 =item c) at least one request will be handled.
834    
835     =back
836 root 1.8
837     =item eio_set_max_poll_reqs (unsigned int nreqs)
838    
839     When C<nreqs> is non-zero, then C<eio_poll> will not handle more than
840     C<nreqs> requests per invocation. This is a less costly way to limit the
841     amount of work done by C<eio_poll> than setting a time limit.
842    
843     If you know your callbacks are generally fast, you could use this to
844     encourage interactiveness in your programs by setting it to C<10>, C<100>
845     or even C<1000>.
846    
847     =item eio_set_min_parallel (unsigned int nthreads)
848    
849     Make sure libeio can handle at least this many requests in parallel. It
850     might be able to handle more.
851    
852     =item eio_set_max_parallel (unsigned int nthreads)
853    
854     Set the maximum number of threads that libeio will spawn.
855    
856     =item eio_set_max_idle (unsigned int nthreads)
857    
858     Libeio uses threads internally to handle most requests, and will start and stop threads on demand.
859    
860     This call can be used to limit the number of idle threads (threads without
861     work to do): libeio will keep some threads idle in preparation for more
862     requests, but never more than C<nthreads> threads.
863    
864     In addition to this, libeio will also stop threads when they are idle for
865     a few seconds, regardless of this setting.
866    
867     =item unsigned int eio_nthreads ()
868    
869     Return the number of worker threads currently running.
870    
871     =item unsigned int eio_nreqs ()
872    
873     Return the number of requests currently handled by libeio. This is the
874     total number of requests that have been submitted to libeio, but not yet
875     destroyed.
876    
877     =item unsigned int eio_nready ()
878    
879     Returns the number of ready requests, i.e. requests that have been
880     submitted but have not yet entered the execution phase.
881    
882     =item unsigned int eio_npending ()
883    
884     Returns the number of pending requests, i.e. requests that have been
885     executed and have results, but have not been finished yet by a call to
886     C<eio_poll>.
887    
888     =back
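
The interplay of the two C<eio_poll> limits described above
(C<eio_set_max_poll_time> and C<eio_set_max_poll_reqs>) can be sketched with
a toy loop. This is a model under stated assumptions, not libeio's
implementation: C<toy_poll> and its parameters are hypothetical, callback
durations are passed in instead of being measured, and a time limit of C<0>
is assumed to mean "no limit".

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not libeio's implementation.
   cost[i] is the time the callback of pending request i takes. */
static size_t
toy_poll (const double *cost, size_t npending,
          unsigned int max_reqs, double max_time, double *elapsed)
{
  size_t handled = 0;

  *elapsed = 0.;

  while (handled < npending)
    {
      *elapsed += cost[handled++];         /* run one callback */

      if (max_reqs && handled >= max_reqs) /* 0 means "no limit" */
        break;

      /* assumption: a time limit of 0 disables the check */
      if (max_time > 0. && *elapsed >= max_time)
        break;
    }

  return handled;
}
```

Because both limits are only checked after a callback has run, at least one
request is always handled, and the elapsed time can exceed the time limit by
up to one callback invocation.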
889    
890 root 1.1 =head1 EMBEDDING
891    
892     Libeio can be embedded directly into programs. This functionality is not
893     documented and not (yet) officially supported.
894    
895 root 1.3 Note that, when including C<libeio.m4>, you are responsible for defining
896     the compilation environment (C<_LARGEFILE_SOURCE>, C<_GNU_SOURCE> etc.).
897    
898 root 1.2 If you need to know how, check the C<IO::AIO> perl module, which does
899 root 1.1 exactly that.
900    
901    
902 root 1.4 =head1 COMPILETIME CONFIGURATION
903    
904     These symbols, if used, must be defined when compiling F<eio.c>.
905    
906     =over 4
907    
908     =item EIO_STACKSIZE
909    
910     This symbol governs the stack size for each eio thread. Libeio itself
911     was written to use very little stack space, but when using C<EIO_CUSTOM>
912     requests, you might want to increase this.
913    
914     If this symbol is undefined (the default) then libeio will use its default
915 root 1.32 stack size (C<sizeof (void *) * 4096> currently). In all other cases, the
916     value must be an expression that evaluates to the desired stack size.
917 root 1.4
918     =back
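
As a rough illustration of the fallback rule above, the following sketch
mirrors the documented behaviour with hypothetical C<TOY_STACKSIZE> and
C<toy_stack_size> names; it is not the code from F<eio.c>. It assumes the
symbol would be supplied on the compiler command line, for example
C<-DEIO_STACKSIZE='(256 * 1024)'>.

```c
#include <stddef.h>

/* Illustration only: mirrors the documented fallback, it is not the
   code from eio.c. TOY_STACKSIZE and toy_stack_size are hypothetical. */
#ifndef EIO_STACKSIZE
# define TOY_STACKSIZE (sizeof (void *) * 4096) /* documented default */
#else
# define TOY_STACKSIZE (EIO_STACKSIZE)          /* user-supplied expression */
#endif

/* returns the per-thread stack size, in bytes, that would be requested */
static size_t
toy_stack_size (void)
{
  return TOY_STACKSIZE;
}
```

On a system with 8-byte pointers the documented default works out to 32768
bytes.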
919    
920    
921 root 1.1 =head1 PORTABILITY REQUIREMENTS
922    
923     In addition to a working ISO-C implementation, libeio relies on a few
924     additional extensions:
925    
926     =over 4
927    
928     =item POSIX threads
929    
930     To be portable, this module uses threads; specifically, the POSIX threads
931     library must be available (and working, which partially excludes many xBSD
932     systems, where C<fork ()> is buggy).
933    
934     =item POSIX-compatible filesystem API
935    
936     This is actually a harder portability requirement: The libeio API is quite
937     demanding regarding POSIX API calls (symlinks, user/group management
938     etc.).
939    
940     =item C<double> must hold a time value in seconds with enough accuracy
941    
942     The type C<double> is used to represent timestamps. It is required to
943     have at least 51 bits of mantissa (and 9 bits of exponent), which is good
944     enough until at least the year 4000. This requirement is fulfilled by
945     implementations implementing IEEE 754 (basically all existing ones).
946    
947     =back
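
The C<double> requirement above can be checked mechanically. The following
is a minimal sketch, assuming a binary floating point format
(C<FLT_RADIX == 2>) and timestamps counted in seconds since the Unix epoch;
C<timestamp_resolves> and the typedef name are hypothetical helpers, not
part of libeio.

```c
#include <float.h>

/* compile-time check: the array size becomes negative, and compilation
   fails, if double has fewer than 51 mantissa bits (radix 2 assumed) */
typedef char toy_double_is_wide_enough
  [(FLT_RADIX == 2 && DBL_MANT_DIG >= 51) ? 1 : -1];

/* returns 1 if adding step seconds to timestamp t gives a different
   double, i.e. if the format can still resolve that step at time t */
static int
timestamp_resolves (double t, double step)
{
  volatile double u = t + step; /* volatile: force rounding to double */

  return u != t;
}
```

For instance, C<timestamp_resolves (64060588800., 1e-4)> asks whether a
timestamp near the start of the year 4000 (about 6.4e10 seconds) can still
distinguish steps of a tenth of a millisecond. A value of that size needs 36
bits for its integer part, so a 51 bit mantissa leaves about 15 bits for the
fraction.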
948    
949     If you know of any additional requirements, drop me a note.
950    
951    
952     =head1 AUTHOR
953    
954     Marc Lehmann <libeio@schmorp.de>.
955