/cvs/libeio/eio.pod
Revision: 1.23
Committed: Wed Jul 13 21:31:40 2011 UTC (12 years, 10 months ago) by root
Branch: MAIN
Changes since 1.22: +64 -8 lines

File Contents

# User Rev Content
1 root 1.1 =head1 NAME
2    
3     libeio - truly asynchronous POSIX I/O
4    
5     =head1 SYNOPSIS
6    
7     #include <eio.h>
8    
9     =head1 DESCRIPTION
10    
11     The newest version of this document is also available as an html-formatted
12     web page you might find easier to navigate when reading it for the first
13     time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>.
14    
15     Note that this library is a by-product of the C<IO::AIO> perl
16 sf-exg 1.6 module, and many of the subtler points regarding request lifetime
17 root 1.1 and so on are only documented in its documentation at the
18     moment: L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>.
19    
20     =head2 FEATURES
21    
22     This library provides fully asynchronous versions of most POSIX functions
23 sf-exg 1.6 dealing with I/O. Unlike most asynchronous libraries, this not only
24 root 1.1 includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and
25     similar functions, as well as less commonly used ones such as C<mknod>, C<futime>
26     or C<readlink>.
27    
28     It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and
29     FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with
30     emulation elsewhere).
31    
32 root 1.5 The goal is to enable you to write fully non-blocking programs. For
33 root 1.1 example, in a game server, you would not want to freeze for a few seconds
34     just because the server is running a backup and you happen to call
35     C<readdir>.
36    
37     =head2 TIME REPRESENTATION
38    
39     Libeio represents time as a single floating point number, representing the
40     (fractional) number of seconds since the (POSIX) epoch (somewhere near
41     the beginning of 1970, details are complicated, don't ask). This type is
42 sf-exg 1.6 called C<eio_tstamp>, but it is guaranteed to be of type C<double> (or
43 root 1.1 better), so you can freely use C<double> yourself.
44    
45     Unlike the name component C<stamp> might indicate, it is also used for
46     time differences throughout libeio.
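
As a minimal sketch of this representation (not part of libeio: C<tstamp>
is a hypothetical stand-in for C<eio_tstamp>, and both helper names are
made up for this illustration), the following converts a POSIX C<struct
timespec> into fractional seconds and uses the same type for a time
difference:

```c
#include <assert.h>
#include <time.h>

/* hypothetical stand-in for eio_tstamp, which is documented to be a double */
typedef double tstamp;

/* convert a struct timespec into (fractional) seconds */
tstamp
tstamp_from_timespec (const struct timespec *ts)
{
  assert (ts != 0);

  return (tstamp)ts->tv_sec + (tstamp)ts->tv_nsec / 1e9;
}

/* the same type is also used for time differences, in seconds */
tstamp
tstamp_diff (tstamp later, tstamp earlier)
{
  return later - earlier;
}
```

A timestamp obtained this way could, for example, be passed wherever a
request expects an C<eio_tstamp>, such as the C<atime> and C<mtime>
arguments of C<eio_utime>.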
47    
48     =head2 FORK SUPPORT
49    
50 root 1.17 Calling C<fork ()> is fully supported by this module - but you must not
51     rely on this. It is currently implemented in these steps:
52 root 1.1
53 root 1.17 1. wait till all requests in "execute" state have been handled
54     (basically requests that are already handed over to the kernel).
55     2. fork
56     3. in the parent, continue business as usual, done
57     4. in the child, destroy all ready and pending requests and free the
58     memory used by the worker threads. This gives you a fully empty
59     libeio queue.
60 root 1.1
61 root 1.17 Note, however, that since libeio uses threads, the above guarantee doesn't
62 root 1.7 cover your libc: for example, malloc and other libc functions are not
63 root 1.17 fork-safe, so there is very little you can do after a fork. In fact,
64 root 1.7 the above might crash, and is thus subject to change.
65    
66 root 1.1 =head1 INITIALISATION/INTEGRATION
67    
68     Before you can call any eio functions you first have to initialise the
69     library. The library integrates into any event loop, but can also be used
70     without one, including in polling mode.
71    
72     You have to provide the necessary glue yourself, however.
73    
74     =over 4
75    
76     =item int eio_init (void (*want_poll)(void), void (*done_poll)(void))
77    
78     This function initialises the library. On success it returns C<0>, on
79     failure it returns C<-1> and sets C<errno> appropriately.
80    
81     It accepts two function pointers specifying callbacks as argument, both of
82     which can be C<0>, in which case the callback isn't called.
83    
84     =item want_poll callback
85    
86     The C<want_poll> callback is invoked whenever libeio wants attention (i.e.
87     it wants to be polled by calling C<eio_poll>). It is "edge-triggered",
88     that is, it will only be called once when eio wants attention, until all
89     pending requests have been handled.
90    
91     This callback is called while locks are being held, so I<you must
92     not call any libeio functions inside this callback>. That includes
93     C<eio_poll>. What you should do is notify some other thread, or wake up
94     your event loop, and then call C<eio_poll>.
95    
96     =item done_poll callback
97    
98     This callback is invoked when libeio detects that all pending requests
99     have been handled. It is "edge-triggered", that is, it will only be
100     called once after C<want_poll>. To put it differently, C<want_poll> and
101     C<done_poll> are invoked in pairs: after C<want_poll> you have to call
102     C<eio_poll ()> until either C<eio_poll> indicates that everything has been
103     handled or C<done_poll> has been called, which signals the same.
104    
105     Note that C<eio_poll> might return after C<done_poll> and C<want_poll>
106     have been called again, so watch out for races in your code.
107    
108 sf-exg 1.6 As with C<want_poll>, this callback is called while locks are being held,
109 root 1.1 so you I<must not call any libeio functions from within this callback>.
110    
111     =item int eio_poll ()
112    
113     This function has to be called whenever there are pending requests that
114     need finishing. You usually call this after C<want_poll> has indicated
115     that you should do so, but you can also call this function regularly to
116     poll for new results.
117    
118     If any request invocation returns a non-zero value, then C<eio_poll ()>
119     immediately returns with that value as return value.
120    
121     Otherwise, if all requests could be handled, it returns C<0>. If for some
122     reason not all requests have been handled, i.e. some are still pending, it
123     returns C<-1>.
124    
125     =back
126    
127     For libev, you would typically use an C<ev_async> watcher: the
128     C<want_poll> callback would invoke C<ev_async_send> to wake up the event
129     loop. Inside the callback set for the watcher, one would call C<eio_poll
130 root 1.15 ()>.
131    
132     If C<eio_poll ()> is configured to not handle all results in one go
133     (i.e. it returns C<-1>) then you should start an idle watcher that calls
134     C<eio_poll> until it returns something C<!= -1>.
135    
136 sf-exg 1.20 A full-featured connector between libeio and libev would look as follows
137 root 1.16 (if C<eio_poll> is handling all requests, it can of course be simplified a
138     lot by removing the idle watcher logic):
139 root 1.15
140 root 1.17 static struct ev_loop *loop;
141     static ev_idle repeat_watcher;
142     static ev_async ready_watcher;
143 root 1.15
144 root 1.17 /* idle watcher callback, only used when eio_poll */
145     /* didn't handle all results in one call */
146     static void
147     repeat (EV_P_ ev_idle *w, int revents)
148     {
149     if (eio_poll () != -1)
150     ev_idle_stop (EV_A_ w);
151     }
152    
153     /* eio has some results, process them */
154     static void
155     ready (EV_P_ ev_async *w, int revents)
156     {
157     if (eio_poll () == -1)
158     ev_idle_start (EV_A_ &repeat_watcher);
159     }
160    
161     /* wake up the event loop */
162     static void
163     want_poll (void)
164     {
165     ev_async_send (loop, &ready_watcher);
166     }
167    
168     void
169     my_init_eio ()
170     {
171     loop = EV_DEFAULT;
172    
173     ev_idle_init (&repeat_watcher, repeat);
174     ev_async_init (&ready_watcher, ready);
175     ev_async_start (loop, &ready_watcher);
176    
177     eio_init (want_poll, 0);
178     }
179 root 1.1
180     For most other event loops, you would typically use a pipe - the event
181 sf-exg 1.6 loop should be told to wait for read readiness on the read end. In
182 root 1.1 C<want_poll> you would write a single byte, in C<done_poll> you would try
183     to read that byte, and in the callback for the read end, you would call
184 root 1.16 C<eio_poll>.
185    
186     You don't have to take special care in the case C<eio_poll> doesn't handle
187     all requests, as the done callback will not be invoked, so the event loop
188 root 1.18 will still signal readiness for the pipe until I<all> results have been
189 root 1.16 processed.
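
The pipe approach can be sketched as follows. This is only a minimal
sketch under assumptions: C<wake_pipe_init> and C<wake_pipe_readable> are
names made up for this illustration, the event loop is replaced by a plain
C<poll(2)> readiness check, and no libeio function is called, so the
C<eio_init> call and the read-end callback that invokes C<eio_poll> are
left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int wake_fd[2] = { -1, -1 }; /* [0] is the read end, [1] the write end */

/* hypothetical helper: create the pipe and make both ends non-blocking, */
/* so that neither callback can ever block */
int
wake_pipe_init (void)
{
  int i;

  if (pipe (wake_fd) < 0)
    return -1;

  for (i = 0; i < 2; ++i)
    {
      int flags = fcntl (wake_fd[i], F_GETFL);

      if (flags < 0 || fcntl (wake_fd[i], F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    }

  return 0;
}

/* would be passed as want_poll: write a single byte to wake the event loop */
void
want_poll (void)
{
  char dummy = 0;
  ssize_t res;

  assert (wake_fd[1] >= 0);
  res = write (wake_fd[1], &dummy, 1);
  (void)res; /* nothing useful can be done on failure here */
}

/* would be passed as done_poll: try to read that byte again */
void
done_poll (void)
{
  char dummy;
  ssize_t res;

  assert (wake_fd[0] >= 0);
  res = read (wake_fd[0], &dummy, 1);
  (void)res;
}

/* hypothetical helper: what an event loop would report for the read end */
int
wake_pipe_readable (void)
{
  struct pollfd pfd;

  pfd.fd      = wake_fd[0];
  pfd.events  = POLLIN;
  pfd.revents = 0;

  return poll (&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) != 0;
}
```

Neither callback calls into libeio, which matters because both are invoked
while libeio holds locks.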
190 root 1.1
191    
192 root 1.7 =head1 HIGH LEVEL REQUEST API
193    
194     Libeio has both a high-level API, which consists of calling a request
195     function with a callback to be called on completion, and a low-level API
196     where you fill out request structures and submit them.
197    
198     This section describes the high-level API.
199    
200     =head2 REQUEST SUBMISSION AND RESULT PROCESSING
201    
202     You submit a request by calling the relevant C<eio_TYPE> function with the
203     required parameters, a callback of type C<int (*eio_cb)(eio_req *req)>
204     (called C<eio_cb> below) and a freely usable C<void *data> argument.
205    
206 root 1.12 The return value will either be 0, in case something went really wrong
207     (which can basically only happen on very fatal errors, such as C<malloc>
208     returning 0, which is rather unlikely), or a pointer to the newly-created
209     and submitted C<eio_req *>.
210 root 1.7
211     The callback will be called with an C<eio_req *> which contains the
212     results of the request. The members you can access inside that structure
213     vary from request to request, except for:
214    
215     =over 4
216    
217     =item C<ssize_t result>
218    
219     This contains the result value from the call (usually the same as the
220     syscall of the same name).
221    
222     =item C<int errorno>
223    
224     This contains the value of C<errno> after the call.
225    
226     =item C<void *data>
227    
228     The C<void *data> member simply stores the value of the C<data> argument.
229    
230     =back
231    
232     The return value of the callback is normally C<0>, which tells libeio to
233     continue normally. If a callback returns a nonzero value, libeio will
234     stop processing results (in C<eio_poll>) and will return the value to its
235     caller.
236    
237     Memory areas passed to libeio must stay valid as long as a request
238     executes, with the exception of paths, which are being copied
239     internally. Any memory libeio itself allocates will be freed after the
240     finish callback has been called. If you want to manage all memory passed
241     to libeio yourself you can use the low-level API.
242    
243     For example, to open a file, you could do this:
244    
245     static int
246     file_open_done (eio_req *req)
247     {
248     if (req->result < 0)
249     {
250     /* open() returned -1 */
251     errno = req->errorno;
252     perror ("open");
253     }
254     else
255     {
256     int fd = req->result;
257     /* now we have the new fd in fd */
258     }
259    
260     return 0;
261     }
262    
263     /* the first three arguments are passed to open(2) */
264     /* the remaining are priority, callback and data */
265     if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
266 root 1.18 abort (); /* something went wrong, we will all die!!! */
267 root 1.7
268     Note that you additionally need to call C<eio_poll> when the C<want_poll>
269     callback indicates that requests are ready to be processed.
270    
271 root 1.17 =head2 CANCELLING REQUESTS
272    
273     Sometimes the need for a request goes away before the request is
274 root 1.18 finished. In that case, one can cancel the request by a call to
275 root 1.17 C<eio_cancel>:
276    
277     =over 4
278    
279     =item eio_cancel (eio_req *req)
280    
281 root 1.19 Cancel the request (and all its subrequests). If the request is currently
282 root 1.18 executing it might still continue to execute, and in other cases it might
283     still take a while till the request is cancelled.
284 root 1.17
285     Even if cancelled, the finish callback will still be invoked - the
286     callbacks of all cancellable requests need to check whether the request
287     has been cancelled by calling C<EIO_CANCELLED (req)>:
288    
289     static int
290     my_eio_cb (eio_req *req)
291     {
292     if (EIO_CANCELLED (req))
293     return 0;
294     }
295    
296 root 1.18 In addition, cancelled requests will I<either> have C<< req->result >>
297     set to C<-1> and C<< req->errorno >> to C<ECANCELED>, or I<otherwise> they were
298     successfully executed, despite being cancelled (e.g. when they have
299     already been executed at the time they were cancelled).
300    
301     C<EIO_CANCELLED> is still true for requests that have successfully
302     executed, as long as C<eio_cancel> was called on them at some point.
303 root 1.17
304     =back
305    
306 root 1.7 =head2 AVAILABLE REQUESTS
307    
308     The following request functions are available. I<All> of them return the
309     C<eio_req *> on success and C<0> on failure, and I<all> of them have the
310     same three trailing arguments: C<pri>, C<cb> and C<data>. The C<cb> is
311     mandatory, but in most cases, you pass in C<0> as C<pri> and C<0> or some
312     custom data value as C<data>.
313    
314     =head3 POSIX API WRAPPERS
315    
316     These requests simply wrap the POSIX call of the same name, with the same
317 root 1.11 arguments. If a function is not implemented by the OS and cannot be emulated
318 root 1.10 in some way, then all of these return C<-1> and set C<errorno> to C<ENOSYS>.
319 root 1.7
320     =over 4
321    
322     =item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data)
323    
324     =item eio_truncate (const char *path, off_t offset, int pri, eio_cb cb, void *data)
325    
326     =item eio_chown (const char *path, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)
327    
328     =item eio_chmod (const char *path, mode_t mode, int pri, eio_cb cb, void *data)
329    
330     =item eio_mkdir (const char *path, mode_t mode, int pri, eio_cb cb, void *data)
331    
332     =item eio_rmdir (const char *path, int pri, eio_cb cb, void *data)
333    
334     =item eio_unlink (const char *path, int pri, eio_cb cb, void *data)
335    
336 root 1.10 =item eio_utime (const char *path, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)
337 root 1.7
338     =item eio_mknod (const char *path, mode_t mode, dev_t dev, int pri, eio_cb cb, void *data)
339    
340     =item eio_link (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
341    
342     =item eio_symlink (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
343    
344     =item eio_rename (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
345    
346     =item eio_mlock (void *addr, size_t length, int pri, eio_cb cb, void *data)
347    
348     =item eio_close (int fd, int pri, eio_cb cb, void *data)
349    
350     =item eio_sync (int pri, eio_cb cb, void *data)
351    
352     =item eio_fsync (int fd, int pri, eio_cb cb, void *data)
353    
354     =item eio_fdatasync (int fd, int pri, eio_cb cb, void *data)
355    
356     =item eio_futime (int fd, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)
357    
358     =item eio_ftruncate (int fd, off_t offset, int pri, eio_cb cb, void *data)
359    
360     =item eio_fchmod (int fd, mode_t mode, int pri, eio_cb cb, void *data)
361    
362     =item eio_fchown (int fd, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)
363    
364     =item eio_dup2 (int fd, int fd2, int pri, eio_cb cb, void *data)
365    
366     These have the same semantics as the syscall of the same name, their
367     return value is available as C<< req->result >> later.
368    
369     =item eio_read (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)
370    
371     =item eio_write (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)
372    
373     These two requests are called C<read> and C<write>, but actually wrap
374     C<pread> and C<pwrite>. On systems that lack these calls (such as cygwin),
375     libeio uses lseek/read_or_write/lseek and a mutex to serialise the
376     requests, so all these requests run serially and do not disturb each
377     other. However, they still disturb the file offset while they run, so it's
378     not safe to call these functions concurrently with non-libeio functions on
379     the same fd on these systems.
380    
381     Not surprisingly, pread and pwrite are not thread-safe on Darwin (OS/X),
382     so it is advised not to submit multiple requests on the same fd on this
383     horrible pile of garbage.
384    
385 root 1.10 =item eio_mlockall (int flags, int pri, eio_cb cb, void *data)
386    
387     Like C<mlockall>, but the flag value constants are called
388     C<EIO_MCL_CURRENT> and C<EIO_MCL_FUTURE>.
389    
390     =item eio_msync (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)
391    
392     Just like msync, except that the flag values are called C<EIO_MS_ASYNC>,
393     C<EIO_MS_INVALIDATE> and C<EIO_MS_SYNC>.
394    
395     =item eio_readlink (const char *path, int pri, eio_cb cb, void *data)
396    
397     If successful, the path read by C<readlink(2)> can be accessed via C<<
398     req->ptr2 >> and is I<NOT> null-terminated, with the length specified as
399     C<< req->result >>.
400    
401     if (req->result >= 0)
402     {
403     char *target = strndup ((char *)req->ptr2, req->result);
404    
405     free (target);
406     }
407    
408 root 1.13 =item eio_realpath (const char *path, int pri, eio_cb cb, void *data)
409    
410 root 1.22 Similar to the realpath libc function, but unlike that one, C<<
411     req->result >> is C<-1> on failure. On success, the result is the length
412     of the returned path in C<ptr2> (which is I<NOT> 0-terminated) - this is
413     similar to readlink.
414 root 1.13
415 root 1.10 =item eio_stat (const char *path, int pri, eio_cb cb, void *data)
416    
417     =item eio_lstat (const char *path, int pri, eio_cb cb, void *data)
418    
419 root 1.7 =item eio_fstat (int fd, int pri, eio_cb cb, void *data)
420    
421     Stats a file - if C<< req->result >> indicates success, then you can
422     access the C<struct stat>-like structure via C<< req->ptr2 >>:
423    
424 root 1.17 EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;
425 root 1.7
426 root 1.10 =item eio_statvfs (const char *path, int pri, eio_cb cb, void *data)
427    
428     =item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data)
429 root 1.7
430     Stats a filesystem - if C<< req->result >> indicates success, then you can
431     access the C<struct statvfs>-like structure via C<< req->ptr2 >>:
432    
433 root 1.17 EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;
434 root 1.7
435     =back
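
Since the paths returned by C<eio_readlink> and C<eio_realpath> are not
0-terminated, they usually have to be copied before they can be used as
C strings. The following is a minimal sketch of such a copy. The name
C<path_dup> is made up for this illustration, and the helper takes a plain
pointer and length instead of an C<eio_req *> so that it does not depend
on F<eio.h>:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* hypothetical helper: copy a length-delimited path into a freshly */
/* allocated, 0-terminated string. returns 0 if len signals failure */
/* or if no memory is available. the caller has to free the result. */
char *
path_dup (const void *ptr, ssize_t len)
{
  char *copy;

  if (len < 0)
    return 0; /* the request failed, there is no path */

  assert (ptr != 0);

  copy = malloc ((size_t)len + 1);
  if (!copy)
    return 0;

  memcpy (copy, ptr, (size_t)len);
  copy[len] = 0;

  return copy;
}
```

Inside a request callback, this would presumably be called as C<path_dup
(req-E<gt>ptr2, req-E<gt>result)>.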
436    
437     =head3 READING DIRECTORIES
438    
439     Reading directories sounds simple, but can be rather demanding, especially
440 root 1.18 if you want to do stuff such as traversing a directory hierarchy or
441     processing all files in a directory. Libeio can assist with these complex tasks
442 root 1.7 with its C<eio_readdir> call.
443    
444     =over 4
445    
446     =item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data)
447    
448     This is a very complex call. It basically reads through a whole directory
449     (via the C<opendir>, C<readdir> and C<closedir> calls) and returns either
450     the names or an array of C<struct eio_dirent>, depending on the C<flags>
451     argument.
452    
453     The C<< req->result >> indicates either the number of files found, or
454 root 1.10 C<-1> on error. On success, null-terminated names can be found as C<< req->ptr2 >>,
455 root 1.7 and C<struct eio_dirent>s, if requested by C<flags>, can be found via C<<
456     req->ptr1 >>.
457    
458     Here is an example that prints all the names:
459    
460     int i;
461     char *names = (char *)req->ptr2;
462    
463     for (i = 0; i < req->result; ++i)
464     {
465     printf ("name #%d: %s\n", i, names);
466    
467     /* move to next name */
468     names += strlen (names) + 1;
469     }
470    
471     Pseudo-entries such as F<.> and F<..> are never returned by C<eio_readdir>.
472    
473     C<flags> can be any combination of:
474    
475     =over 4
476    
477     =item EIO_READDIR_DENTS
478    
479     If this flag is specified, then, in addition to the names in C<ptr2>,
480     also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct
481     eio_dirent> looks like this:
482    
483 root 1.17 struct eio_dirent
484     {
485     int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */
486     unsigned short namelen; /* size of filename without trailing 0 */
487     unsigned char type; /* one of EIO_DT_* */
488     signed char score; /* internal use */
489     ino_t inode; /* the inode number, if available, otherwise unspecified */
490     };
491 root 1.7
492     The only members you normally would access are C<nameofs>, which is the
493     byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>.
494    
495     C<type> can be one of:
496    
497     C<EIO_DT_UNKNOWN> - if the type is not known (very common) and you have to C<stat>
498     the name yourself if you need to know,
499     one of the "standard" POSIX file types (C<EIO_DT_REG>, C<EIO_DT_DIR>, C<EIO_DT_LNK>,
500     C<EIO_DT_FIFO>, C<EIO_DT_SOCK>, C<EIO_DT_CHR>, C<EIO_DT_BLK>)
501     or some OS-specific type (currently
502     C<EIO_DT_MPC> - multiplexed char device (v7+coherent),
503     C<EIO_DT_NAM> - xenix special named file,
504     C<EIO_DT_MPB> - multiplexed block device (v7+coherent),
505     C<EIO_DT_NWK> - HP-UX network special,
506     C<EIO_DT_CMP> - VxFS compressed,
507     C<EIO_DT_DOOR> - solaris door, or
508     C<EIO_DT_WHT>).
509    
510     This example prints all names and their type:
511    
512     int i;
513     struct eio_dirent *ents = (struct eio_dirent *)req->ptr1;
514     char *names = (char *)req->ptr2;
515    
516     for (i = 0; i < req->result; ++i)
517     {
518     struct eio_dirent *ent = ents + i;
519     char *name = names + ent->nameofs;
520    
521     printf ("name #%d: %s (type %d)\n", i, name, ent->type);
522     }
523    
524     =item EIO_READDIR_DIRS_FIRST
525    
526     When this flag is specified, then the names will be returned in an order
527     where likely directories come first, in optimal C<stat> order. This is
528     useful when you need to quickly find directories, or you want to find all
529     directories while avoiding having to stat() each entry.
530    
531     If the system returns type information in readdir, then this is used
532     to find directories directly. Otherwise, likely directories are names
533     beginning with ".", or otherwise names with no dots, of which shorter
534     names are tried first.
535    
536     =item EIO_READDIR_STAT_ORDER
537    
538     When this flag is specified, then the names will be returned in an order
539     suitable for stat()'ing each one. That is, when you plan to stat()
540     all files in the given directory, then the returned order will likely
541     be fastest.
542    
543 root 1.18 If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then the
544     likely directories come first, resulting in a less optimal stat order.
545 root 1.7
546     =item EIO_READDIR_FOUND_UNKNOWN
547    
548     This flag should not be specified when calling C<eio_readdir>. Instead,
549     it is being set by C<eio_readdir> (you can access the C<flags> via C<<
550     req->int1 >>) when any of the C<type>s found were C<EIO_DT_UNKNOWN>. The
551 root 1.18 absence of this flag therefore indicates that all C<type>s are known,
552 root 1.7 which can be used to speed up some algorithms.
553    
554     A typical use case would be to identify all subdirectories within a
555     directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If
556     this flag is I<NOT> set, then all the entries at the beginning of the
557     returned array of type C<EIO_DT_DIR> are the directories. Otherwise, you
558     should start C<stat()>'ing the entries starting at the beginning of the
559     array, stopping as soon as you found all directories (the count can be
560     deduced by the link count of the directory).
561    
562     =back
563    
564     =back
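
The use case described for C<EIO_READDIR_FOUND_UNKNOWN> can be sketched
as follows, for the case where the flag is I<not> set and
C<EIO_READDIR_DIRS_FIRST> was requested. To stay self-contained, the
sketch does not include F<eio.h>: C<struct toy_dirent> only mirrors the
first members of C<struct eio_dirent>, and the C<TOY_DT_*> constants are
illustrative stand-ins whose values are not the real C<EIO_DT_*> values.

```c
#include <assert.h>
#include <stddef.h>

/* stand-ins for EIO_DT_*, the values are for illustration only */
enum { TOY_DT_UNKNOWN = 0, TOY_DT_DIR = 1, TOY_DT_REG = 2 };

/* stand-in mirroring the first members of struct eio_dirent */
struct toy_dirent
{
  int nameofs;            /* offset of the name in the names buffer */
  unsigned short namelen; /* length of the name without trailing 0 */
  unsigned char type;     /* one of TOY_DT_* */
};

/* count the directories at the beginning of the array. this is only */
/* meaningful when directories were sorted first and all types are known */
int
count_leading_dirs (const struct toy_dirent *ents, int count)
{
  int n = 0;

  assert (ents != 0 || count == 0);

  while (n < count && ents[n].type == TOY_DT_DIR)
    ++n;

  return n;
}
```

With real requests, C<ents> would come from C<< req->ptr1 >> and C<count>
from C<< req->result >>. When the flag I<is> set, the entries would have
to be checked with C<stat> instead, as described above.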
565    
566     =head3 OS-SPECIFIC CALL WRAPPERS
567    
568     These wrap OS-specific calls (usually Linux ones), and might or might not
569     be emulated on other operating systems. Calls that are not emulated will
570     return C<-1> and set C<errno> to C<ENOSYS>.
571    
572     =over 4
573    
574     =item eio_sendfile (int out_fd, int in_fd, off_t in_offset, size_t length, int pri, eio_cb cb, void *data)
575    
576     Wraps the C<sendfile> syscall. The arguments follow the Linux version, but
577     libeio supports and will use similar calls on FreeBSD, HP-UX, Solaris and
578     Darwin.
579    
580     If the OS doesn't support some sendfile-like call, or the call fails,
581     indicating a lack of support for the given file descriptor type (for example,
582     Linux's sendfile might not support file to file copies), then libeio will
583     emulate the call in userspace, so there are almost no limitations on its
584     use.
585    
586     =item eio_readahead (int fd, off_t offset, size_t length, int pri, eio_cb cb, void *data)
587    
588     Calls C<readahead(2)>. If the syscall is missing, then the call is
589     emulated by simply reading the data (currently in 64KiB chunks).
590    
591     =item eio_sync_file_range (int fd, off_t offset, size_t nbytes, unsigned int flags, int pri, eio_cb cb, void *data)
592    
593     Calls C<sync_file_range>. If the syscall is missing, then this is the same
594     as calling C<fdatasync>.
595    
596 root 1.10 Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>,
597     C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>.
598    
599 root 1.21 =item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data)
600    
601     Calls C<fallocate> (note: I<NOT> C<posix_fallocate>!). If the syscall is
602     missing, then it returns failure and sets C<errno> to C<ENOSYS>.
603    
604     The C<mode> argument can be C<0> (for behaviour similar to
605     C<posix_fallocate>), or C<EIO_FALLOC_FL_KEEP_SIZE>, which keeps the size
606     of the file unchanged (but still preallocates space beyond end of file).
607    
608 root 1.7 =back
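
The C<eio_readahead> emulation mentioned above, reading the data in 64KiB
chunks and discarding it, might look roughly like the following. This is a
sketch of the general technique, not libeio's actual code: the name
C<emulate_readahead>, the use of C<pread> and the return value (the number
of octets read) are assumptions made for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>     /* open (), only needed by callers to obtain an fd */
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE (64 * 1024)

/* hypothetical helper: read (and throw away) up to length octets */
/* starting at offset, so they end up in the OS cache. returns the */
/* number of octets read, which is smaller than length at end of */
/* file, or -1 on error. */
ssize_t
emulate_readahead (int fd, off_t offset, size_t length)
{
  char scratch[CHUNK_SIZE]; /* the contents are never looked at */
  ssize_t total = 0;

  assert (fd >= 0);

  while (length > 0)
    {
      size_t want = length < CHUNK_SIZE ? length : CHUNK_SIZE;
      ssize_t got = pread (fd, scratch, want, offset);

      if (got < 0)
        {
          if (errno == EINTR)
            continue; /* interrupted, simply retry */

          return -1;
        }

      if (got == 0)
        break; /* end of file */

      offset += got;
      length -= (size_t)got;
      total  += got;
    }

  return total;
}
```

Because it uses C<pread>, the sketch does not change the file offset of
C<fd>, but it fails on descriptors that cannot seek, such as pipes.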
609    
610     =head3 LIBEIO-SPECIFIC REQUESTS
611    
612     These requests are specific to libeio and do not correspond to any OS call.
613    
614     =over 4
615    
616 root 1.9 =item eio_mtouch (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)
617 root 1.7
618 root 1.9 Reads (C<flags == 0>) or modifies (C<flags == EIO_MT_MODIFY>) the given
619     memory area, page-wise, that is, it reads (or reads and writes back) the
620     first octet of every page that spans the memory area.
621    
622     This can be used to page in some mmapped file, or dirty some pages. Note
623     that dirtying is an unlocked read-write access, so races can ensue when
624     some other thread modifies the data stored in that memory area.
625    
626     =item eio_custom (void (*)(eio_req *) execute, int pri, eio_cb cb, void *data)
627 root 1.7
628     Executes a custom request, i.e., a user-specified callback.
629    
630     The callback gets the C<eio_req *> as parameter and is expected to read
631     and modify any request-specific members. Specifically, it should set C<<
632     req->result >> to the result value, just like other requests.
633    
634     Here is an example that simply calls C<open>, like C<eio_open>, but it
635     uses the C<data> member as filename and uses a hardcoded C<O_RDONLY>. If
636     you want to pass more/other parameters, you either need to pass some
637     struct or so via C<data> or provide your own wrapper using the low-level
638     API.
639    
640     static int
641     my_open_done (eio_req *req)
642     {
643     int fd = req->result;
644    
645     return 0;
646     }
647    
648     static void
649     my_open (eio_req *req)
650     {
651     req->result = open (req->data, O_RDONLY);
652     }
653    
654     eio_custom (my_open, 0, my_open_done, "/etc/passwd");
655    
656 root 1.9 =item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)
657 root 1.7
658 root 1.18 This is a request that takes C<delay> seconds to execute, but otherwise
659 root 1.7 does nothing - it simply puts one of the worker threads to sleep for this
660     long.
661    
662     This request can be used to artificially increase load, e.g. for debugging
663     or benchmarking reasons.
664    
665 root 1.9 =item eio_nop (int pri, eio_cb cb, void *data)
666 root 1.7
667     This request does nothing, except go through the whole request cycle. This
668     can be used to measure latency or in some cases to simplify code, but is
669     not really of much use.
670    
671     =back
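
To illustrate what the page-wise access of C<eio_mtouch> amounts to, here
is a minimal synchronous sketch. It is not libeio's implementation: the
name C<touch_pages> is made up, C<modify> stands in for the
C<EIO_MT_MODIFY> flag, and, unlike the description above, the sketch
touches the first octet I<inside> the given area for the first page, so
that it never accesses memory outside that area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* hypothetical helper: touch one octet in every page overlapping the */
/* given memory area and return the number of pages touched */
size_t
touch_pages (void *addr, size_t length, int modify)
{
  long ps = sysconf (_SC_PAGESIZE);
  uintptr_t cur, end;
  size_t touched = 0;

  if (ps <= 0 || length == 0)
    return 0;

  assert (addr != 0);

  cur = (uintptr_t)addr;
  end = cur + length;

  while (cur < end)
    {
      volatile unsigned char *p = (volatile unsigned char *)cur;
      unsigned char octet = *p; /* the read access pages the memory in */

      if (modify)
        *p = octet; /* writing the same value back dirties the page */

      ++touched;

      /* continue at the start of the next page */
      cur = (cur / (uintptr_t)ps + 1) * (uintptr_t)ps;
    }

  return touched;
}
```

As with the real request, the write-back is an unlocked read-write access,
so the same caveat about other threads modifying the area applies.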
672    
673     =head3 GROUPING AND LIMITING REQUESTS
674 root 1.1
675 root 1.12 There is one more rather special request, C<eio_grp>. Instead of doing
676     something itself, it acts as a container for other eio
677     requests.
678    
679     There are two primary use cases for this: a) bundle many requests into a
680     single, composite, request with a definite callback and the ability to
681     cancel the whole request with its subrequests and b) limiting the number
682     of "active" requests.
683    
684 root 1.18 Further below you will find more discussion of these topics - first
685     follows the reference section detailing the request generator and other
686     methods.
687 root 1.12
688     =over 4
689    
690 root 1.17 =item eio_req *grp = eio_grp (eio_cb cb, void *data)
691    
692 root 1.23 Creates, submits and returns a group request. Note that it doesn't have a
693     priority, unlike all other requests.
694 root 1.17
695     =item eio_grp_add (eio_req *grp, eio_req *req)
696    
697     Adds a request to the request group.
698    
699     =item eio_grp_cancel (eio_req *grp)
700    
701     Cancels all requests I<in> the group, but I<not> the group request
702 root 1.23 itself. You can cancel the group request I<and> all subrequests via a
703     normal C<eio_cancel> call.
704 root 1.17
705 root 1.23 =back
706    
707     =head4 GROUP REQUEST LIFETIME
708    
709     Left alone, a group request will instantly move to the pending state and
710     will be finished at the next call of C<eio_poll>.
711    
712     Their usefulness stems from the fact that, if a subrequest is added to a
713     group I<before> a call to C<eio_poll>, via C<eio_grp_add>, then the group
714     will not finish until all the subrequests have finished.
715    
716     So the usage cycle of a group request is like this: after it is created,
717     you normally instantly add a subrequest. If none is added, the group
718     request will finish on its own. As long as subrequests are added before
719     the group request is finished it will be kept from finishing, that is the
720     callbacks of any subrequests can, in turn, add more requests to the group,
721     and as long as any requests are active, the group request itself will not
722     finish.
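
The lifetime rule can be illustrated with a small toy model. It is
emphatically I<not> libeio code and does not reproduce libeio's internals:
C<struct toy_grp> and the C<toy_*> functions are made up for this
illustration, and the model only tracks how many subrequests are still
active and whether the group has already been seen by a poll.

```c
#include <assert.h>

/* toy model of a group request */
struct toy_grp
{
  int active;   /* subrequests that were added but have not finished */
  int polled;   /* set once a poll has seen the group itself */
  int finished; /* set once the group callback would have been called */
};

/* models eio_grp_add: the group now has to wait for one more request */
void
toy_grp_add (struct toy_grp *grp)
{
  assert (!grp->finished);
  ++grp->active;
}

/* models the poll seeing the group: finish it only if nothing is active */
void
toy_grp_poll (struct toy_grp *grp)
{
  grp->polled = 1;

  if (!grp->active)
    grp->finished = 1;
}

/* models a subrequest finishing: the last one finishes a polled group */
void
toy_sub_done (struct toy_grp *grp)
{
  assert (grp->active > 0);

  if (!--grp->active && grp->polled)
    grp->finished = 1;
}
```

In this model, a group that never receives a subrequest finishes at its
first poll, while a group whose subrequest callbacks keep adding requests
before they complete stays unfinished until the last one is done.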
723    
724     =head4 CREATING COMPOSITE REQUESTS
725    
726     Imagine you wanted to create an C<eio_load> request that opens a file,
727     reads it and closes it. This means it has to execute at least three eio
728     requests, but for various reasons it might be nice if that request looked
729     like any other eio request.
730    
731     This can be done with groups:
732    
733     =over 4
734    
735     =item 1) create the request object
736    
737     Create a group that contains all further requests. This is the request you
738     can return as "the load request".
739 root 1.12
740 root 1.23 =item 2) open the file, maybe
741    
742     Next, open the file with C<eio_open> and add the request to the group
743     request, and you are finished setting up the request.
744    
745     If, for some reason, you cannot C<eio_open> (path is a null pointer?) you
746     can set C<< grp->result >> to C<-1> to signal an error and let the group
747     request finish on its own.
748    
749     =item 3) open callback adds more requests
750    
751     In the open callback, if the open was not successful, copy C<<
752     req->errorno >> to C<< grp->errorno >> and set C<< grp->result >> to
753     C<-1> to signal an error.
754    
755     Otherwise, malloc some memory or so and issue a read request, adding the
756     read request to the group.
757    
758     =item 4) continue issuing requests till finished
759    
760     In the read callback, check for errors and possibly continue with
761     C<eio_close> or any other eio request in the same way.
762    
763     As soon as no new requests are added, the group request will finish. Make
764     sure you I<always> set C<< grp->result >> to some sensible value.
765 root 1.12
766     =back
767    
768 root 1.23 =head4 REQUEST LIMITING
769 root 1.12
770    
771 root 1.1 #TODO
772    
773 root 1.7 void eio_grp_limit (eio_req *grp, int limit);
774 root 1.1
775    
776     =back
777    
778    
779     =head1 LOW LEVEL REQUEST API
780    
781     #TODO
782    
783 root 1.7
784     =head1 ANATOMY AND LIFETIME OF AN EIO REQUEST
785    
786     A request is represented by a structure of type C<eio_req>. To initialise
787     it, clear it to all zero bytes:
788    
789 root 1.17 eio_req req;
790 root 1.7
791 root 1.17 memset (&req, 0, sizeof (req));
792 root 1.7
793     A more common way to initialise a new C<eio_req> is to use C<calloc>:
794    
795 root 1.17 eio_req *req = calloc (1, sizeof (*req));
796 root 1.7
797     In either case, libeio neither allocates, initialises nor frees the
798     C<eio_req> structure for you - it merely uses it.
799    
800     zero
801    
802     #TODO
803    
804 root 1.8 =head2 CONFIGURATION
805    
806     The functions in this section can sometimes be useful, but the default
807     configuration will do in most cases, so you should skip this section on
808     first reading.
809    
810     =over 4
811    
812     =item eio_set_max_poll_time (eio_tstamp nseconds)
813    
814     This causes C<eio_poll ()> to return after it has detected that it was
815     running for C<nseconds> seconds or longer (this number can be fractional).
816    
817     This can be used to limit the amount of time spent handling eio requests.
818     For example, in interactive programs, you might want to limit this time to
819     C<0.01> seconds or so.
820    
821     Note that:
822    
823 root 1.18 =over 4
824    
825     =item a) libeio doesn't know how long your request callbacks take, so the
826     time spent in C<eio_poll> is up to one callback invocation longer than
827     this interval.
828 root 1.8
829 root 1.18 =item b) this is implemented by calling C<gettimeofday> after each
830     request, which can be costly.
831 root 1.8
832 root 1.18 =item c) at least one request will be handled.
833    
834     =back
835 root 1.8
836     =item eio_set_max_poll_reqs (unsigned int nreqs)
837    
838     When C<nreqs> is non-zero, then C<eio_poll> will not handle more than
839     C<nreqs> requests per invocation. This is a less costly way to limit the
840     amount of work done by C<eio_poll> than setting a time limit.
841    
842     If you know your callbacks are generally fast, you could use this to
843     encourage interactiveness in your programs by setting it to C<10>, C<100>
844     or even C<1000>.
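
How a time limit and a request limit interact can be illustrated with a
small stand-alone model. This is only a sketch of the behaviour described
for C<eio_set_max_poll_time> and C<eio_set_max_poll_reqs>, not libeio's
code: C<poll_model>, C<poll_model_now> and C<poll_model_count> are made-up
names, and treating a zero C<max_time> as "no limit" is an assumption of the
model.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Current time as fractional seconds since the epoch. */
static double
poll_model_now (void)
{
  struct timeval tv;

  gettimeofday (&tv, 0);
  return tv.tv_sec + tv.tv_usec * 1e-6;
}

/* Example callback for the model: just counts its invocations. */
static unsigned int poll_model_calls;

static void
poll_model_count (void)
{
  ++poll_model_calls;
}

/*
 * Toy model, NOT libeio's implementation: finish up to npending requests
 * by calling finish (), stopping early once max_reqs requests were handled
 * (if max_reqs is non-zero) or max_time seconds have elapsed (if max_time
 * is non-zero). Returns the number of requests handled.
 */
static unsigned int
poll_model (unsigned int npending, unsigned int max_reqs, double max_time,
            void (*finish) (void))
{
  unsigned int handled = 0;
  double start = max_time ? poll_model_now () : 0.;

  assert (finish != 0);

  for (; npending; --npending)
    {
      finish ();  /* runs to completion, however long it takes */
      ++handled;  /* so at least one request is always handled */

      if (max_reqs && handled >= max_reqs)
        break;

      /* the clock is only consulted between requests */
      if (max_time && poll_model_now () - start >= max_time)
        break;
    }

  return handled;
}
```

For instance, C<poll_model (5, 2, 0., poll_model_count)> handles two of five
pending requests, while a time limit alone always lets at least one request
through, because the clock is checked only after a callback has returned.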
845    
846     =item eio_set_min_parallel (unsigned int nthreads)
847    
848     Make sure libeio can handle at least this many requests in parallel. It
849     might be able to handle more.
850    
851     =item eio_set_max_parallel (unsigned int nthreads)
852    
853     Set the maximum number of threads that libeio will spawn.
854    
855     =item eio_set_max_idle (unsigned int nthreads)
856    
857     Libeio uses threads internally to handle most requests, and will start and stop threads on demand.
858    
859     This call can be used to limit the number of idle threads (threads without
860     work to do): libeio will keep some threads idle in preparation for more
861     requests, but never more than C<nthreads> threads.
862    
863     In addition to this, libeio will also stop threads when they are idle for
864     a few seconds, regardless of this setting.
865    
866     =item unsigned int eio_nthreads ()
867    
868     Return the number of worker threads currently running.
869    
870     =item unsigned int eio_nreqs ()
871    
872     Return the number of requests currently handled by libeio. This is the
873     total number of requests that have been submitted to libeio, but not yet
874     destroyed.
875    
876     =item unsigned int eio_nready ()
877    
878     Returns the number of ready requests, i.e. requests that have been
879     submitted but have not yet entered the execution phase.
880    
881     =item unsigned int eio_npending ()
882    
883     Returns the number of pending requests, i.e. requests that have been
884     executed and have results, but have not been finished yet by a call to
885     C<eio_poll>.
886    
887     =back
888    
889 root 1.1 =head1 EMBEDDING
890    
891     Libeio can be embedded directly into programs. This functionality is not
892     documented and not (yet) officially supported.
893    
894 root 1.3 Note that, when including C<libeio.m4>, you are responsible for defining
895     the compilation environment (C<_LARGEFILE_SOURCE>, C<_GNU_SOURCE> etc.).
896    
897 root 1.2 If you need to know how, check the C<IO::AIO> perl module, which does
898 root 1.1 exactly that.
899    
900    
901 root 1.4 =head1 COMPILETIME CONFIGURATION
902    
903     These symbols, if used, must be defined when compiling F<eio.c>.
904    
905     =over 4
906    
907     =item EIO_STACKSIZE
908    
909     This symbol governs the stack size for each eio thread. Libeio itself
910     was written to use very little stack space, but when using C<EIO_CUSTOM>
911     requests, you might want to increase this.
912    
913     If this symbol is undefined (the default) then libeio will use its default
914     stack size (C<sizeof (long) * 4096> currently). If it is defined as
915     C<0>, then the default operating system stack size will be used. In all
916     other cases, the value must be an expression that evaluates to the desired
917     stack size.
918    
919     =back
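
The three cases for C<EIO_STACKSIZE> can be sketched as follows. This is a
stand-alone illustration, not code from F<eio.c>: the helper
C<sketch_stacksize> is a made-up name, and raising the value to a
caller-supplied minimum is an assumption added for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Case 1: the symbol is undefined, so fall back to the default
 * quoted in the text above. */
#ifndef EIO_STACKSIZE
# define EIO_STACKSIZE (sizeof (long) * 4096)
#endif

/*
 * Hypothetical helper: returns the stack size in bytes to request for a
 * worker thread, or 0 to mean "keep the operating system default".
 * os_minimum is the smallest size the platform accepts (assumption:
 * the caller knows it, for example from PTHREAD_STACK_MIN).
 */
static size_t
sketch_stacksize (size_t os_minimum)
{
  size_t want = (size_t)(EIO_STACKSIZE);

  if (want != 0 && want < os_minimum)
    want = os_minimum; /* case 3: explicit size, raised if too small */

  /* case 2 (want == 0) is passed through unchanged */
  assert (want == 0 || want >= os_minimum);
  return want;
}
```

A non-zero result would typically be passed to
C<pthread_attr_setstacksize> before creating the thread. To select a
different size, define the symbol on the compiler command line when
building F<eio.c>, for example with C<-DEIO_STACKSIZE=262144>.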
920    
921    
922 root 1.1 =head1 PORTABILITY REQUIREMENTS
923    
924     In addition to a working ISO-C implementation, libeio relies on a few
925     additional extensions:
926    
927     =over 4
928    
929     =item POSIX threads
930    
931     To be portable, this module uses threads, specifically, the POSIX threads
932     library must be available (and working, which partially excludes many xBSD
933     systems, where C<fork ()> is buggy).
934    
935     =item POSIX-compatible filesystem API
936    
937     This is actually a harder portability requirement: The libeio API is quite
938     demanding regarding POSIX API calls (symlinks, user/group management
939     etc.).
940    
941     =item C<double> must hold a time value in seconds with enough accuracy
942    
943     The type C<double> is used to represent timestamps. It is required to
944     have at least 51 bits of mantissa (and 9 bits of exponent), which is good
945     enough for timestamps at least into the year 4000. This requirement is
946     fulfilled by implementations of IEEE 754 (basically all existing ones).
947    
948     =back
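
Whether a platform's C<double> meets the mantissa requirement can be checked
with the standard F<float.h> macros. The following is a minimal sketch, not
part of libeio: C<sketch_double_ok> and C<sketch_resolution_at> are made-up
names, and the figure of roughly C<6.4e10> seconds for the year 4000 is an
approximation.

```c
#include <assert.h>
#include <float.h>

/* Returns nonzero if double is binary and has at least 51 mantissa bits. */
static int
sketch_double_ok (void)
{
  return FLT_RADIX == 2 && DBL_MANT_DIG >= 51;
}

/*
 * Upper bound, in seconds, on the spacing between adjacent double values
 * near the timestamp t (the spacing never exceeds t * DBL_EPSILON for
 * normal, positive t).
 */
static double
sketch_resolution_at (double t)
{
  assert (t > 0.);
  return t * DBL_EPSILON;
}
```

On an IEEE 754 system, C<sketch_resolution_at (6.4e10)> stays well below a
tenth of a millisecond, so timestamps around the year 4000 still carry
sub-millisecond resolution.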
949    
950     If you know of other additional requirements, drop me a note.
951    
952    
953     =head1 AUTHOR
954    
955     Marc Lehmann <libeio@schmorp.de>.
956