/cvs/libeio/eio.pod

Comparing libeio/eio.pod (file contents):
Revision 1.2 by root, Fri Jul 11 10:54:50 2008 UTC vs.
Revision 1.34 by root, Mon Mar 11 07:59:41 2013 UTC

11The newest version of this document is also available as an html-formatted 11The newest version of this document is also available as an html-formatted
12web page you might find easier to navigate when reading it for the first 12web page you might find easier to navigate when reading it for the first
13time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>. 13time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>.
14 14
15Note that this library is a by-product of the C<IO::AIO> perl 15Note that this library is a by-product of the C<IO::AIO> perl
16module, and many of the subtler points regarding requets lifetime 16module, and many of the subtler points regarding requests lifetime
17and so on are only documented in its documentation at the 17and so on are only documented in its documentation at the
18moment: L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>. 18moment: L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>.
19 19
20=head2 FEATURES 20=head2 FEATURES
21 21
22This library provides fully asynchronous versions of most POSIX functions 22This library provides fully asynchronous versions of most POSIX functions
23dealign with I/O. Unlike most asynchronous libraries, this not only 23dealing with I/O. Unlike most asynchronous libraries, this not only
24includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and 24includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and
25similar functions, as well as less rarely ones such as C<mknod>, C<futime> 25similar functions, as well as less rarely ones such as C<mknod>, C<futime>
26or C<readlink>. 26or C<readlink>.
27 27
28It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and 28It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and
29FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with 29FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with
30emulation elsewhere>). 30emulation elsewhere).
31 31
32The goal is to enbale you to write fully non-blocking programs. For 32The goal is to enable you to write fully non-blocking programs. For
33example, in a game server, you would not want to freeze for a few seconds 33example, in a game server, you would not want to freeze for a few seconds
34just because the server is running a backup and you happen to call 34just because the server is running a backup and you happen to call
35C<readdir>. 35C<readdir>.
36 36
37=head2 TIME REPRESENTATION 37=head2 TIME REPRESENTATION
38 38
39Libeio represents time as a single floating point number, representing the 39Libeio represents time as a single floating point number, representing the
40(fractional) number of seconds since the (POSIX) epoch (somewhere near 40(fractional) number of seconds since the (POSIX) epoch (somewhere near
41the beginning of 1970, details are complicated, don't ask). This type is 41the beginning of 1970, details are complicated, don't ask). This type is
42called C<eio_tstamp>, but it is guarenteed to be of type C<double> (or 42called C<eio_tstamp>, but it is guaranteed to be of type C<double> (or
43better), so you can freely use C<double> yourself. 43better), so you can freely use C<double> yourself.
44 44
45Unlike the name component C<stamp> might indicate, it is also used for 45Unlike the name component C<stamp> might indicate, it is also used for
46time differences throughout libeio. 46time differences throughout libeio.
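
The single-number time representation can be sketched in plain C. This is only an illustration: C<my_tstamp> is a hypothetical stand-in for C<eio_tstamp> (which the text guarantees to be at least a C<double>), and both helper functions are made up for this example and are not part of libeio.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for eio_tstamp: fractional seconds as a double. */
typedef double my_tstamp;

/* Convert a struct timespec to fractional seconds since the epoch. */
my_tstamp
tstamp_from_timespec (struct timespec ts)
{
  assert (ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
  return (my_tstamp)ts.tv_sec + (my_tstamp)ts.tv_nsec / 1e9;
}

/* The same type also serves as a time difference, in seconds. */
my_tstamp
tstamp_elapsed (my_tstamp earlier, my_tstamp later)
{
  return later - earlier;
}
```

Because absolute times and differences share one type, no separate duration type or conversion step is needed.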
47 47
48=head2 FORK SUPPORT 48=head2 FORK SUPPORT
49 49
50Calling C<fork ()> is fully supported by this module. It is implemented in these steps: 50Usage of pthreads in a program changes the semantics of fork
51considerably. Specifically, only async-safe functions can be called after
52fork. Libeio uses pthreads, so this applies, and makes using fork hard for
53anything but relatively simple fork + exec uses.
51 54
52 1. wait till all requests in "execute" state have been handled 55This library only works in the process that initialised it: Forking is
53 (basically requests that are already handed over to the kernel). 56fully supported, but using libeio in any other process than the one that
54 2. fork 57called C<eio_init> is not.
55 3. in the parent, continue business as usual, done 58
56 4. in the child, destroy all ready and pending requests and free the 59You might get around this by not I<using> libeio before (or after) forking in
57 memory used by the worker threads. This gives you a fully empty 60the parent, and using it in the child afterwards. You could also try to
58 libeio queue. 61call the C<eio_init> function again in the child, which will brutally
62reinitialise all data structures, which isn't POSIX conformant, but
63typically works.
64
65Otherwise, the only recommendation you should follow is: treat fork code
66the same way you treat signal handlers, and only ever call C<eio_init> in
67the process that uses it, and only once ever.
59 68
60=head1 INITIALISATION/INTEGRATION 69=head1 INITIALISATION/INTEGRATION
61 70
62Before you can call any eio functions you first have to initialise the 71Before you can call any eio functions you first have to initialise the
63library. The library integrates into any event loop, but can also be used 72library. The library integrates into any event loop, but can also be used
72This function initialises the library. On success it returns C<0>, on 81This function initialises the library. On success it returns C<0>, on
73failure it returns C<-1> and sets C<errno> appropriately. 82failure it returns C<-1> and sets C<errno> appropriately.
74 83
75It accepts two function pointers specifying callbacks as argument, both of 84It accepts two function pointers specifying callbacks as argument, both of
76which can be C<0>, in which case the callback isn't called. 85which can be C<0>, in which case the callback isn't called.
86
87There is currently no way to change these callbacks later, or to
88"uninitialise" the library again.
77 89
78=item want_poll callback 90=item want_poll callback
79 91
80The C<want_poll> callback is invoked whenever libeio wants attention (i.e. 92The C<want_poll> callback is invoked whenever libeio wants attention (i.e.
81it wants to be polled by calling C<eio_poll>). It is "edge-triggered", 93it wants to be polled by calling C<eio_poll>). It is "edge-triggered",
97handled or C<done_poll> has been called, which signals the same. 109handled or C<done_poll> has been called, which signals the same.
98 110
99Note that C<eio_poll> might return after C<done_poll> and C<want_poll> 111Note that C<eio_poll> might return after C<done_poll> and C<want_poll>
100have been called again, so watch out for races in your code. 112have been called again, so watch out for races in your code.
101 113
102As with C<want_poll>, this callback is called while lcoks are being held, 114As with C<want_poll>, this callback is called while locks are being held,
103so you I<must not call any libeio functions from within this callback>. 115so you I<must not call any libeio functions from within this callback>.
104 116
105=item int eio_poll () 117=item int eio_poll ()
106 118
107This function has to be called whenever there are pending requests that 119This function has to be called whenever there are pending requests that
119=back 131=back
120 132
121For libev, you would typically use an C<ev_async> watcher: the 133For libev, you would typically use an C<ev_async> watcher: the
122C<want_poll> callback would invoke C<ev_async_send> to wake up the event 134C<want_poll> callback would invoke C<ev_async_send> to wake up the event
123loop. Inside the callback set for the watcher, one would call C<eio_poll 135loop. Inside the callback set for the watcher, one would call C<eio_poll
124()> (followed by C<ev_async_send> again if C<eio_poll> indicates that not 136()>.
125all requests have been handled yet). The race is taken care of because 137
126libev resets/rearms the async watcher before calling your callback, 138If C<eio_poll ()> is configured to not handle all results in one go
127and therefore, before calling C<eio_poll>. This might result in (some) 139(i.e. it returns C<-1>) then you should start an idle watcher that calls
128spurious wake-ups, but is generally harmless. 140C<eio_poll> until it returns something C<!= -1>.
141
142A full-featured connector between libeio and libev would look as follows
143(if C<eio_poll> is handling all requests, it can of course be simplified a
144lot by removing the idle watcher logic):
145
146 static struct ev_loop *loop;
147 static ev_idle repeat_watcher;
148 static ev_async ready_watcher;
149
150 /* idle watcher callback, only used when eio_poll */
151 /* didn't handle all results in one call */
152 static void
153 repeat (EV_P_ ev_idle *w, int revents)
154 {
155 if (eio_poll () != -1)
156 ev_idle_stop (EV_A_ w);
157 }
158
159 /* eio has some results, process them */
160 static void
161 ready (EV_P_ ev_async *w, int revents)
162 {
163 if (eio_poll () == -1)
164 ev_idle_start (EV_A_ &repeat_watcher);
165 }
166
167 /* wake up the event loop */
168 static void
169 want_poll (void)
170 {
171 ev_async_send (loop, &ready_watcher);
172 }
173
174 void
175 my_init_eio ()
176 {
177 loop = EV_DEFAULT;
178
179 ev_idle_init (&repeat_watcher, repeat);
180 ev_async_init (&ready_watcher, ready);
181 ev_async_start (loop, &ready_watcher);
182
183 eio_init (want_poll, 0);
184 }
129 185
130For most other event loops, you would typically use a pipe - the event 186For most other event loops, you would typically use a pipe - the event
131loop should be told to wait for read readyness on the read end. In 187loop should be told to wait for read readiness on the read end. In
132C<want_poll> you would write a single byte, in C<done_poll> you would try 188C<want_poll> you would write a single byte, in C<done_poll> you would try
133to read that byte, and in the callback for the read end, you would call 189to read that byte, and in the callback for the read end, you would call
134C<eio_poll>. The race is avoided here because the event loop should invoke 190C<eio_poll>.
135your callback again and again until the byte has been read (as the pipe 191
136read callback does not read it, only C<done_poll>). 192You don't have to take special care in the case C<eio_poll> doesn't handle
193all requests, as the done callback will not be invoked, so the event loop
194will still signal readiness for the pipe until I<all> results have been
195processed.
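
The pipe technique described above could look roughly like the following sketch. It uses only POSIX calls and does not include libeio itself; the names C<wake_init>, C<my_want_poll>, C<my_done_poll> and C<wake_pending> are hypothetical. The two callbacks have the C<void (void)> shape that would be passed to C<eio_init>, and C<wake_pending> stands in for whatever readiness check your event loop performs on the read end.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int wake_fd[2] = { -1, -1 }; /* [0] = read end, [1] = write end */

/* Create the pipe and make both ends non-blocking. Returns 0 or -1. */
int
wake_init (void)
{
  if (pipe (wake_fd) < 0)
    return -1;

  for (int i = 0; i < 2; ++i)
    {
      int fl = fcntl (wake_fd[i], F_GETFL);

      if (fl < 0 || fcntl (wake_fd[i], F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    }

  return 0;
}

/* Candidate want_poll callback: write one byte to wake the event loop. */
void
my_want_poll (void)
{
  char dummy = 0;
  ssize_t n;

  assert (wake_fd[1] >= 0);
  n = write (wake_fd[1], &dummy, 1);
  (void)n; /* nothing useful can be done on failure here */
}

/* Candidate done_poll callback: take the byte out of the pipe again. */
void
my_done_poll (void)
{
  char dummy;
  ssize_t n = read (wake_fd[0], &dummy, 1);

  (void)n;
}

/* Returns 1 if the read end is readable right now, 0 if not, -1 on error. */
int
wake_pending (void)
{
  struct pollfd pfd = { .fd = wake_fd[0], .events = POLLIN };
  int r = poll (&pfd, 1, 0);

  if (r < 0)
    return -1;

  return (r > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Neither callback calls into libeio, which matters because both are invoked while libeio holds its locks. The read end stays readable until C<my_done_poll> has run, so a level-triggered event loop keeps invoking the handler that calls C<eio_poll>.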
196
197
198=head1 HIGH LEVEL REQUEST API
199
200Libeio has both a high-level API, which consists of calling a request
201function with a callback to be called on completion, and a low-level API
202where you fill out request structures and submit them.
203
204This section describes the high-level API.
205
206=head2 REQUEST SUBMISSION AND RESULT PROCESSING
207
208You submit a request by calling the relevant C<eio_TYPE> function with the
209required parameters, a callback of type C<int (*eio_cb)(eio_req *req)>
210(called C<eio_cb> below) and a freely usable C<void *data> argument.
211
212The return value will either be 0, in case something went really wrong
213(which can basically only happen on very fatal errors, such as C<malloc>
214returning 0, which is rather unlikely), or a pointer to the newly-created
215and submitted C<eio_req *>.
216
217The callback will be called with an C<eio_req *> which contains the
218results of the request. The members you can access inside that structure
219vary from request to request, except for:
220
221=over 4
222
223=item C<ssize_t result>
224
225This contains the result value from the call (usually the same as the
226syscall of the same name).
227
228=item C<int errorno>
229
230This contains the value of C<errno> after the call.
231
232=item C<void *data>
233
234The C<void *data> member simply stores the value of the C<data> argument.
235
236=back
237
238Members not explicitly described as accessible must not be
239accessed. Specifically, there is no guarantee that any members will still
240have the value they had when the request was submitted.
241
242The return value of the callback is normally C<0>, which tells libeio to
243continue normally. If a callback returns a nonzero value, libeio will
244stop processing results (in C<eio_poll>) and will return the value to its
245caller.
246
247Memory areas passed to libeio wrappers must stay valid as long as a
248request executes, with the exception of paths, which are being copied
249internally. Any memory libeio itself allocates will be freed after the
250finish callback has been called. If you want to manage all memory passed
251to libeio yourself you can use the low-level API.
252
253For example, to open a file, you could do this:
254
255 static int
256 file_open_done (eio_req *req)
257 {
258 if (req->result < 0)
259 {
260 /* open() returned -1 */
261 errno = req->errorno;
262 perror ("open");
263 }
264 else
265 {
266 int fd = req->result;
267 /* now we have the new fd in fd */
268 }
269
270 return 0;
271 }
272
273 /* the first three arguments are passed to open(2) */
274 /* the remaining are priority, callback and data */
275 if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
276 abort (); /* something went wrong, we will all die!!! */
277
278Note that you additionally need to call C<eio_poll> when the C<want_poll>
279callback indicates that requests are ready to be processed.
280
281=head2 CANCELLING REQUESTS
282
283Sometimes the need for a request goes away before the request is
284finished. In that case, one can cancel the request by a call to
285C<eio_cancel>:
286
287=over 4
288
289=item eio_cancel (eio_req *req)
290
291Cancel the request (and all its subrequests). If the request is currently
292executing it might still continue to execute, and in other cases it might
293still take a while till the request is cancelled.
294
295Even if cancelled, the finish callback will still be invoked - the
296callbacks of all cancellable requests need to check whether the request
297has been cancelled by calling C<EIO_CANCELLED (req)>:
298
299 static int
300 my_eio_cb (eio_req *req)
301 {
302 if (EIO_CANCELLED (req))
303 return 0;
304 }
305
306In addition, cancelled requests will I<either> have C<< req->result >>
307set to C<-1> and C<errno> to C<ECANCELED>, or I<otherwise> they were
308successfully executed, despite being cancelled (e.g. when they have
309already been executed at the time they were cancelled).
310
311C<EIO_CANCELLED> is still true for requests that have successfully
312executed, as long as C<eio_cancel> was called on them at some point.
313
314=back
315
316=head2 AVAILABLE REQUESTS
317
318The following request functions are available. I<All> of them return the
319C<eio_req *> on success and C<0> on failure, and I<all> of them have the
320same three trailing arguments: C<pri>, C<cb> and C<data>. The C<cb> is
321mandatory, but in most cases, you pass in C<0> as C<pri> and C<0> or some
322custom data value as C<data>.
323
324=head3 POSIX API WRAPPERS
325
326These requests simply wrap the POSIX call of the same name, with the same
327arguments. If a function is not implemented by the OS and cannot be emulated
328in some way, then all of these return C<-1> and set C<errorno> to C<ENOSYS>.
329
330=over 4
331
332=item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data)
333
334=item eio_truncate (const char *path, off_t offset, int pri, eio_cb cb, void *data)
335
336=item eio_chown (const char *path, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)
337
338=item eio_chmod (const char *path, mode_t mode, int pri, eio_cb cb, void *data)
339
340=item eio_mkdir (const char *path, mode_t mode, int pri, eio_cb cb, void *data)
341
342=item eio_rmdir (const char *path, int pri, eio_cb cb, void *data)
343
344=item eio_unlink (const char *path, int pri, eio_cb cb, void *data)
345
346=item eio_utime (const char *path, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)
347
348=item eio_mknod (const char *path, mode_t mode, dev_t dev, int pri, eio_cb cb, void *data)
349
350=item eio_link (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
351
352=item eio_symlink (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
353
354=item eio_rename (const char *path, const char *new_path, int pri, eio_cb cb, void *data)
355
356=item eio_mlock (void *addr, size_t length, int pri, eio_cb cb, void *data)
357
358=item eio_close (int fd, int pri, eio_cb cb, void *data)
359
360=item eio_sync (int pri, eio_cb cb, void *data)
361
362=item eio_fsync (int fd, int pri, eio_cb cb, void *data)
363
364=item eio_fdatasync (int fd, int pri, eio_cb cb, void *data)
365
366=item eio_futime (int fd, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)
367
368=item eio_ftruncate (int fd, off_t offset, int pri, eio_cb cb, void *data)
369
370=item eio_fchmod (int fd, mode_t mode, int pri, eio_cb cb, void *data)
371
372=item eio_fchown (int fd, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)
373
374=item eio_dup2 (int fd, int fd2, int pri, eio_cb cb, void *data)
375
376These have the same semantics as the syscall of the same name, their
377return value is available as C<< req->result >> later.
378
379=item eio_read (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)
380
381=item eio_write (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)
382
383These two requests are called C<read> and C<write>, but actually wrap
384C<pread> and C<pwrite>. On systems that lack these calls (such as cygwin),
385libeio uses lseek/read_or_write/lseek and a mutex to serialise the
386requests, so all these requests run serially and do not disturb each
387other. However, they still disturb the file offset while they run, so it's
388not safe to call these functions concurrently with non-libeio functions on
389the same fd on these systems.
390
391Not surprisingly, pread and pwrite are not thread-safe on Darwin (OS/X),
392so it is advised not to submit multiple requests on the same fd on this
393horrible pile of garbage.
394
395=item eio_mlockall (int flags, int pri, eio_cb cb, void *data)
396
397Like C<mlockall>, but the flag value constants are called
398C<EIO_MCL_CURRENT> and C<EIO_MCL_FUTURE>.
399
400=item eio_msync (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)
401
402Just like msync, except that the flag values are called C<EIO_MS_ASYNC>,
403C<EIO_MS_INVALIDATE> and C<EIO_MS_SYNC>.
404
405=item eio_readlink (const char *path, int pri, eio_cb cb, void *data)
406
407If successful, the path read by C<readlink(2)> can be accessed via C<<
408req->ptr2 >> and is I<NOT> null-terminated, with the length specified as
409C<< req->result >>.
410
411 if (req->result >= 0)
412 {
413 char *target = strndup ((char *)req->ptr2, req->result);
414
415 free (target);
416 }
417
418=item eio_realpath (const char *path, int pri, eio_cb cb, void *data)
419
420Similar to the realpath libc function, but unlike that one, C<<
421req->result >> is C<-1> on failure. On success, the result is the length
422of the returned path in C<ptr2> (which is I<NOT> 0-terminated) - this is
423similar to readlink.
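
Since neither C<eio_readlink> nor C<eio_realpath> terminates the path in C<ptr2>, a callback that needs a C string has to copy it. A minimal sketch, assuming a hypothetical helper named C<dup_counted> (not part of libeio) that takes the pointer and the length from C<< req->result >>:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Copy len bytes from ptr and append a terminating 0 byte.
   Returns NULL if len is negative or allocation fails; caller frees. */
char *
dup_counted (const void *ptr, ssize_t len)
{
  char *s;

  assert (ptr != NULL);

  if (len < 0)
    return NULL;

  s = malloc ((size_t)len + 1);
  if (!s)
    return NULL;

  memcpy (s, ptr, (size_t)len);
  s[len] = 0;

  return s;
}
```

Unlike C<strndup>, this copies exactly C<len> bytes without scanning for an embedded 0 and relies only on standard C library calls.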
424
425=item eio_stat (const char *path, int pri, eio_cb cb, void *data)
426
427=item eio_lstat (const char *path, int pri, eio_cb cb, void *data)
428
429=item eio_fstat (int fd, int pri, eio_cb cb, void *data)
430
431Stats a file - if C<< req->result >> indicates success, then you can
432access the C<struct stat>-like structure via C<< req->ptr2 >>:
433
434 EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;
435
436=item eio_statvfs (const char *path, int pri, eio_cb cb, void *data)
437
438=item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data)
439
440Stats a filesystem - if C<< req->result >> indicates success, then you can
441access the C<struct statvfs>-like structure via C<< req->ptr2 >>:
442
443 EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;
444
445=back
446
447=head3 READING DIRECTORIES
448
449Reading directories sounds simple, but can be rather demanding, especially
450if you want to do stuff such as traversing a directory hierarchy or
451processing all files in a directory. Libeio can assist these complex tasks
452with its C<eio_readdir> call.
453
454=over 4
455
456=item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data)
457
458This is a very complex call. It basically reads through a whole directory
459(via the C<opendir>, C<readdir> and C<closedir> calls) and returns either
460the names or an array of C<struct eio_dirent>, depending on the C<flags>
461argument.
462
463The C<< req->result >> indicates either the number of files found, or
464C<-1> on error. On success, null-terminated names can be found as C<< req->ptr2 >>,
465and C<struct eio_dirent>s, if requested by C<flags>, can be found via C<<
466req->ptr1 >>.
467
468Here is an example that prints all the names:
469
470 int i;
471 char *names = (char *)req->ptr2;
472
473 for (i = 0; i < req->result; ++i)
474 {
475 printf ("name #%d: %s\n", i, names);
476
477 /* move to next name */
478 names += strlen (names) + 1;
479 }
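
If random access to the names is needed, the packed buffer can be indexed once. The following is a sketch only; C<index_names> is a hypothetical helper, not a libeio function. It expects the buffer from C<< req->ptr2 >> and the count from C<< req->result >>.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build an array of count pointers into a buffer of back-to-back
   0-terminated names. Returns NULL on allocation failure. The caller
   frees the array, but not the names it points to. */
const char **
index_names (const char *names, size_t count)
{
  const char **idx;

  assert (names != NULL);

  idx = malloc ((count ? count : 1) * sizeof *idx);
  if (!idx)
    return NULL;

  for (size_t i = 0; i < count; ++i)
    {
      idx[i] = names;
      names += strlen (names) + 1; /* skip the name and its 0 byte */
    }

  return idx;
}
```

The pointers refer into memory owned by libeio, so they should only be used for as long as that memory is valid, that is, inside the finish callback.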
480
481Pseudo-entries such as F<.> and F<..> are never returned by C<eio_readdir>.
482
483C<flags> can be any combination of:
484
485=over 4
486
487=item EIO_READDIR_DENTS
488
489If this flag is specified, then, in addition to the names in C<ptr2>,
490also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct
491eio_dirent> looks like this:
492
493 struct eio_dirent
494 {
495 int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */
496 unsigned short namelen; /* size of filename without trailing 0 */
497 unsigned char type; /* one of EIO_DT_* */
498 signed char score; /* internal use */
499 ino_t inode; /* the inode number, if available, otherwise unspecified */
500 };
501
502The only members you normally would access are C<nameofs>, which is the
503byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>.
504
505C<type> can be one of:
506
507C<EIO_DT_UNKNOWN> - if the type is not known (very common) and you have to C<stat>
508the name yourself if you need to know,
509one of the "standard" POSIX file types (C<EIO_DT_REG>, C<EIO_DT_DIR>, C<EIO_DT_LNK>,
510C<EIO_DT_FIFO>, C<EIO_DT_SOCK>, C<EIO_DT_CHR>, C<EIO_DT_BLK>)
511or some OS-specific type (currently
512C<EIO_DT_MPC> - multiplexed char device (v7+coherent),
513C<EIO_DT_NAM> - xenix special named file,
514C<EIO_DT_MPB> - multiplexed block device (v7+coherent),
515C<EIO_DT_NWK> - HP-UX network special,
516C<EIO_DT_CMP> - VxFS compressed,
517C<EIO_DT_DOOR> - solaris door, or
518C<EIO_DT_WHT>).
519
520This example prints all names and their type:
521
522 int i;
523 struct eio_dirent *ents = (struct eio_dirent *)req->ptr1;
524 char *names = (char *)req->ptr2;
525
526 for (i = 0; i < req->result; ++i)
527 {
528 struct eio_dirent *ent = ents + i;
529 char *name = names + ent->nameofs;
530
531 printf ("name #%d: %s (type %d)\n", i, name, ent->type);
532 }
533
534=item EIO_READDIR_DIRS_FIRST
535
536When this flag is specified, then the names will be returned in an order
537where likely directories come first, in optimal C<stat> order. This is
538useful when you need to quickly find directories, or you want to find all
539directories while avoiding having to stat() each entry.
540
541If the system returns type information in readdir, then this is used
542to find directories directly. Otherwise, likely directories are names
543beginning with ".", or otherwise names with no dots, of which names with
544short names are tried first.
545
546=item EIO_READDIR_STAT_ORDER
547
548When this flag is specified, then the names will be returned in an order
549suitable for stat()'ing each one. That is, when you plan to stat()
550all files in the given directory, then the returned order will likely
551be fastest.
552
553If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then the
554likely directories come first, resulting in a less optimal stat order.
555
556=item EIO_READDIR_FOUND_UNKNOWN
557
558This flag should not be specified when calling C<eio_readdir>. Instead,
559it is being set by C<eio_readdir> (you can access the C<flags> via C<<
560req->int1 >>), when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The
561absence of this flag therefore indicates that all C<type>'s are known,
562which can be used to speed up some algorithms.
563
564A typical use case would be to identify all subdirectories within a
565directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If
566then this flag is I<NOT> set, then all the entries at the beginning of the
567returned array of type C<EIO_DT_DIR> are the directories. Otherwise, you
568should start C<stat()>'ing the entries starting at the beginning of the
569array, stopping as soon as you found all directories (the count can be
570deduced by the link count of the directory).
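
The fast path of this use case (flag I<NOT> set, directories sorted first) can be sketched as follows. The struct below mirrors the C<struct eio_dirent> layout shown earlier but is a stand-in so that the example is self-contained. The C<MY_DT_*> constants and their numeric values are invented for illustration, and C<count_leading_dirs> is a hypothetical helper. Real code would use the definitions from the libeio header instead.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-ins for the EIO_DT_* constants; the values are made up. */
enum { MY_DT_UNKNOWN = 0, MY_DT_DIR = 4, MY_DT_REG = 8 };

/* Stand-in mirroring the struct eio_dirent layout shown above. */
struct my_dirent
{
  int nameofs;
  unsigned short namelen;
  unsigned char type;
  signed char score;
  ino_t inode;
};

/* Count the entries of type MY_DT_DIR at the start of the array.
   Only meaningful if all types are known and directories come first. */
size_t
count_leading_dirs (const struct my_dirent *ents, size_t n)
{
  size_t i = 0;

  assert (ents != NULL || n == 0);

  while (i < n && ents[i].type == MY_DT_DIR)
    ++i;

  return i;
}
```

When the flag is set, this count is only a lower bound, and the remaining entries would need a C<stat> as described above.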
571
572=back
573
574=back
575
576=head3 OS-SPECIFIC CALL WRAPPERS
577
578These wrap OS-specific calls (usually Linux ones), and might or might not
579be emulated on other operating systems. Calls that are not emulated will
580return C<-1> and set C<errno> to C<ENOSYS>.
581
582=over 4
583
584=item eio_sendfile (int out_fd, int in_fd, off_t in_offset, size_t length, int pri, eio_cb cb, void *data)
585
586Wraps the C<sendfile> syscall. The arguments follow the Linux version, but
587libeio supports and will use similar calls on FreeBSD, HP/UX, Solaris and
588Darwin.
589
590If the OS doesn't support some sendfile-like call, or the call fails,
591indicating a lack of support for the given file descriptor type (for example,
592Linux's sendfile might not support file to file copies), then libeio will
593emulate the call in userspace, so there are almost no limitations on its
594use.
595
596=item eio_readahead (int fd, off_t offset, size_t length, int pri, eio_cb cb, void *data)
597
598Calls C<readahead(2)>. If the syscall is missing, then the call is
599emulated by simply reading the data (currently in 64kiB chunks).
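
An emulation along these lines might look like the sketch below. It is not libeio's actual code: C<emulate_readahead> is a hypothetical function that reads the range with C<pread> in 64 KiB chunks and discards the data, so that the pages end up in the OS cache.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 65536 /* 64 KiB, the chunk size mentioned in the text */

/* Read and discard length bytes starting at offset. Returns 0 on success
   (including reaching end of file early), -1 with errno set on error. */
int
emulate_readahead (int fd, off_t offset, size_t length)
{
  char *buf = malloc (CHUNK);
  int res = 0;
  int saved_errno;

  if (!buf)
    return -1;

  while (length > 0)
    {
      size_t want = length < CHUNK ? length : CHUNK;
      ssize_t got = pread (fd, buf, want, offset);

      if (got < 0)
        {
          if (errno == EINTR)
            continue; /* interrupted, retry the same chunk */

          res = -1;
          break;
        }

      if (got == 0)
        break; /* end of file */

      assert ((size_t)got <= want);
      offset += got;
      length -= (size_t)got;
    }

  saved_errno = errno; /* keep pread's errno across free () */
  free (buf);
  errno = saved_errno;

  return res;
}
```

Using C<pread> rather than C<lseek> plus C<read> leaves the file offset of C<fd> untouched.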
600
601=item eio_syncfs (int fd, int pri, eio_cb cb, void *data)
602
603Calls Linux' C<syncfs> syscall, if available. Returns C<-1> and sets
604C<errno> to C<ENOSYS> if the call is missing I<but still calls sync()>,
605if the C<fd> is C<< >= 0 >>, so you can probe for the availability of the
606syscall with a negative C<fd> argument and checking for C<-1/ENOSYS>.
607
608=item eio_sync_file_range (int fd, off_t offset, size_t nbytes, unsigned int flags, int pri, eio_cb cb, void *data)
609
610Calls C<sync_file_range>. If the syscall is missing, then this is the same
611as calling C<fdatasync>.
612
613Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>,
614C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>.
615
616=item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data)
617
618Calls C<fallocate> (note: I<NOT> C<posix_fallocate>!). If the syscall is
619missing, then it returns failure and sets C<errno> to C<ENOSYS>.
620
621The C<mode> argument can be C<0> (for behaviour similar to
622C<posix_fallocate>), or C<EIO_FALLOC_FL_KEEP_SIZE>, which keeps the size
623of the file unchanged (but still preallocates space beyond end of file).
624
625=back
626
627=head3 LIBEIO-SPECIFIC REQUESTS
628
629These requests are specific to libeio and do not correspond to any OS call.
630
631=over 4
632
633=item eio_mtouch (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)
634
635Reads (C<flags == 0>) or modifies (C<flags == EIO_MT_MODIFY>) the given
636memory area, page-wise, that is, it reads (or reads and writes back) the
637first octet of every page that spans the memory area.
638
639This can be used to page in some mmapped file, or dirty some pages. Note
640that dirtying is an unlocked read-write access, so races can ensue when
641some other thread modifies the data stored in that memory area.
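
The page-wise access can be illustrated with the following sketch. It is an assumption about how such a loop could be written, not libeio's implementation. C<touch_pages> is a hypothetical function, and the page size is taken from C<sysconf>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Touch the first byte of every page overlapping [addr, addr + length).
   With modify != 0 the byte is also written back unchanged, which dirties
   the page. Returns the number of pages touched. */
size_t
touch_pages (void *addr, size_t length, int modify)
{
  long ps = sysconf (_SC_PAGESIZE);
  uintptr_t page, cur, end;
  size_t touched = 0;

  assert (ps > 0);

  if (length == 0)
    return 0;

  page = (uintptr_t)ps;
  cur  = (uintptr_t)addr & ~(page - 1); /* round down to the page start */
  end  = (uintptr_t)addr + length;

  for (; cur < end; cur += page, ++touched)
    {
      volatile unsigned char *p = (volatile unsigned char *)cur;
      unsigned char c = *p; /* read forces the page in */

      if (modify)
        *p = c;             /* unlocked write-back, see the note above */
    }

  return touched;
}
```

The write-back is exactly the unlocked read-write access mentioned above: if another thread changes the byte between the read and the write, its update is lost.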
642
643=item eio_custom (void (*)(eio_req *) execute, int pri, eio_cb cb, void *data)
644
645Executes a custom request, i.e., a user-specified callback.
646
647The callback gets the C<eio_req *> as parameter and is expected to read
648and modify any request-specific members. Specifically, it should set C<<
649req->result >> to the result value, just like other requests.
650
651Here is an example that simply calls C<open>, like C<eio_open>, but it
652uses the C<data> member as filename and uses a hardcoded C<O_RDONLY>. If
653you want to pass more/other parameters, you either need to pass some
654struct or so via C<data> or provide your own wrapper using the low-level
655API.
656
657 static int
658 my_open_done (eio_req *req)
659 {
660 int fd = req->result;
661
662 return 0;
663 }
664
665 static void
666 my_open (eio_req *req)
667 {
668 req->result = open (req->data, O_RDONLY);
669 }
670
671 eio_custom (my_open, 0, my_open_done, "/etc/passwd");
672
673=item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)
674
675This is a request that takes C<delay> seconds to execute, but otherwise
676does nothing - it simply puts one of the worker threads to sleep for this
677long.
678
679This request can be used to artificially increase load, e.g. for debugging
680or benchmarking reasons.
681
682=item eio_nop (int pri, eio_cb cb, void *data)
683
684This request does nothing, except go through the whole request cycle. This
685can be used to measure latency or in some cases to simplify code, but is
686not really of much use.
687
688=back
689
690=head3 GROUPING AND LIMITING REQUESTS
691
692There is one more rather special request, C<eio_grp>. It is a very special
693aio request: Instead of doing something, it is a container for other eio
694requests.
695
696There are two primary use cases for this: a) bundle many requests into a
697single, composite, request with a definite callback and the ability to
698cancel the whole request with its subrequests and b) limiting the number
699of "active" requests.
700
701Further below you will find more discussion of these topics - first
702follows the reference section detailing the request generator and other
703methods.
704
705=over 4
706
707=item eio_req *grp = eio_grp (eio_cb cb, void *data)
708
709Creates, submits and returns a group request. Note that it doesn't have a
710priority, unlike all other requests.
711
712=item eio_grp_add (eio_req *grp, eio_req *req)
713
714Adds a request to the request group.
715
716=item eio_grp_cancel (eio_req *grp)
717
718Cancels all requests I<in> the group, but I<not> the group request
719itself. You can cancel the group request I<and> all subrequests via a
720normal C<eio_cancel> call.
721
722=back
723
724=head4 GROUP REQUEST LIFETIME
725
726Left alone, a group request will instantly move to the pending state and
727will be finished at the next call of C<eio_poll>.
728
729The usefulness stems from the fact that, if a subrequest is added to a
730group I<before> a call to C<eio_poll>, via C<eio_grp_add>, then the group
731will not finish until all the subrequests have finished.
732
733So the usage cycle of a group request is like this: after it is created,
734you normally instantly add a subrequest. If none is added, the group
735request will finish on its own. As long as subrequests are added before
736the group request is finished it will be kept from finishing, that is the
737callbacks of any subrequests can, in turn, add more requests to the group,
738and as long as any requests are active, the group request itself will not
739finish.
740
741=head4 CREATING COMPOSITE REQUESTS
742
743Imagine you wanted to create an C<eio_load> request that opens a file,
744reads it and closes it. This means it has to execute at least three eio
745requests, but for various reasons it might be nice if that request looked
746like any other eio request.
747
748This can be done with groups:
749
750=over 4
751
752=item 1) create the request object
753
754Create a group that contains all further requests. This is the request you
755can return as "the load request".
756
757=item 2) open the file, maybe
758
759Next, open the file with C<eio_open> and add the request to the group
760request and you are finished setting up the request.
761
762If, for some reason, you cannot C<eio_open> (say, the path is a null pointer) you
763can set C<< grp->result >> to C<-1> to signal an error and let the group
764request finish on its own.
765
766=item 3) open callback adds more requests
767
768In the open callback, if the open was not successful, copy C<<
769req->errorno >> to C<< grp->errorno >> and set C<< grp->result >> to
770C<-1> to signal an error.
771
772Otherwise, malloc some memory or so and issue a read request, adding the
773read request to the group.
774
775=item 4) continue issuing requests till finished
776
777In the read callback, check for errors and possibly continue with
778C<eio_close> or any other eio request in the same way.
779
780As soon as no new requests are added, the group request will finish. Make
781sure you I<always> set C<< grp->result >> to some sensible value.
782
783=back
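
The four steps above can be sketched as follows. To stay self-contained, this sketch does not use libeio at all: it runs C<open>, C<read> and C<close> synchronously and calls the "callbacks" directly, so it only shows how errors and results would be propagated to the group. The C<demo_load> structure, the C<demo_*> functions, the size cap and the choice of C<EINVAL> for a null path are all assumptions made for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the group request; not libeio's eio_req. */
typedef struct demo_load
{
  ssize_t result; /* bytes read, or -1 on error (like grp->result) */
  int errorno;    /* errno of the failing step (like grp->errorno) */
  char *buf;      /* malloc'ed contents, to be freed by the caller */
  int fd;
} demo_load;

enum { DEMO_LOAD_MAX = 65536 }; /* arbitrary cap for this sketch */

/* Step 4: the "read callback" records the outcome, then closes. */
static void
demo_read_done (demo_load *grp, ssize_t len)
{
  if (len < 0)
    {
      grp->errorno = errno;
      grp->result = -1;
    }
  else
    grp->result = len;

  close (grp->fd); /* with libeio this would be one more subrequest */
}

/* Step 3: the "open callback" fails the group or issues the read. */
static void
demo_open_done (demo_load *grp, int fd)
{
  if (fd < 0)
    {
      grp->errorno = errno;
      grp->result = -1;
      return;
    }

  grp->fd = fd;
  grp->buf = malloc (DEMO_LOAD_MAX);

  if (!grp->buf)
    {
      grp->errorno = ENOMEM;
      grp->result = -1;
      close (fd);
      return;
    }

  demo_read_done (grp, read (fd, grp->buf, DEMO_LOAD_MAX));
}

/* Steps 1 and 2: set up the "group", then start with the open. */
void
demo_load_file (demo_load *grp, const char *path)
{
  assert (grp);

  grp->result = -1; /* always leave a sensible result behind */
  grp->errorno = 0;
  grp->buf = 0;
  grp->fd = -1;

  if (!path)
    {
      grp->errorno = EINVAL; /* assumption: the text only requires result == -1 */
      return;
    }

  demo_open_done (grp, open (path, O_RDONLY));
}
```

In a real composite request, each direct call would instead submit an eio request, add it to the group with C<eio_grp_add>, and continue the chain from that request's callback.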
784
785=head4 REQUEST LIMITING
786
787
788#TODO
789
790void eio_grp_limit (eio_req *grp, int limit);
791
792
793
794=head1 LOW LEVEL REQUEST API
795
796#TODO
797
798
799=head1 ANATOMY AND LIFETIME OF AN EIO REQUEST
800
801A request is represented by a structure of type C<eio_req>. To initialise
802it, clear it to all zero bytes:
803
804 eio_req req;
805
806 memset (&req, 0, sizeof (req));
807
808A more common way to initialise a new C<eio_req> is to use C<calloc>:
809
810 eio_req *req = calloc (1, sizeof (*req));
811
812In either case, libeio neither allocates, initialises nor frees the
813C<eio_req> structure for you - it merely uses it.
814
815zero
816
817#TODO
137 818
138=head2 CONFIGURATION 819=head2 CONFIGURATION
139 820
140The functions in this section can sometimes be useful, but the default 821The functions in this section can sometimes be useful, but the default
141configuration will do in most cases, so you should skip this section on 822configuration will do in most cases, so you should skip this section on
152for example, in interactive programs, you might want to limit this time to 833for example, in interactive programs, you might want to limit this time to
153C<0.01> seconds or so. 834C<0.01> seconds or so.
154 835
155Note that: 836Note that:
156 837
838=over 4
839
157a) libeio doesn't know how long your request callbacks take, so the time 840=item a) libeio doesn't know how long your request callbacks take, so the
158spent in C<eio_poll> is up to one callback invocation longer than this 841time spent in C<eio_poll> is up to one callback invocation longer than
159interval. 842this interval.
160 843
161b) this is implemented by calling C<gettimeofday> after each request, 844=item b) this is implemented by calling C<gettimeofday> after each
162which can be costly. 845request, which can be costly.
163 846
164c) at least one request will be handled. 847=item c) at least one request will be handled.
848
849=back
165 850
166=item eio_set_max_poll_reqs (unsigned int nreqs) 851=item eio_set_max_poll_reqs (unsigned int nreqs)
167 852
168When C<nreqs> is non-zero, then C<eio_poll> will not handle more than 853When C<nreqs> is non-zero, then C<eio_poll> will not handle more than
169C<nreqs> requests per invocation. This is a less costly way to limit the 854C<nreqs> requests per invocation. This is a less costly way to limit the
185=item eio_set_max_idle (unsigned int nthreads) 870=item eio_set_max_idle (unsigned int nthreads)
186 871
187Libeio uses threads internally to handle most requests, and will start and stop threads on demand. 872Libeio uses threads internally to handle most requests, and will start and stop threads on demand.
188 873
189This call can be used to limit the number of idle threads (threads without 874This call can be used to limit the number of idle threads (threads without
190work to do): libeio will keep some threads idle in preperation for more 875work to do): libeio will keep some threads idle in preparation for more
191requests, but never more than C<nthreads> threads. 876requests, but never more than C<nthreads> threads.
192 877
193In addition to this, libeio will also stop threads when they are idle for 878In addition to this, libeio will also stop threads when they are idle for
194a few seconds, regardless of this setting. 879a few seconds, regardless of this setting.
195 880
214executed and have results, but have not been finished yet by a call to 899executed and have results, but have not been finished yet by a call to
215C<eio_poll>). 900C<eio_poll>).
216 901
217=back 902=back
218 903
219
220=head1 ANATOMY OF AN EIO REQUEST
221
222#TODO
223
224
225=head1 HIGH LEVEL REQUEST API
226
227#TODO
228
229=back
230
231
232=head1 LOW LEVEL REQUEST API
233
234#TODO
235
236=head1 EMBEDDING 904=head1 EMBEDDING
237 905
238Libeio can be embedded directly into programs. This functionality is not 906Libeio can be embedded directly into programs. This functionality is not
239documented and not (yet) officially supported. 907documented and not (yet) officially supported.
240 908
909Note that, when including C<libeio.m4>, you are responsible for defining
910the compilation environment (C<_LARGEFILE_SOURCE>, C<_GNU_SOURCE> etc.).
911
241If you need to know how, check the C<IO::AIO> perl module, which does 912If you need to know how, check the C<IO::AIO> perl module, which does
242exactly that. 913exactly that.
914
915
916=head1 COMPILETIME CONFIGURATION
917
918These symbols, if used, must be defined when compiling F<eio.c>.
919
920=over 4
921
922=item EIO_STACKSIZE
923
924This symbol governs the stack size for each eio thread. Libeio itself
925was written to use very little stackspace, but when using C<EIO_CUSTOM>
926requests, you might want to increase this.
927
928If this symbol is undefined (the default) then libeio will use its default
929stack size (C<sizeof (void *) * 4096> currently). In all other cases, the
930value must be an expression that evaluates to the desired stack size.
931
932=back
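
The "use a default unless the symbol is defined" behaviour described for C<EIO_STACKSIZE> follows a common preprocessor pattern, sketched below. C<DEMO_STACKSIZE> and C<demo_stack_size> are hypothetical names used only for this illustration; how F<eio.c> consumes the real symbol internally is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the documented default; the symbol libeio
   itself looks at is EIO_STACKSIZE, not DEMO_STACKSIZE. */
#ifndef DEMO_STACKSIZE
# define DEMO_STACKSIZE (sizeof (void *) * 4096)
#endif

/* Returns the stack size this sketch would request per thread. */
size_t
demo_stack_size (void)
{
  size_t size = DEMO_STACKSIZE;

  /* sanity check: a stack below 4096 bytes is almost certainly a mistake */
  assert (size >= 4096);

  return size;
}
```

With the real symbol, the override would be passed when compiling F<eio.c>, for example C<cc -DEIO_STACKSIZE=262144 -c eio.c>, where the value is an arbitrary example.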
243 933
244 934
245=head1 PORTABILITY REQUIREMENTS 935=head1 PORTABILITY REQUIREMENTS
246 936
247In addition to a working ISO-C implementation, libeio relies on a few 937In addition to a working ISO-C implementation, libeio relies on a few
