--- libeio/eio.pod 2011/07/05 14:02:15 1.14
+++ libeio/eio.pod 2011/07/18 02:59:58 1.26
@@ -47,20 +47,24 @@

 =head2 FORK SUPPORT

-Calling C is fully supported by this module. It is implemented in these steps:
-
-  1. wait till all requests in "execute" state have been handled
-     (basically requests that are already handed over to the kernel).
-  2. fork
-  3. in the parent, continue business as usual, done
-  4. in the child, destroy all ready and pending requests and free the
-     memory used by the worker threads. This gives you a fully empty
-     libeio queue.
-
-Note, however, since libeio does use threads, thr above guarantee doesn't
-cover your libc, for example, malloc and other libc functions are not
-fork-safe, so there is very little you can do after a fork, and in fatc,
-the above might crash, and thus change.
+Usage of pthreads in a program changes the semantics of fork
+considerably. Specifically, only async-safe functions can be called after
+fork. Libeio uses pthreads, so this applies, and makes using fork hard for
+anything but relatively fork + exec uses.
+
+This library only works in the process that initialised it: Forking is
+fully supported, but using libeio in any other process than the one that
+called C is not.
+
+You might get around by not I libeio before (or after) forking in
+the parent, and using it in the child afterwards. You could also try to
+call the L function again in the child, which will brutally
+reinitialise all data structures, which isn't POSIX conformant, but
+typically works.
+
+Otherwise, the only recommendation you should follow is: treat fork code
+the same way you treat signal handlers, and only ever call C in
+the process that uses it, and only once ever.

 =head1 INITIALISATION/INTEGRATION

@@ -80,6 +84,9 @@
 It accepts two function pointers specifying callbacks as argument, both of
 which can be C<0>, in which case the callback isn't called.

+There is currently no way to change these callbacks later, or to
+"uninitialise" the library again.
+
 =item want_poll callback

 The C callback is invoked whenever libeio wants attention (i.e.
@@ -126,19 +133,66 @@
 For libev, you would typically use an C watcher: the C callback would invoke
 C to wake up the event loop. Inside the callback set for the watcher, one
 would call C (followed by C again if C indicates that not
-all requests have been handled yet). The race is taken care of because
-libev resets/rearms the async watcher before calling your callback,
-and therefore, before calling C. This might result in (some)
-spurious wake-ups, but is generally harmless.
+()>.
+
+If C is configured to not handle all results in one go
+(i.e. it returns C<-1>) then you should start an idle watcher that calls
+C until it returns something C.
+
+A full-featured connector between libeio and libev would look as follows
+(if C is handling all requests, it can of course be simplified a
+lot by removing the idle watcher logic):
+
+   static struct ev_loop *loop;
+   static ev_idle repeat_watcher;
+   static ev_async ready_watcher;
+
+   /* idle watcher callback, only used when eio_poll */
+   /* didn't handle all results in one call */
+   static void
+   repeat (EV_P_ ev_idle *w, int revents)
+   {
+     if (eio_poll () != -1)
+       ev_idle_stop (EV_A_ w);
+   }
+
+   /* eio has some results, process them */
+   static void
+   ready (EV_P_ ev_async *w, int revents)
+   {
+     if (eio_poll () == -1)
+       ev_idle_start (EV_A_ &repeat_watcher);
+   }
+
+   /* wake up the event loop */
+   static void
+   want_poll (void)
+   {
+     ev_async_send (loop, &ready_watcher);
+   }
+
+   void
+   my_init_eio ()
+   {
+     loop = EV_DEFAULT;
+
+     ev_idle_init (&repeat_watcher, repeat);
+     ev_async_init (&ready_watcher, ready);
+     ev_async_start (loop, &ready_watcher);
+
+     eio_init (want_poll, 0);
+   }

 For most other event loops, you would typically use a pipe - the event
 loop should be told to wait for read readiness on the read end.
 In C you would write a single byte, in C you would try to read that byte,
 and in the callback for the read end, you would call
-C. The race is avoided here because the event loop should invoke
-your callback again and again until the byte has been read (as the pipe
-read callback does not read it, only C).
+C.
+
+You don't have to take special care in the case C doesn't handle
+all requests, as the done callback will not be invoked, so the event loop
+will still signal readiness for the pipe until I results have been
+processed.

 =head1 HIGH LEVEL REQUEST API

@@ -215,11 +269,46 @@
    /* the first three arguments are passed to open(2) */
    /* the remaining are priority, callback and data */
    if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
-     abort (); /* something ent wrong, we will all die!!! */
+     abort (); /* something went wrong, we will all die!!! */

 Note that you additionally need to call C when the C
 indicates that requests are ready to be processed.

+=head2 CANCELLING REQUESTS
+
+Sometimes the need for a request goes away before the request is
+finished. In that case, one can cancel the request by a call to
+C:
+
+=over 4
+
+=item eio_cancel (eio_req *req)
+
+Cancel the request (and all its subrequests). If the request is currently
+executing it might still continue to execute, and in other cases it might
+still take a while till the request is cancelled.
+
+Even if cancelled, the finish callback will still be invoked - the
+callbacks of all cancellable requests need to check whether the request
+has been cancelled by calling C:
+
+   static int
+   my_eio_cb (eio_req *req)
+   {
+     if (EIO_CANCELLED (req))
+       return 0;
+   }
+
+In addition, cancelled requests will I have C<< req->result >>
+set to C<-1> and C to C, or I they were
+successfully executed, despite being cancelled (e.g. when they have
+already been executed at the time they were cancelled).
+
+C is still true for requests that have successfully
+executed, as long as C was called on them at some point.
+
+=back
+
 =head2 AVAILABLE REQUESTS

 The following request functions are available. I of them return the
@@ -324,9 +413,10 @@

 =item eio_realpath (const char *path, int pri, eio_cb cb, void *data)

-Similar to the realpath libc function, but unlike that one, result is
-C<-1> on failure and the length of the returned path in C (which is
-not 0-terminated) - this is similar to readlink.
+Similar to the realpath libc function, but unlike that one, C<<
+req->result >> is C<-1> on failure. On success, the result is the length
+of the returned path in C (which is I 0-terminated) - this is
+similar to readlink.

 =item eio_stat (const char *path, int pri, eio_cb cb, void *data)

@@ -337,7 +427,7 @@
 Stats a file - if C<< req->result >> indicates success, then you can
 access the C-like structure via C<< req->ptr2 >>:

-  EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;
+   EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;

 =item eio_statvfs (const char *path, int pri, eio_cb cb, void *data)

@@ -346,15 +436,15 @@
 Stats a filesystem - if C<< req->result >> indicates success, then you can
 access the C-like structure via C<< req->ptr2 >>:

-  EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;
+   EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;

 =back

 =head3 READING DIRECTORIES

 Reading directories sounds simple, but can be rather demanding, especially
-if you want to do stuff such as traversing a diretcory hierarchy or
-processing all files in a directory. Libeio can assist thess complex tasks
+if you want to do stuff such as traversing a directory hierarchy or
+processing all files in a directory. Libeio can assist these complex tasks
 with its C call.

 =over 4
@@ -396,14 +486,14 @@
 also an array of C is returned, in C.
 A C looks like this:

-  struct eio_dirent
-  {
-    int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */
-    unsigned short namelen; /* size of filename without trailing 0 */
-    unsigned char type; /* one of EIO_DT_* */
-    signed char score; /* internal use */
-    ino_t inode; /* the inode number, if available, otherwise unspecified */
-  };
+   struct eio_dirent
+   {
+     int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */
+     unsigned short namelen; /* size of filename without trailing 0 */
+     unsigned char type; /* one of EIO_DT_* */
+     signed char score; /* internal use */
+     ino_t inode; /* the inode number, if available, otherwise unspecified */
+   };

 The only members you normally would access are C, which is the
 byte-offset from C to the start of the name, C and C.
@@ -456,15 +546,15 @@
 all files in the given directory, then the returned order will likely be
 fastest.

-If both this flag and C are specified, then
-the likely dirs come first, resulting in a less optimal stat order.
+If both this flag and C are specified, then the
+likely directories come first, resulting in a less optimal stat order.

 =item EIO_READDIR_FOUND_UNKNOWN

 This flag should not be specified when calling C. Instead, it is being
 set by C (you can access the C via C<< req->int1 >>, when any of the
 C's found were C. The
-absense of this flag therefore indicates that all C's are known,
+absence of this flag therefore indicates that all C's are known,
 which can be used to speed up some algorithms.

 A typical use case would be to identify all subdirectories within a
@@ -512,6 +602,15 @@
 Flags can be any combination of C, C and C.

+=item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data)
+
+Calls C (note: I C!). If the syscall is
+missing, then it returns failure and sets C to C.
+
+The C argument can be C<0> (for behaviour similar to
+C), or C, which keeps the size
+of the file unchanged (but still preallocates space beyond end of file).
+
 =back

 =head3 LIBEIO-SPECIFIC REQUESTS
@@ -562,7 +661,7 @@

 =item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)

-This is a a request that takes C seconds to execute, but otherwise
+This is a request that takes C seconds to execute, but otherwise
 does nothing - it simply puts one of the worker threads to sleep for this
 long.

@@ -588,29 +687,96 @@
 cancel the whole request with its subrequests and b) limiting the number
 of "active" requests.

-Further below you will find more dicussion of these topics - first follows
-the reference section detailing the request generator and other methods.
+Further below you will find more discussion of these topics - first
+follows the reference section detailing the request generator and other
+methods.

 =over 4

-=item eio_grp (eio_cb cb, void *data)
+=item eio_req *grp = eio_grp (eio_cb cb, void *data)
+
+Creates, submits and returns a group request. Note that it doesn't have a
+priority, unlike all other requests.
+
+=item eio_grp_add (eio_req *grp, eio_req *req)
+
+Adds a request to the request group.
+
+=item eio_grp_cancel (eio_req *grp)

-Creates and submits a group request.
+Cancels all requests I the group, but I the group request
+itself. You can cancel the group request I all subrequests via a
+normal C call.

 =back

+=head4 GROUP REQUEST LIFETIME

+Left alone, a group request will instantly move to the pending state and
+will be finished at the next call of C.

-#TODO
+The usefulness stems from the fact that, if a subrequest is added to a
+group I a call to C, via C, then the group
+will not finish until all the subrequests have finished.
+
+So the usage cycle of a group request is like this: after it is created,
+you normally instantly add a subrequest. If none is added, the group
+request will finish on its own.
+As long as subrequests are added before
+the group request is finished it will be kept from finishing, that is the
+callbacks of any subrequests can, in turn, add more requests to the group,
+and as long as any requests are active, the group request itself will not
+finish.
+
+=head4 CREATING COMPOSITE REQUESTS
+
+Imagine you wanted to create an C request that opens a file,
+reads it and closes it. This means it has to execute at least three eio
+requests, but for various reasons it might be nice if that request looked
+like any other eio request.
+
+This can be done with groups:
+
+=over 4
+
+=item 1) create the request object
+
+Create a group that contains all further requests. This is the request you
+can return as "the load request".
+
+=item 2) open the file, maybe
+
+Next, open the file with C and add the request to the group
+request and you are finished setting up the request.

-/*****************************************************************************/
-/* groups */
+If, for some reason, you cannot C (path is a null ptr?) you
+can set C<< grp->result >> to C<-1> to signal an error and let the group
+request finish on its own.
+
+=item 3) open callback adds more requests
+
+In the open callback, if the open was not successful, copy C<<
+req->errorno >> to C<< grp->errorno >> and set C<< grp->result >> to
+C<-1> to signal an error.
+
+Otherwise, malloc some memory or so and issue a read request, adding the
+read request to the group.
+
+=item 4) continue issuing requests till finished
+
+In the read callback, check for errors and possibly continue with
+C or any other eio request in the same way.
+
+As soon as no new requests are added the group request will finish. Make
+sure you I set C<< grp->result >> to some sensible value.
+
+=back
+
+=head4 REQUEST LIMITING
+
+
+#TODO

-eio_req *eio_grp (eio_cb cb, void *data);
-void eio_grp_feed (eio_req *grp, void (*feed)(eio_req *req), int limit);
 void eio_grp_limit (eio_req *grp, int limit);
-void eio_grp_add (eio_req *grp, eio_req *req);
-void eio_grp_cancel (eio_req *grp); /* cancels all sub requests but not the group */

 =back

@@ -626,13 +792,13 @@
 A request is represented by a structure of type C. To initialise it,
 clear it to all zero bytes:

-  eio_req req;
+   eio_req req;

-  memset (&req, 0, sizeof (req));
+   memset (&req, 0, sizeof (req));

 A more common way to initialise a new C is to use C:

-  eio_req *req = calloc (1, sizeof (*req));
+   eio_req *req = calloc (1, sizeof (*req));

 In either case, libeio neither allocates, initialises or frees the C
 structure for you - it merely uses it.
@@ -660,14 +826,18 @@

 Note that:

-a) libeio doesn't know how long your request callbacks take, so the time
-spent in C is up to one callback invocation longer then this
-interval.
+=over 4
+
+=item a) libeio doesn't know how long your request callbacks take, so the
+time spent in C is up to one callback invocation longer than
+this interval.

-b) this is implemented by calling C after each request,
-which can be costly.
+=item b) this is implemented by calling C after each
+request, which can be costly.

-c) at least one request will be handled.
+=item c) at least one request will be handled.
+
+=back

 =item eio_set_max_poll_reqs (unsigned int nreqs)

@@ -747,7 +917,7 @@
 requests, you might want to increase this.

 If this symbol is undefined (the default) then libeio will use its default
-stack size (C currently). If it is defined, but
+stack size (C currently). If it is defined, but
 C<0>, then the default operating system stack size will be used. In all
 other cases, the value must be an expression that evaluates to the desired
 stack size.