… | |
… | |
45 | Unlike the name component C<stamp> might indicate, it is also used for |
45 | Unlike the name component C<stamp> might indicate, it is also used for |
46 | time differences throughout libeio. |
46 | time differences throughout libeio. |
47 | |
47 | |
48 | =head2 FORK SUPPORT |
48 | =head2 FORK SUPPORT |
49 | |
49 | |
50 | Calling C<fork ()> is fully supported by this module. It is implemented in these steps: |
50 | Usage of pthreads in a program changes the semantics of fork |
|
|
51 | considerably. Specifically, only async-safe functions can be called after |
|
|
52 | fork. Libeio uses pthreads, so this applies, and makes using fork hard for |
|
|
53 | anything but relatively simple fork + exec uses. |
51 | |
54 | |
52 | 1. wait till all requests in "execute" state have been handled |
55 | This library only works in the process that initialised it: Forking is |
53 | (basically requests that are already handed over to the kernel). |
56 | fully supported, but using libeio in any other process than the one that |
54 | 2. fork |
57 | called C<eio_init> is not. |
55 | 3. in the parent, continue business as usual, done |
58 | |
56 | 4. in the child, destroy all ready and pending requests and free the |
59 | You might get around this by not I<using> libeio before (or after) forking in |
57 | memory used by the worker threads. This gives you a fully empty |
60 | the parent, and using it in the child afterwards. You could also try to |
58 | libeio queue. |
61 | call the C<eio_init> function again in the child, which will brutally |
|
|
62 | reinitialise all data structures, which isn't POSIX conformant, but |
|
|
63 | typically works. |
|
|
64 | |
|
|
65 | Otherwise, the only recommendation you should follow is: treat fork code |
|
|
66 | the same way you treat signal handlers, and only ever call C<eio_init> in |
|
|
67 | the process that uses it, and only once ever. |
59 | |
68 | |
60 | =head1 INITIALISATION/INTEGRATION |
69 | =head1 INITIALISATION/INTEGRATION |
61 | |
70 | |
62 | Before you can call any eio functions you first have to initialise the |
71 | Before you can call any eio functions you first have to initialise the |
63 | library. The library integrates into any event loop, but can also be used |
72 | library. The library integrates into any event loop, but can also be used |
… | |
… | |
72 | This function initialises the library. On success it returns C<0>, on |
81 | This function initialises the library. On success it returns C<0>, on |
73 | failure it returns C<-1> and sets C<errno> appropriately. |
82 | failure it returns C<-1> and sets C<errno> appropriately. |
74 | |
83 | |
75 | It accepts two function pointers specifying callbacks as argument, both of |
84 | It accepts two function pointers specifying callbacks as argument, both of |
76 | which can be C<0>, in which case the callback isn't called. |
85 | which can be C<0>, in which case the callback isn't called. |
|
|
86 | |
|
|
87 | There is currently no way to change these callbacks later, or to |
|
|
88 | "uninitialise" the library again. |
77 | |
89 | |
78 | =item want_poll callback |
90 | =item want_poll callback |
79 | |
91 | |
80 | The C<want_poll> callback is invoked whenever libeio wants attention (i.e. |
92 | The C<want_poll> callback is invoked whenever libeio wants attention (i.e. |
81 | it wants to be polled by calling C<eio_poll>). It is "edge-triggered", |
93 | it wants to be polled by calling C<eio_poll>). It is "edge-triggered", |
… | |
… | |
119 | =back |
131 | =back |
120 | |
132 | |
121 | For libev, you would typically use an C<ev_async> watcher: the |
133 | For libev, you would typically use an C<ev_async> watcher: the |
122 | C<want_poll> callback would invoke C<ev_async_send> to wake up the event |
134 | C<want_poll> callback would invoke C<ev_async_send> to wake up the event |
123 | loop. Inside the callback set for the watcher, one would call C<eio_poll |
135 | loop. Inside the callback set for the watcher, one would call C<eio_poll |
124 | ()> (followed by C<ev_async_send> again if C<eio_poll> indicates that not |
136 | ()>. |
125 | all requests have been handled yet). The race is taken care of because |
137 | |
126 | libev resets/rearms the async watcher before calling your callback, |
138 | If C<eio_poll ()> is configured to not handle all results in one go |
127 | and therefore, before calling C<eio_poll>. This might result in (some) |
139 | (i.e. it returns C<-1>) then you should start an idle watcher that calls |
128 | spurious wake-ups, but is generally harmless. |
140 | C<eio_poll> until it returns something C<!= -1>. |
|
|
141 | |
|
|
142 | A full-featured connector between libeio and libev would look as follows |
|
|
143 | (if C<eio_poll> is handling all requests, it can of course be simplified a |
|
|
144 | lot by removing the idle watcher logic): |
|
|
145 | |
|
|
146 | static struct ev_loop *loop; |
|
|
147 | static ev_idle repeat_watcher; |
|
|
148 | static ev_async ready_watcher; |
|
|
149 | |
|
|
150 | /* idle watcher callback, only used when eio_poll */ |
|
|
151 | /* didn't handle all results in one call */ |
|
|
152 | static void |
|
|
153 | repeat (EV_P_ ev_idle *w, int revents) |
|
|
154 | { |
|
|
155 | if (eio_poll () != -1) |
|
|
156 | ev_idle_stop (EV_A_ w); |
|
|
157 | } |
|
|
158 | |
|
|
159 | /* eio has some results, process them */ |
|
|
160 | static void |
|
|
161 | ready (EV_P_ ev_async *w, int revents) |
|
|
162 | { |
|
|
163 | if (eio_poll () == -1) |
|
|
164 | ev_idle_start (EV_A_ &repeat_watcher); |
|
|
165 | } |
|
|
166 | |
|
|
167 | /* wake up the event loop */ |
|
|
168 | static void |
|
|
169 | want_poll (void) |
|
|
170 | { |
|
|
171 | ev_async_send (loop, &ready_watcher); |
|
|
172 | } |
|
|
173 | |
|
|
174 | void |
|
|
175 | my_init_eio () |
|
|
176 | { |
|
|
177 | loop = EV_DEFAULT; |
|
|
178 | |
|
|
179 | ev_idle_init (&repeat_watcher, repeat); |
|
|
180 | ev_async_init (&ready_watcher, ready); |
|
|
181 | ev_async_start (loop, &ready_watcher); |
|
|
182 | |
|
|
183 | eio_init (want_poll, 0); |
|
|
184 | } |
129 | |
185 | |
130 | For most other event loops, you would typically use a pipe - the event |
186 | For most other event loops, you would typically use a pipe - the event |
131 | loop should be told to wait for read readiness on the read end. In |
187 | loop should be told to wait for read readiness on the read end. In |
132 | C<want_poll> you would write a single byte, in C<done_poll> you would try |
188 | C<want_poll> you would write a single byte, in C<done_poll> you would try |
133 | to read that byte, and in the callback for the read end, you would call |
189 | to read that byte, and in the callback for the read end, you would call |
134 | C<eio_poll>. The race is avoided here because the event loop should invoke |
190 | C<eio_poll>. |
135 | your callback again and again until the byte has been read (as the pipe |
191 | |
136 | read callback does not read it, only C<done_poll>). |
192 | You don't have to take special care in the case C<eio_poll> doesn't handle |
|
|
193 | all requests, as the done callback will not be invoked, so the event loop |
|
|
194 | will still signal readiness for the pipe until I<all> results have been |
|
|
195 | processed. |
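The pipe scheme described above can be sketched in plain POSIX C. This is only a minimal sketch under stated assumptions, not code from libeio: the names C<wake_init>, C<wake_want_poll>, C<wake_done_poll> and C<wake_pending> are hypothetical, and the sketch does not call C<eio_init> or C<eio_poll> itself.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* hypothetical helper names, not part of libeio */
static int wake_fd[2] = { -1, -1 }; /* [0] = read end, [1] = write end */

/* create the pipe and make both ends non-blocking; returns 0 or -1 */
static int
wake_init (void)
{
  int i;

  if (pipe (wake_fd) < 0)
    return -1;

  for (i = 0; i < 2; ++i)
    {
      int fl = fcntl (wake_fd[i], F_GETFL);

      if (fl < 0 || fcntl (wake_fd[i], F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    }

  return 0;
}

/* candidate for want_poll: write one byte so the read end becomes readable */
static void
wake_want_poll (void)
{
  char dummy = 0;

  while (write (wake_fd[1], &dummy, 1) < 0 && errno == EINTR)
    ;
}

/* candidate for done_poll: try to read the byte back */
static void
wake_done_poll (void)
{
  char dummy;

  while (read (wake_fd[0], &dummy, 1) < 0 && errno == EINTR)
    ;
}

/* returns 1 while the read end is readable, 0 otherwise */
static int
wake_pending (void)
{
  struct pollfd pfd;

  pfd.fd = wake_fd[0];
  pfd.events = POLLIN;
  pfd.revents = 0;

  return poll (&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) != 0;
}
```

In a real program, C<wake_want_poll> and C<wake_done_poll> would presumably be the two arguments passed to C<eio_init>, and the event loop's read callback for C<wake_fd[0]> would call C<eio_poll>. C<wake_pending> only stands in for the event loop's readiness check.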
|
|
196 | |
|
|
197 | |
|
|
198 | =head1 HIGH LEVEL REQUEST API |
|
|
199 | |
|
|
200 | Libeio has both a high-level API, which consists of calling a request |
|
|
201 | function with a callback to be called on completion, and a low-level API |
|
|
202 | where you fill out request structures and submit them. |
|
|
203 | |
|
|
204 | This section describes the high-level API. |
|
|
205 | |
|
|
206 | =head2 REQUEST SUBMISSION AND RESULT PROCESSING |
|
|
207 | |
|
|
208 | You submit a request by calling the relevant C<eio_TYPE> function with the |
|
|
209 | required parameters, a callback of type C<int (*eio_cb)(eio_req *req)> |
|
|
210 | (called C<eio_cb> below) and a freely usable C<void *data> argument. |
|
|
211 | |
|
|
212 | The return value will either be 0, in case something went really wrong |
|
|
213 | (which can basically only happen on very fatal errors, such as C<malloc> |
|
|
214 | returning 0, which is rather unlikely), or a pointer to the newly-created |
|
|
215 | and submitted C<eio_req *>. |
|
|
216 | |
|
|
217 | The callback will be called with an C<eio_req *> which contains the |
|
|
218 | results of the request. The members you can access inside that structure |
|
|
219 | vary from request to request, except for: |
|
|
220 | |
|
|
221 | =over 4 |
|
|
222 | |
|
|
223 | =item C<ssize_t result> |
|
|
224 | |
|
|
225 | This contains the result value from the call (usually the same as the |
|
|
226 | syscall of the same name). |
|
|
227 | |
|
|
228 | =item C<int errorno> |
|
|
229 | |
|
|
230 | This contains the value of C<errno> after the call. |
|
|
231 | |
|
|
232 | =item C<void *data> |
|
|
233 | |
|
|
234 | The C<void *data> member simply stores the value of the C<data> argument. |
|
|
235 | |
|
|
236 | =back |
|
|
237 | |
|
|
238 | Members not explicitly described as accessible must not be |
|
|
239 | accessed. Specifically, there is no guarantee that any members will still |
|
|
240 | have the value they had when the request was submitted. |
|
|
241 | |
|
|
242 | The return value of the callback is normally C<0>, which tells libeio to |
|
|
243 | continue normally. If a callback returns a nonzero value, libeio will |
|
|
244 | stop processing results (in C<eio_poll>) and will return the value to its |
|
|
245 | caller. |
|
|
246 | |
|
|
247 | Memory areas passed to libeio wrappers must stay valid as long as a |
|
|
248 | request executes, with the exception of paths, which are being copied |
|
|
249 | internally. Any memory libeio itself allocates will be freed after the |
|
|
250 | finish callback has been called. If you want to manage all memory passed |
|
|
251 | to libeio yourself you can use the low-level API. |
|
|
252 | |
|
|
253 | For example, to open a file, you could do this: |
|
|
254 | |
|
|
255 | static int |
|
|
256 | file_open_done (eio_req *req) |
|
|
257 | { |
|
|
258 | if (req->result < 0) |
|
|
259 | { |
|
|
260 | /* open() returned -1 */ |
|
|
261 | errno = req->errorno; |
|
|
262 | perror ("open"); |
|
|
263 | } |
|
|
264 | else |
|
|
265 | { |
|
|
266 | int fd = req->result; |
|
|
267 | /* now we have the new fd in fd */ |
|
|
268 | } |
|
|
269 | |
|
|
270 | return 0; |
|
|
271 | } |
|
|
272 | |
|
|
273 | /* the first three arguments are passed to open(2) */ |
|
|
274 | /* the remaining are priority, callback and data */ |
|
|
275 | if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0)) |
|
|
276 | abort (); /* something went wrong, we will all die!!! */ |
|
|
277 | |
|
|
278 | Note that you additionally need to call C<eio_poll> when the C<want_poll> callback |
|
|
279 | indicates that requests are ready to be processed. |
|
|
280 | |
|
|
281 | =head2 CANCELLING REQUESTS |
|
|
282 | |
|
|
283 | Sometimes the need for a request goes away before the request is |
|
|
284 | finished. In that case, one can cancel the request by a call to |
|
|
285 | C<eio_cancel>: |
|
|
286 | |
|
|
287 | =over 4 |
|
|
288 | |
|
|
289 | =item eio_cancel (eio_req *req) |
|
|
290 | |
|
|
291 | Cancel the request (and all its subrequests). If the request is currently |
|
|
292 | executing it might still continue to execute, and in other cases it might |
|
|
293 | still take a while till the request is cancelled. |
|
|
294 | |
|
|
295 | Even if cancelled, the finish callback will still be invoked - the |
|
|
296 | callbacks of all cancellable requests need to check whether the request |
|
|
297 | has been cancelled by calling C<EIO_CANCELLED (req)>: |
|
|
298 | |
|
|
299 | static int |
|
|
300 | my_eio_cb (eio_req *req) |
|
|
301 | { |
|
|
302 | if (EIO_CANCELLED (req)) |
|
|
303 | return 0; |
|
|
304 | } |
|
|
305 | |
|
|
306 | In addition, cancelled requests will I<either> have C<< req->result >> |
|
|
307 | set to C<-1> and C<errorno> to C<ECANCELED>, or I<otherwise> they were |
|
|
308 | successfully executed, despite being cancelled (e.g. when they have |
|
|
309 | already been executed at the time they were cancelled). |
|
|
310 | |
|
|
311 | C<EIO_CANCELLED> is still true for requests that have successfully |
|
|
312 | executed, as long as C<eio_cancel> was called on them at some point. |
|
|
313 | |
|
|
314 | =back |
|
|
315 | |
|
|
316 | =head2 AVAILABLE REQUESTS |
|
|
317 | |
|
|
318 | The following request functions are available. I<All> of them return the |
|
|
319 | C<eio_req *> on success and C<0> on failure, and I<all> of them have the |
|
|
320 | same three trailing arguments: C<pri>, C<cb> and C<data>. The C<cb> is |
|
|
321 | mandatory, but in most cases, you pass in C<0> as C<pri> and C<0> or some |
|
|
322 | custom data value as C<data>. |
|
|
323 | |
|
|
324 | =head3 POSIX API WRAPPERS |
|
|
325 | |
|
|
326 | These requests simply wrap the POSIX call of the same name, with the same |
|
|
327 | arguments. If a function is not implemented by the OS and cannot be emulated |
|
|
328 | in some way, then all of these return C<-1> and set C<errorno> to C<ENOSYS>. |
|
|
329 | |
|
|
330 | =over 4 |
|
|
331 | |
|
|
332 | =item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
333 | |
|
|
334 | =item eio_truncate (const char *path, off_t offset, int pri, eio_cb cb, void *data) |
|
|
335 | |
|
|
336 | =item eio_chown (const char *path, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data) |
|
|
337 | |
|
|
338 | =item eio_chmod (const char *path, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
339 | |
|
|
340 | =item eio_mkdir (const char *path, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
341 | |
|
|
342 | =item eio_rmdir (const char *path, int pri, eio_cb cb, void *data) |
|
|
343 | |
|
|
344 | =item eio_unlink (const char *path, int pri, eio_cb cb, void *data) |
|
|
345 | |
|
|
346 | =item eio_utime (const char *path, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data) |
|
|
347 | |
|
|
348 | =item eio_mknod (const char *path, mode_t mode, dev_t dev, int pri, eio_cb cb, void *data) |
|
|
349 | |
|
|
350 | =item eio_link (const char *path, const char *new_path, int pri, eio_cb cb, void *data) |
|
|
351 | |
|
|
352 | =item eio_symlink (const char *path, const char *new_path, int pri, eio_cb cb, void *data) |
|
|
353 | |
|
|
354 | =item eio_rename (const char *path, const char *new_path, int pri, eio_cb cb, void *data) |
|
|
355 | |
|
|
356 | =item eio_mlock (void *addr, size_t length, int pri, eio_cb cb, void *data) |
|
|
357 | |
|
|
358 | =item eio_close (int fd, int pri, eio_cb cb, void *data) |
|
|
359 | |
|
|
360 | =item eio_sync (int pri, eio_cb cb, void *data) |
|
|
361 | |
|
|
362 | =item eio_fsync (int fd, int pri, eio_cb cb, void *data) |
|
|
363 | |
|
|
364 | =item eio_fdatasync (int fd, int pri, eio_cb cb, void *data) |
|
|
365 | |
|
|
366 | =item eio_futime (int fd, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data) |
|
|
367 | |
|
|
368 | =item eio_ftruncate (int fd, off_t offset, int pri, eio_cb cb, void *data) |
|
|
369 | |
|
|
370 | =item eio_fchmod (int fd, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
371 | |
|
|
372 | =item eio_fchown (int fd, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data) |
|
|
373 | |
|
|
374 | =item eio_dup2 (int fd, int fd2, int pri, eio_cb cb, void *data) |
|
|
375 | |
|
|
376 | These have the same semantics as the syscall of the same name, their |
|
|
377 | return value is available as C<< req->result >> later. |
|
|
378 | |
|
|
379 | =item eio_read (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data) |
|
|
380 | |
|
|
381 | =item eio_write (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data) |
|
|
382 | |
|
|
383 | These two requests are called C<read> and C<write>, but actually wrap |
|
|
384 | C<pread> and C<pwrite>. On systems that lack these calls (such as cygwin), |
|
|
385 | libeio uses lseek/read_or_write/lseek and a mutex to serialise the |
|
|
386 | requests, so all these requests run serially and do not disturb each |
|
|
387 | other. However, they still disturb the file offset while they run, so it's |
|
|
388 | not safe to call these functions concurrently with non-libeio functions on |
|
|
389 | the same fd on these systems. |
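The lseek/read/lseek emulation mentioned above can be illustrated with a small sketch. It is not libeio's actual implementation: the function name C<emu_pread> and the lock C<emu_lock> are hypothetical, and the sketch only shows why emulated requests do not disturb each other while the file offset is still briefly changed.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* one global lock, so emulated requests never interleave their seeks */
static pthread_mutex_t emu_lock = PTHREAD_MUTEX_INITIALIZER;

/* hypothetical stand-in for pread on a system that lacks it */
static ssize_t
emu_pread (int fd, void *buf, size_t count, off_t offset)
{
  ssize_t res = -1;
  off_t saved;

  pthread_mutex_lock (&emu_lock);

  saved = lseek (fd, 0, SEEK_CUR); /* remember the current file offset */

  if (saved >= 0 && lseek (fd, offset, SEEK_SET) >= 0)
    {
      int saved_errno;

      res = read (fd, buf, count);

      /* restore the offset without clobbering errno from read */
      saved_errno = errno;
      lseek (fd, saved, SEEK_SET);
      errno = saved_errno;
    }

  pthread_mutex_unlock (&emu_lock);

  return res;
}
```

Between the two C<lseek> calls the file offset is visibly different, which is why code outside the lock (that is, non-libeio code) must not use the same fd at the same time.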
|
|
390 | |
|
|
391 | Not surprisingly, pread and pwrite are not thread-safe on Darwin (OS/X), |
|
|
392 | so it is advised not to submit multiple requests on the same fd on this |
|
|
393 | horrible pile of garbage. |
|
|
394 | |
|
|
395 | =item eio_mlockall (int flags, int pri, eio_cb cb, void *data) |
|
|
396 | |
|
|
397 | Like C<mlockall>, but the flag value constants are called |
|
|
398 | C<EIO_MCL_CURRENT> and C<EIO_MCL_FUTURE>. |
|
|
399 | |
|
|
400 | =item eio_msync (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data) |
|
|
401 | |
|
|
402 | Just like msync, except that the flag values are called C<EIO_MS_ASYNC>, |
|
|
403 | C<EIO_MS_INVALIDATE> and C<EIO_MS_SYNC>. |
|
|
404 | |
|
|
405 | =item eio_readlink (const char *path, int pri, eio_cb cb, void *data) |
|
|
406 | |
|
|
407 | If successful, the path read by C<readlink(2)> can be accessed via C<< |
|
|
408 | req->ptr2 >> and is I<NOT> null-terminated, with the length specified as |
|
|
409 | C<< req->result >>. |
|
|
410 | |
|
|
411 | if (req->result >= 0) |
|
|
412 | { |
|
|
413 | char *target = strndup ((char *)req->ptr2, req->result); |
|
|
414 | |
|
|
415 | free (target); |
|
|
416 | } |
|
|
417 | |
|
|
418 | =item eio_realpath (const char *path, int pri, eio_cb cb, void *data) |
|
|
419 | |
|
|
420 | Similar to the realpath libc function, but unlike that one, C<< |
|
|
421 | req->result >> is C<-1> on failure. On success, the result is the length |
|
|
422 | of the returned path in C<ptr2> (which is I<NOT> 0-terminated) - this is |
|
|
423 | similar to readlink. |
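Because the path in C<ptr2> is not 0-terminated, callbacks usually have to copy it before treating it as a C string. The following is a minimal sketch of such a copy. The helper name C<counted_to_cstr> is hypothetical and not part of libeio. In a callback it would be called with C<< req->ptr2 >> and C<< req->result >>.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* hypothetical helper: copy len bytes from ptr into a freshly
   allocated, 0-terminated string; returns 0 on error or if len < 0 */
static char *
counted_to_cstr (const void *ptr, ssize_t len)
{
  char *str;

  if (len < 0)
    return 0; /* the request failed, there is no path */

  str = malloc ((size_t)len + 1);

  if (!str)
    return 0;

  memcpy (str, ptr, (size_t)len);
  str[len] = 0;

  return str; /* the caller must free () this */
}
```

The copy also outlives the request, which matters because memory allocated by libeio is freed after the finish callback has been called.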
|
|
424 | |
|
|
425 | =item eio_stat (const char *path, int pri, eio_cb cb, void *data) |
|
|
426 | |
|
|
427 | =item eio_lstat (const char *path, int pri, eio_cb cb, void *data) |
|
|
428 | |
|
|
429 | =item eio_fstat (int fd, int pri, eio_cb cb, void *data) |
|
|
430 | |
|
|
431 | Stats a file - if C<< req->result >> indicates success, then you can |
|
|
432 | access the C<struct stat>-like structure via C<< req->ptr2 >>: |
|
|
433 | |
|
|
434 | EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2; |
|
|
435 | |
|
|
436 | =item eio_statvfs (const char *path, int pri, eio_cb cb, void *data) |
|
|
437 | |
|
|
438 | =item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data) |
|
|
439 | |
|
|
440 | Stats a filesystem - if C<< req->result >> indicates success, then you can |
|
|
441 | access the C<struct statvfs>-like structure via C<< req->ptr2 >>: |
|
|
442 | |
|
|
443 | EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2; |
|
|
444 | |
|
|
445 | =back |
|
|
446 | |
|
|
447 | =head3 READING DIRECTORIES |
|
|
448 | |
|
|
449 | Reading directories sounds simple, but can be rather demanding, especially |
|
|
450 | if you want to do stuff such as traversing a directory hierarchy or |
|
|
451 | processing all files in a directory. Libeio can assist these complex tasks |
|
|
452 | with its C<eio_readdir> call. |
|
|
453 | |
|
|
454 | =over 4 |
|
|
455 | |
|
|
456 | =item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data) |
|
|
457 | |
|
|
458 | This is a very complex call. It basically reads through a whole directory |
|
|
459 | (via the C<opendir>, C<readdir> and C<closedir> calls) and returns either |
|
|
460 | the names or an array of C<struct eio_dirent>, depending on the C<flags> |
|
|
461 | argument. |
|
|
462 | |
|
|
463 | The C<< req->result >> indicates either the number of files found, or |
|
|
464 | C<-1> on error. On success, null-terminated names can be found as C<< req->ptr2 >>, |
|
|
465 | and C<struct eio_dirents>, if requested by C<flags>, can be found via C<< |
|
|
466 | req->ptr1 >>. |
|
|
467 | |
|
|
468 | Here is an example that prints all the names: |
|
|
469 | |
|
|
470 | int i; |
|
|
471 | char *names = (char *)req->ptr2; |
|
|
472 | |
|
|
473 | for (i = 0; i < req->result; ++i) |
|
|
474 | { |
|
|
475 | printf ("name #%d: %s\n", i, names); |
|
|
476 | |
|
|
477 | /* move to next name */ |
|
|
478 | names += strlen (names) + 1; |
|
|
479 | } |
|
|
480 | |
|
|
481 | Pseudo-entries such as F<.> and F<..> are never returned by C<eio_readdir>. |
|
|
482 | |
|
|
483 | C<flags> can be any combination of: |
|
|
484 | |
|
|
485 | =over 4 |
|
|
486 | |
|
|
487 | =item EIO_READDIR_DENTS |
|
|
488 | |
|
|
489 | If this flag is specified, then, in addition to the names in C<ptr2>, |
|
|
490 | also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct |
|
|
491 | eio_dirent> looks like this: |
|
|
492 | |
|
|
493 | struct eio_dirent |
|
|
494 | { |
|
|
495 | int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */ |
|
|
496 | unsigned short namelen; /* size of filename without trailing 0 */ |
|
|
497 | unsigned char type; /* one of EIO_DT_* */ |
|
|
498 | signed char score; /* internal use */ |
|
|
499 | ino_t inode; /* the inode number, if available, otherwise unspecified */ |
|
|
500 | }; |
|
|
501 | |
|
|
502 | The only members you normally would access are C<nameofs>, which is the |
|
|
503 | byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>. |
|
|
504 | |
|
|
505 | C<type> can be one of: |
|
|
506 | |
|
|
507 | C<EIO_DT_UNKNOWN> - if the type is not known (very common) and you have to C<stat> |
|
|
508 | the name yourself if you need to know, |
|
|
509 | one of the "standard" POSIX file types (C<EIO_DT_REG>, C<EIO_DT_DIR>, C<EIO_DT_LNK>, |
|
|
510 | C<EIO_DT_FIFO>, C<EIO_DT_SOCK>, C<EIO_DT_CHR>, C<EIO_DT_BLK>) |
|
|
511 | or some OS-specific type (currently |
|
|
512 | C<EIO_DT_MPC> - multiplexed char device (v7+coherent), |
|
|
513 | C<EIO_DT_NAM> - xenix special named file, |
|
|
514 | C<EIO_DT_MPB> - multiplexed block device (v7+coherent), |
|
|
515 | C<EIO_DT_NWK> - HP-UX network special, |
|
|
516 | C<EIO_DT_CMP> - VxFS compressed, |
|
|
517 | C<EIO_DT_DOOR> - solaris door, or |
|
|
518 | C<EIO_DT_WHT>). |
|
|
519 | |
|
|
520 | This example prints all names and their type: |
|
|
521 | |
|
|
522 | int i; |
|
|
523 | struct eio_dirent *ents = (struct eio_dirent *)req->ptr1; |
|
|
524 | char *names = (char *)req->ptr2; |
|
|
525 | |
|
|
526 | for (i = 0; i < req->result; ++i) |
|
|
527 | { |
|
|
528 | struct eio_dirent *ent = ents + i; |
|
|
529 | char *name = names + ent->nameofs; |
|
|
530 | |
|
|
531 | printf ("name #%d: %s (type %d)\n", i, name, ent->type); |
|
|
532 | } |
|
|
533 | |
|
|
534 | =item EIO_READDIR_DIRS_FIRST |
|
|
535 | |
|
|
536 | When this flag is specified, then the names will be returned in an order |
|
|
537 | where likely directories come first, in optimal C<stat> order. This is |
|
|
538 | useful when you need to quickly find directories, or you want to find all |
|
|
539 | directories while avoiding having to stat() each entry. |
|
|
540 | |
|
|
541 | If the system returns type information in readdir, then this is used |
|
|
542 | to find directories directly. Otherwise, likely directories are names |
|
|
543 | beginning with ".", or otherwise names with no dots, of which names with |
|
|
544 | short names are tried first. |
|
|
545 | |
|
|
546 | =item EIO_READDIR_STAT_ORDER |
|
|
547 | |
|
|
548 | When this flag is specified, then the names will be returned in an order |
|
|
549 | suitable for stat()'ing each one. That is, when you plan to stat() |
|
|
550 | all files in the given directory, then the returned order will likely |
|
|
551 | be fastest. |
|
|
552 | |
|
|
553 | If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then the |
|
|
554 | likely directories come first, resulting in a less optimal stat order. |
|
|
555 | |
|
|
556 | =item EIO_READDIR_FOUND_UNKNOWN |
|
|
557 | |
|
|
558 | This flag should not be specified when calling C<eio_readdir>. Instead, |
|
|
559 | it is being set by C<eio_readdir> (you can access the C<flags> via C<< |
|
|
560 | req->int1 >>), when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The |
|
|
561 | absence of this flag therefore indicates that all C<type>'s are known, |
|
|
562 | which can be used to speed up some algorithms. |
|
|
563 | |
|
|
564 | A typical use case would be to identify all subdirectories within a |
|
|
565 | directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If |
|
|
566 | then this flag is I<NOT> set, then all the entries at the beginning of the |
|
|
567 | returned array of type C<EIO_DT_DIR> are the directories. Otherwise, you |
|
|
568 | should start C<stat()>'ing the entries starting at the beginning of the |
|
|
569 | array, stopping as soon as you found all directories (the count can be |
|
|
570 | deduced by the link count of the directory). |
|
|
571 | |
|
|
572 | =back |
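The use case described for C<EIO_READDIR_FOUND_UNKNOWN> above (all types known, directories sorted first) can be sketched as a simple scan over the leading entries. To keep the sketch compilable without F<eio.h>, it uses stand-ins: C<struct my_dirent>, C<MY_DT_DIR> and C<MY_DT_REG> are hypothetical replacements for C<struct eio_dirent> and the C<EIO_DT_*> constants, and their numeric values are assumptions.

```c
#include <stddef.h>

/* stand-ins for the libeio definitions; names and values are assumptions */
enum { MY_DT_UNKNOWN = 0, MY_DT_DIR = 4, MY_DT_REG = 8 };

struct my_dirent
{
  int nameofs;            /* offset of the name in the names buffer */
  unsigned short namelen; /* length of the name without trailing 0 */
  unsigned char type;     /* one of MY_DT_* */
};

/* count the directories at the start of a dirs-first array in which
   every type is known (i.e. the "found unknown" flag is not set) */
static size_t
leading_dirs (const struct my_dirent *ents, size_t count)
{
  size_t n = 0;

  while (n < count && ents[n].type == MY_DT_DIR)
    ++n;

  return n;
}
```

With real libeio data, C<ents> would be C<< req->ptr1 >> and C<count> would be C<< req->result >>. When the flag I<is> set, this scan is not sufficient and the entries have to be C<stat>'ed as described above.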
|
|
573 | |
|
|
574 | =back |
|
|
575 | |
|
|
576 | =head3 OS-SPECIFIC CALL WRAPPERS |
|
|
577 | |
|
|
578 | These wrap OS-specific calls (usually Linux ones), and might or might not |
|
|
579 | be emulated on other operating systems. Calls that are not emulated will |
|
|
580 | return C<-1> and set C<errno> to C<ENOSYS>. |
|
|
581 | |
|
|
582 | =over 4 |
|
|
583 | |
|
|
584 | =item eio_sendfile (int out_fd, int in_fd, off_t in_offset, size_t length, int pri, eio_cb cb, void *data) |
|
|
585 | |
|
|
586 | Wraps the C<sendfile> syscall. The arguments follow the Linux version, but |
|
|
587 | libeio supports and will use similar calls on FreeBSD, HP/UX, Solaris and |
|
|
588 | Darwin. |
|
|
589 | |
|
|
590 | If the OS doesn't support some sendfile-like call, or the call fails, |
|
|
591 | indicating a lack of support for the given file descriptor type (for example, |
|
|
592 | Linux's sendfile might not support file to file copies), then libeio will |
|
|
593 | emulate the call in userspace, so there are almost no limitations on its |
|
|
594 | use. |
|
|
595 | |
|
|
596 | =item eio_readahead (int fd, off_t offset, size_t length, int pri, eio_cb cb, void *data) |
|
|
597 | |
|
|
598 | Calls C<readahead(2)>. If the syscall is missing, then the call is |
|
|
599 | emulated by simply reading the data (currently in 64kiB chunks). |
|
|
600 | |
|
|
601 | =item eio_syncfs (int fd, int pri, eio_cb cb, void *data) |
|
|
602 | |
|
|
603 | Calls Linux' C<syncfs> syscall, if available. Returns C<-1> and sets |
|
|
604 | C<errno> to C<ENOSYS> if the call is missing I<but still calls sync()>, |
|
|
605 | if the C<fd> is C<< >= 0 >>, so you can probe for the availability of the |
|
|
606 | syscall with a negative C<fd> argument and checking for C<-1/ENOSYS>. |
|
|
607 | |
|
|
608 | =item eio_sync_file_range (int fd, off_t offset, size_t nbytes, unsigned int flags, int pri, eio_cb cb, void *data) |
|
|
609 | |
|
|
610 | Calls C<sync_file_range>. If the syscall is missing, then this is the same |
|
|
611 | as calling C<fdatasync>. |
|
|
612 | |
|
|
613 | Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>, |
|
|
614 | C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>. |
|
|
615 | |
|
|
616 | =item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data) |
|
|
617 | |
|
|
618 | Calls C<fallocate> (note: I<NOT> C<posix_fallocate>!). If the syscall is |
|
|
619 | missing, then it returns failure and sets C<errno> to C<ENOSYS>. |
|
|
620 | |
|
|
621 | The C<mode> argument can be C<0> (for behaviour similar to |
|
|
622 | C<posix_fallocate>), or C<EIO_FALLOC_FL_KEEP_SIZE>, which keeps the size |
|
|
623 | of the file unchanged (but still preallocates space beyond end of file). |
|
|
624 | |
|
|
625 | =back |
|
|
626 | |
|
|
627 | =head3 LIBEIO-SPECIFIC REQUESTS |
|
|
628 | |
|
|
629 | These requests are specific to libeio and do not correspond to any OS call. |
|
|
630 | |
|
|
631 | =over 4 |
|
|
632 | |
|
|
633 | =item eio_mtouch (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data) |
|
|
634 | |
|
|
635 | Reads (C<flags == 0>) or modifies (C<flags == EIO_MT_MODIFY>) the given |
|
|
636 | memory area, page-wise, that is, it reads (or reads and writes back) the |
|
|
637 | first octet of every page that spans the memory area. |
|
|
638 | |
|
|
639 | This can be used to page in some mmapped file, or dirty some pages. Note |
|
|
640 | that dirtying is an unlocked read-write access, so races can ensue when |
|
|
641 | some other thread modifies the data stored in that memory area. |
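The page-wise access described above can be sketched as follows. This is an illustration of the technique only, not libeio's implementation. The name C<touch_pages> and its C<modify> parameter are hypothetical, and the page size is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* touch the first octet of every page overlapping [addr, addr + length);
   returns the number of pages touched */
static size_t
touch_pages (void *addr, size_t length, int modify)
{
  size_t page = (size_t)sysconf (_SC_PAGESIZE);
  uintptr_t cur = (uintptr_t)addr & ~(uintptr_t)(page - 1); /* round down */
  uintptr_t end = (uintptr_t)addr + length;
  size_t touched = 0;

  if (!length)
    return 0;

  for (; cur < end; cur += page)
    {
      volatile char *p = (volatile char *)cur;

      if (modify)
        *p = *p;  /* unlocked read and write-back, dirties the page */
      else
        (void)*p; /* read only, pages the data in */

      ++touched;
    }

  return touched;
}
```

The write-back in the C<modify> case is exactly the unlocked read-write access that can race with other threads changing the same octet.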
|
|
642 | |
|
|
643 | =item eio_custom (void (*execute)(eio_req *), int pri, eio_cb cb, void *data) |
|
|
644 | |
|
|
645 | Executes a custom request, i.e., a user-specified callback. |
|
|
646 | |
|
|
647 | The callback gets the C<eio_req *> as parameter and is expected to read |
|
|
648 | and modify any request-specific members. Specifically, it should set C<< |
|
|
649 | req->result >> to the result value, just like other requests. |
|
|
650 | |
|
|
651 | Here is an example that simply calls C<open>, like C<eio_open>, but it |
|
|
652 | uses the C<data> member as filename and uses a hardcoded C<O_RDONLY>. If |
|
|
653 | you want to pass more/other parameters, you either need to pass some |
|
|
654 | struct or so via C<data> or provide your own wrapper using the low-level |
|
|
655 | API. |
|
|
656 | |
|
|
657 | static int |
|
|
658 | my_open_done (eio_req *req) |
|
|
659 | { |
|
|
660 | int fd = req->result; |
|
|
661 | |
|
|
662 | return 0; |
|
|
663 | } |
|
|
664 | |
|
|
665 | static void |
|
|
666 | my_open (eio_req *req) |
|
|
667 | { |
|
|
668 | req->result = open (req->data, O_RDONLY); |
|
|
669 | } |
|
|
670 | |
|
|
671 | eio_custom (my_open, 0, my_open_done, "/etc/passwd"); |
|
|
672 | |
|
|
673 | =item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data) |
|
|
674 | |
|
|
675 | This is a request that takes C<delay> seconds to execute, but otherwise |
|
|
676 | does nothing - it simply puts one of the worker threads to sleep for this |
|
|
677 | long. |
|
|
678 | |
|
|
679 | This request can be used to artificially increase load, e.g. for debugging |
|
|
680 | or benchmarking reasons. |
|
|
681 | |
|
|
682 | =item eio_nop (int pri, eio_cb cb, void *data) |
|
|
683 | |
|
|
684 | This request does nothing, except go through the whole request cycle. This |
|
|
685 | can be used to measure latency or in some cases to simplify code, but is |
|
|
686 | not really of much use. |
|
|
687 | |
|
|
688 | =back |
|
|
689 | |
|
|
690 | =head3 GROUPING AND LIMITING REQUESTS |
|
|
691 | |
|
|
692 | There is one more rather special request, C<eio_grp>. It is a very special |
|
|
693 | eio request: Instead of doing something, it is a container for other eio |
|
|
694 | requests. |
|
|
695 | |
|
|
696 | There are two primary use cases for this: a) bundling many requests into a |
|
|
697 | single, composite, request with a definite callback and the ability to |
|
|
698 | cancel the whole request with its subrequests and b) limiting the number |
|
|
699 | of "active" requests. |
|
|
700 | |
|
|
701 | Further below you will find more discussion of these topics - first |
|
|
702 | follows the reference section detailing the request generator and other |
|
|
703 | methods. |
|
|
704 | |
|
|
705 | =over 4 |
|
|
706 | |
|
|
707 | =item eio_req *grp = eio_grp (eio_cb cb, void *data) |
|
|
708 | |
|
|
709 | Creates, submits and returns a group request. Note that it doesn't have a |
|
|
710 | priority, unlike all other requests. |
|
|
711 | |
|
|
712 | =item eio_grp_add (eio_req *grp, eio_req *req) |
|
|
713 | |
|
|
714 | Adds a request to the request group. |
|
|
715 | |
|
|
716 | =item eio_grp_cancel (eio_req *grp) |
|
|
717 | |
|
|
718 | Cancels all requests I<in> the group, but I<not> the group request |
|
|
719 | itself. You can cancel the group request I<and> all subrequests via a |
|
|
720 | normal C<eio_cancel> call. |
|
|
721 | |
|
|
722 | =back |
|
|
723 | |
|
|
724 | =head4 GROUP REQUEST LIFETIME |
|
|
725 | |
|
|
726 | Left alone, a group request will instantly move to the pending state and |
|
|
727 | will be finished at the next call of C<eio_poll>. |
|
|
728 | |
|
|
729 | The usefulness stems from the fact that, if a subrequest is added to a |
|
|
730 | group I<before> a call to C<eio_poll>, via C<eio_grp_add>, then the group |
|
|
731 | will not finish until all the subrequests have finished. |
|
|
732 | |
|
|
733 | So the usage cycle of a group request is like this: after it is created, |
|
|
734 | you normally instantly add a subrequest. If none is added, the group |
|
|
735 | request will finish on its own. As long as subrequests are added before


736 | the group request is finished, it will be kept from finishing. That is, the
|
|
737 | callbacks of any subrequests can, in turn, add more requests to the group, |
|
|
738 | and as long as any requests are active, the group request itself will not |
|
|
739 | finish. |
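
The lifetime rule above can be illustrated with a small, self-contained model. This is only a sketch: C<struct toy_grp> and the C<toy_*> functions are hypothetical stand-ins invented for illustration, not part of libeio, and the counter is merely one plausible way to picture "the group does not finish while subrequests are outstanding".

```c
#include <assert.h>

/* Hypothetical stand-in for a group request; this is NOT libeio's eio_req. */
struct toy_grp {
    int pending;   /* subrequests added but not yet finished */
    int finished;  /* non-zero once the group itself has finished */
};

/* Models adding a subrequest: only possible before the group has finished. */
int toy_grp_add(struct toy_grp *grp)
{
    if (grp->finished)
        return -1;
    grp->pending++;
    return 0;
}

/* Models a subrequest finishing. */
void toy_sub_done(struct toy_grp *grp)
{
    assert(grp->pending > 0);
    grp->pending--;
}

/* Models one poll pass: the group finishes only when nothing is pending.
 * Returns non-zero if the group has finished. */
int toy_poll(struct toy_grp *grp)
{
    if (!grp->finished && grp->pending == 0)
        grp->finished = 1;
    return grp->finished;
}
```

In this model, a subrequest callback that calls C<toy_grp_add> before the next C<toy_poll> keeps the group alive, which mirrors the behaviour described above.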
|
|
740 | |
|
|
741 | =head4 CREATING COMPOSITE REQUESTS |
|
|
742 | |
|
|
743 | Imagine you wanted to create an C<eio_load> request that opens a file, |
|
|
744 | reads it and closes it. This means it has to execute at least three eio |
|
|
745 | requests, but for various reasons it might be nice if that request looked |
|
|
746 | like any other eio request. |
|
|
747 | |
|
|
748 | This can be done with groups: |
|
|
749 | |
|
|
750 | =over 4 |
|
|
751 | |
|
|
752 | =item 1) create the request object |
|
|
753 | |
|
|
754 | Create a group that contains all further requests. This is the request you |
|
|
755 | can return as "the load request". |
|
|
756 | |
|
|
757 | =item 2) open the file, maybe |
|
|
758 | |
|
|
759 | Next, open the file with C<eio_open> and add the request to the group |
|
|
760 | request and you are finished setting up the request. |
|
|
761 | |
|
|
762 | If, for some reason, you cannot C<eio_open> (path is a null pointer?), you
|
|
763 | can set C<< grp->result >> to C<-1> to signal an error and let the group |
|
|
764 | request finish on its own. |
|
|
765 | |
|
|
766 | =item 3) open callback adds more requests |
|
|
767 | |
|
|
768 | In the open callback, if the open was not successful, copy C<< |
|
|
769 | req->errorno >> to C<< grp->errorno >> and set C<< grp->result >> to |
|
|
770 | C<-1> to signal an error. |
|
|
771 | |
|
|
772 | Otherwise, allocate a buffer (for example with C<malloc>) and issue a read


773 | request, adding it to the group.
|
|
774 | |
|
|
775 | =item 4) continue issuing requests till finished |
|
|
776 | |
|
|
777 | In the read callback, check for errors and possibly continue with |
|
|
778 | C<eio_close> or any other eio request in the same way. |
|
|
779 | |
|
|
780 | As soon as no new requests are added, the group request will finish. Make |
|
|
781 | sure you I<always> set C<< grp->result >> to some sensible value. |
|
|
782 | |
|
|
783 | =back |
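
The error handling in step 3 can be sketched in isolation. The block below is a minimal illustration under stated assumptions: C<struct toy_req> is a hypothetical stand-in that carries only the C<result> and C<errorno> fields mentioned in the steps above, and C<toy_open_done> is an invented helper name, not a libeio function. A real open callback would receive an C<eio_req> and would find the group through whatever user data it was given.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical stand-in with only the two fields the steps above mention. */
struct toy_req {
    ssize_t result;   /* -1 signals failure */
    int     errorno;  /* errno value of the failed operation */
};

/* Step 3 as a plain function: returns 1 if the caller should go on to
 * issue the read request, or 0 if the group was marked as failed. */
int toy_open_done(const struct toy_req *open_req, struct toy_req *grp)
{
    if (open_req->result < 0) {
        grp->errorno = open_req->errorno;  /* copy the error to the group */
        grp->result  = -1;                 /* signal failure to the user */
        return 0;
    }

    return 1;
}
```

When the helper returns 0, no further request is added, so the group finishes on its own with the error already recorded.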
|
|
784 | |
|
|
785 | =head4 REQUEST LIMITING |
|
|
786 | |
|
|
787 | |
|
|
788 | #TODO |
|
|
789 | |
|
|
790 | void eio_grp_limit (eio_req *grp, int limit); |
|
|
791 | |
|
|
792 | |
|
|
793 | =back |
|
|
794 | |
|
|
795 | |
|
|
796 | =head1 LOW LEVEL REQUEST API |
|
|
797 | |
|
|
798 | #TODO |
|
|
799 | |
|
|
800 | |
|
|
801 | =head1 ANATOMY AND LIFETIME OF AN EIO REQUEST |
|
|
802 | |
|
|
803 | A request is represented by a structure of type C<eio_req>. To initialise |
|
|
804 | it, clear it to all zero bytes: |
|
|
805 | |
|
|
806 | eio_req req; |
|
|
807 | |
|
|
808 | memset (&req, 0, sizeof (req)); |
|
|
809 | |
|
|
810 | A more common way to initialise a new C<eio_req> is to use C<calloc>: |
|
|
811 | |
|
|
812 | eio_req *req = calloc (1, sizeof (*req)); |
|
|
813 | |
|
|
814 | In either case, libeio neither allocates, initialises nor frees the
|
|
815 | C<eio_req> structure for you - it merely uses it. |
|
|
816 | |
|
|
817 | zero |
|
|
818 | |
|
|
819 | #TODO |
137 | |
820 | |
138 | =head2 CONFIGURATION |
821 | =head2 CONFIGURATION |
139 | |
822 | |
140 | The functions in this section can sometimes be useful, but the default |
823 | The functions in this section can sometimes be useful, but the default |
141 | configuration will do in most cases, so you should skip this section on
824 | configuration will do in most cases, so you should skip this section on
… | |
… | |
152 | for example, in interactive programs, you might want to limit this time to |
835 | for example, in interactive programs, you might want to limit this time to |
153 | C<0.01> seconds or so. |
836 | C<0.01> seconds or so. |
154 | |
837 | |
155 | Note that: |
838 | Note that: |
156 | |
839 | |
|
|
840 | =over 4 |
|
|
841 | |
157 | a) libeio doesn't know how long your request callbacks take, so the time |
842 | =item a) libeio doesn't know how long your request callbacks take, so the |
158 | spent in C<eio_poll> is up to one callback invocation longer than this
843 | time spent in C<eio_poll> is up to one callback invocation longer than
159 | interval. |
844 | this interval. |
160 | |
845 | |
161 | b) this is implemented by calling C<gettimeofday> after each request, |
846 | =item b) this is implemented by calling C<gettimeofday> after each |
162 | which can be costly. |
847 | request, which can be costly. |
163 | |
848 | |
164 | c) at least one request will be handled. |
849 | =item c) at least one request will be handled. |
|
|
850 | |
|
|
851 | =back |
165 | |
852 | |
166 | =item eio_set_max_poll_reqs (unsigned int nreqs) |
853 | =item eio_set_max_poll_reqs (unsigned int nreqs) |
167 | |
854 | |
168 | When C<nreqs> is non-zero, then C<eio_poll> will not handle more than |
855 | When C<nreqs> is non-zero, then C<eio_poll> will not handle more than |
169 | C<nreqs> requests per invocation. This is a less costly way to limit the |
856 | C<nreqs> requests per invocation. This is a less costly way to limit the |
… | |
… | |
214 | executed and have results, but have not been finished yet by a call to |
901 | executed and have results, but have not been finished yet by a call to |
215 | C<eio_poll>). |
902 | C<eio_poll>). |
216 | |
903 | |
217 | =back |
904 | =back |
218 | |
905 | |
219 | |
|
|
220 | =head1 ANATOMY OF AN EIO REQUEST |
|
|
221 | |
|
|
222 | #TODO |
|
|
223 | |
|
|
224 | |
|
|
225 | =head1 HIGH LEVEL REQUEST API |
|
|
226 | |
|
|
227 | #TODO |
|
|
228 | |
|
|
229 | =back |
|
|
230 | |
|
|
231 | |
|
|
232 | =head1 LOW LEVEL REQUEST API |
|
|
233 | |
|
|
234 | #TODO |
|
|
235 | |
|
|
236 | =head1 EMBEDDING |
906 | =head1 EMBEDDING |
237 | |
907 | |
238 | Libeio can be embedded directly into programs. This functionality is not |
908 | Libeio can be embedded directly into programs. This functionality is not |
239 | documented and not (yet) officially supported. |
909 | documented and not (yet) officially supported. |
240 | |
910 | |
… | |
… | |
256 | This symbol governs the stack size for each eio thread. Libeio itself |
926 | This symbol governs the stack size for each eio thread. Libeio itself |
257 | was written to use very little stackspace, but when using C<EIO_CUSTOM> |
927 | was written to use very little stackspace, but when using C<EIO_CUSTOM> |
258 | requests, you might want to increase this. |
928 | requests, you might want to increase this. |
259 | |
929 | |
260 | If this symbol is undefined (the default) then libeio will use its default |
930 | If this symbol is undefined (the default) then libeio will use its default |
261 | stack size (C<sizeof (long) * 4096> currently). If it is defined, but |
931 | stack size (C<sizeof (void *) * 4096> currently). If it is defined, but |
262 | C<0>, then the default operating system stack size will be used. In all |
932 | C<0>, then the default operating system stack size will be used. In all |
263 | other cases, the value must be an expression that evaluates to the desired |
933 | other cases, the value must be an expression that evaluates to the desired |
264 | stack size. |
934 | stack size. |
265 | |
935 | |
266 | =back |
936 | =back |