… | |
… | |
11 | The newest version of this document is also available as an html-formatted |
11 | The newest version of this document is also available as an html-formatted |
12 | web page you might find easier to navigate when reading it for the first |
12 | web page you might find easier to navigate when reading it for the first |
13 | time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>. |
13 | time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>. |
14 | |
14 | |
15 | Note that this library is a by-product of the C<IO::AIO> perl |
15 | Note that this library is a by-product of the C<IO::AIO> perl |
16 | module, and many of the subtler points regarding requets lifetime |
16 | module, and many of the subtler points regarding requests lifetime |
17 | and so on are only documented in its documentation at the |
17 | and so on are only documented in its documentation at the |
18 | moment: L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>. |
18 | moment: L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>. |
19 | |
19 | |
20 | =head2 FEATURES |
20 | =head2 FEATURES |
21 | |
21 | |
22 | This library provides fully asynchronous versions of most POSIX functions |
22 | This library provides fully asynchronous versions of most POSIX functions |
23 | dealign with I/O. Unlike most asynchronous libraries, this not only |
23 | dealing with I/O. Unlike most asynchronous libraries, this not only |
24 | includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and |
24 | includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and |
25 | similar functions, as well as rarer ones such as C<mknod>, C<futime> |
25 | similar functions, as well as rarer ones such as C<mknod>, C<futime> |
26 | or C<readlink>. |
26 | or C<readlink>. |
27 | |
27 | |
28 | It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and |
28 | It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and |
29 | FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with |
29 | FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with |
30 | emulation elsewhere). |
30 | emulation elsewhere). |
31 | |
31 | |
32 | The goal is to enbale you to write fully non-blocking programs. For |
32 | The goal is to enable you to write fully non-blocking programs. For |
33 | example, in a game server, you would not want to freeze for a few seconds |
33 | example, in a game server, you would not want to freeze for a few seconds |
34 | just because the server is running a backup and you happen to call |
34 | just because the server is running a backup and you happen to call |
35 | C<readdir>. |
35 | C<readdir>. |
36 | |
36 | |
37 | =head2 TIME REPRESENTATION |
37 | =head2 TIME REPRESENTATION |
38 | |
38 | |
39 | Libeio represents time as a single floating point number, representing the |
39 | Libeio represents time as a single floating point number, representing the |
40 | (fractional) number of seconds since the (POSIX) epoch (somewhere near |
40 | (fractional) number of seconds since the (POSIX) epoch (somewhere near |
41 | the beginning of 1970, details are complicated, don't ask). This type is |
41 | the beginning of 1970, details are complicated, don't ask). This type is |
42 | called C<eio_tstamp>, but it is guarenteed to be of type C<double> (or |
42 | called C<eio_tstamp>, but it is guaranteed to be of type C<double> (or |
43 | better), so you can freely use C<double> yourself. |
43 | better), so you can freely use C<double> yourself. |
44 | |
44 | |
45 | Unlike the name component C<stamp> might indicate, it is also used for |
45 | Unlike the name component C<stamp> might indicate, it is also used for |
46 | time differences throughout libeio. |
46 | time differences throughout libeio. |
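As a rough illustration of this representation, the sketch below converts a fractional-seconds value into a C<struct timespec> and back. The C<tstamp> typedef and both helper names are stand-ins invented for this example and are not part of libeio; the only thing taken from the text above is the assumption that C<eio_tstamp> is compatible with C<double>.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <time.h>

/* stand-in for eio_tstamp, which the text says is at least a double */
typedef double tstamp;

/* hypothetical helper: split fractional seconds into a timespec */
static struct timespec
tstamp_to_timespec (tstamp t)
{
  struct timespec ts;

  ts.tv_sec = (time_t)t;            /* the cast truncates toward zero */
  if ((double)ts.tv_sec > t)
    --ts.tv_sec;                    /* so round down for negative values */

  ts.tv_nsec = (long)((t - (double)ts.tv_sec) * 1e9);
  if (ts.tv_nsec >= 1000000000L)
    ts.tv_nsec = 999999999L;        /* guard against rounding up to 1s */

  return ts;
}

/* hypothetical helper: the inverse conversion */
static tstamp
timespec_to_tstamp (struct timespec ts)
{
  assert (ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);

  return (tstamp)ts.tv_sec + ts.tv_nsec * 1e-9;
}
```

The same helpers work for time differences, since those use the same type.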
47 | |
47 | |
… | |
… | |
55 | 3. in the parent, continue business as usual, done |
55 | 3. in the parent, continue business as usual, done |
56 | 4. in the child, destroy all ready and pending requests and free the |
56 | 4. in the child, destroy all ready and pending requests and free the |
57 | memory used by the worker threads. This gives you a fully empty |
57 | memory used by the worker threads. This gives you a fully empty |
58 | libeio queue. |
58 | libeio queue. |
59 | |
59 | |
|
|
60 | Note, however, since libeio does use threads, the above guarantee doesn't |
|
|
61 | cover your libc, for example, malloc and other libc functions are not |
|
|
62 | fork-safe, so there is very little you can do after a fork, and in fact, |
|
|
63 | the above might crash, and thus change. |
|
|
64 | |
60 | =head1 INITIALISATION/INTEGRATION |
65 | =head1 INITIALISATION/INTEGRATION |
61 | |
66 | |
62 | Before you can call any eio functions you first have to initialise the |
67 | Before you can call any eio functions you first have to initialise the |
63 | library. The library integrates into any event loop, but can also be used |
68 | library. The library integrates into any event loop, but can also be used |
64 | without one, including in polling mode. |
69 | without one, including in polling mode. |
… | |
… | |
97 | handled or C<done_poll> has been called, which signals the same. |
102 | handled or C<done_poll> has been called, which signals the same. |
98 | |
103 | |
99 | Note that C<eio_poll> might return after C<done_poll> and C<want_poll> |
104 | Note that C<eio_poll> might return after C<done_poll> and C<want_poll> |
100 | have been called again, so watch out for races in your code. |
105 | have been called again, so watch out for races in your code. |
101 | |
106 | |
102 | As with C<want_poll>, this callback is called while lcoks are being held, |
107 | As with C<want_poll>, this callback is called while locks are being held, |
103 | so you I<must not call any libeio functions from within this callback>. |
108 | so you I<must not call any libeio functions from within this callback>. |
104 | |
109 | |
105 | =item int eio_poll () |
110 | =item int eio_poll () |
106 | |
111 | |
107 | This function has to be called whenever there are pending requests that |
112 | This function has to be called whenever there are pending requests that |
… | |
… | |
119 | =back |
124 | =back |
120 | |
125 | |
121 | For libev, you would typically use an C<ev_async> watcher: the |
126 | For libev, you would typically use an C<ev_async> watcher: the |
122 | C<want_poll> callback would invoke C<ev_async_send> to wake up the event |
127 | C<want_poll> callback would invoke C<ev_async_send> to wake up the event |
123 | loop. Inside the callback set for the watcher, one would call C<eio_poll |
128 | loop. Inside the callback set for the watcher, one would call C<eio_poll |
124 | ()> (followed by C<ev_async_send> again if C<eio_poll> indicates that not |
129 | ()>. |
125 | all requests have been handled yet). The race is taken care of because |
130 | |
126 | libev resets/rearms the async watcher before calling your callback, |
131 | If C<eio_poll ()> is configured to not handle all results in one go |
127 | and therefore, before calling C<eio_poll>. This might result in (some) |
132 | (i.e. it returns C<-1>) then you should start an idle watcher that calls |
128 | spurious wake-ups, but is generally harmless. |
133 | C<eio_poll> until it returns something C<!= -1>. |
|
|
134 | |
|
|
135 | A full-featured connector between libeio and libev would look as follows |
|
|
136 | (if C<eio_poll> is handling all requests, it can of course be simplified a |
|
|
137 | lot by removing the idle watcher logic): |
|
|
138 | |
|
|
139 | static struct ev_loop *loop; |
|
|
140 | static ev_idle repeat_watcher; |
|
|
141 | static ev_async ready_watcher; |
|
|
142 | |
|
|
143 | /* idle watcher callback, only used when eio_poll */ |
|
|
144 | /* didn't handle all results in one call */ |
|
|
145 | static void |
|
|
146 | repeat (EV_P_ ev_idle *w, int revents) |
|
|
147 | { |
|
|
148 | if (eio_poll () != -1) |
|
|
149 | ev_idle_stop (EV_A_ w); |
|
|
150 | } |
|
|
151 | |
|
|
152 | /* eio has some results, process them */ |
|
|
153 | static void |
|
|
154 | ready (EV_P_ ev_async *w, int revents) |
|
|
155 | { |
|
|
156 | if (eio_poll () == -1) |
|
|
157 | ev_idle_start (EV_A_ &repeat_watcher); |
|
|
158 | } |
|
|
159 | |
|
|
160 | /* wake up the event loop */ |
|
|
161 | static void |
|
|
162 | want_poll (void) |
|
|
163 | { |
|
|
164 | ev_async_send (loop, &ready_watcher); |
|
|
165 | } |
|
|
166 | |
|
|
167 | void |
|
|
168 | my_init_eio () |
|
|
169 | { |
|
|
170 | loop = EV_DEFAULT; |
|
|
171 | |
|
|
172 | ev_idle_init (&repeat_watcher, repeat); |
|
|
173 | ev_async_init (&ready_watcher, ready); |
|
|
174 | ev_async_start (loop, &ready_watcher); |
|
|
175 | |
|
|
176 | eio_init (want_poll, 0); |
|
|
177 | } |
129 | |
178 | |
130 | For most other event loops, you would typically use a pipe - the event |
179 | For most other event loops, you would typically use a pipe - the event |
131 | loop should be told to wait for read readyness on the read end. In |
180 | loop should be told to wait for read readiness on the read end. In |
132 | C<want_poll> you would write a single byte, in C<done_poll> you would try |
181 | C<want_poll> you would write a single byte, in C<done_poll> you would try |
133 | to read that byte, and in the callback for the read end, you would call |
182 | to read that byte, and in the callback for the read end, you would call |
134 | C<eio_poll>. The race is avoided here because the event loop should invoke |
183 | C<eio_poll>. |
135 | your callback again and again until the byte has been read (as the pipe |
184 | |
136 | read callback does not read it, only C<done_poll>). |
185 | You don't have to take special care in the case C<eio_poll> doesn't handle |
|
|
186 | all requests, as the done callback will not be invoked, so the event loop |
|
|
187 | will still signal readiness for the pipe until I<all> results have been |
|
|
188 | processed. |
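The pipe mechanism described above can be sketched with plain POSIX calls. This is only an illustration under stated assumptions: C<wake_fd>, C<wake_init> and C<wake_pending> are names made up for this example, and the registration of the read end with an event loop and the call to C<eio_poll> are left out. Both ends are made non-blocking so that neither callback can stall while libeio holds its locks, and neither callback calls back into libeio.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int wake_fd[2] = { -1, -1 }; /* [0] is the read end, [1] the write end */

/* hypothetical setup helper: create the pipe, both ends non-blocking */
static int
wake_init (void)
{
  int i;

  if (pipe (wake_fd) < 0)
    return -1;

  for (i = 0; i < 2; ++i)
    {
      int fl = fcntl (wake_fd[i], F_GETFL);

      if (fl < 0 || fcntl (wake_fd[i], F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    }

  return 0;
}

/* would be passed as want_poll: make the read end readable */
static void
want_poll (void)
{
  char dummy = 0;
  ssize_t r = write (wake_fd[1], &dummy, 1);

  (void)r; /* a full pipe already means "readable", so errors are ignored */
}

/* would be passed as done_poll: consume the byte again */
static void
done_poll (void)
{
  char dummy;
  ssize_t r = read (wake_fd[0], &dummy, 1);

  (void)r;
}

/* illustration only: is the read end readable right now? */
static int
wake_pending (void)
{
  struct pollfd pfd;

  assert (wake_fd[0] >= 0);

  pfd.fd      = wake_fd[0];
  pfd.events  = POLLIN;
  pfd.revents = 0;

  return poll (&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) != 0;
}
```

In a real program the two callbacks would be handed to C<eio_init>, and the event loop's read callback for C<wake_fd[0]> would call C<eio_poll> without reading from the pipe itself.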
|
|
189 | |
|
|
190 | |
|
|
191 | =head1 HIGH LEVEL REQUEST API |
|
|
192 | |
|
|
193 | Libeio has both a high-level API, which consists of calling a request |
|
|
194 | function with a callback to be called on completion, and a low-level API |
|
|
195 | where you fill out request structures and submit them. |
|
|
196 | |
|
|
197 | This section describes the high-level API. |
|
|
198 | |
|
|
199 | =head2 REQUEST SUBMISSION AND RESULT PROCESSING |
|
|
200 | |
|
|
201 | You submit a request by calling the relevant C<eio_TYPE> function with the |
|
|
202 | required parameters, a callback of type C<int (*eio_cb)(eio_req *req)> |
|
|
203 | (called C<eio_cb> below) and a freely usable C<void *data> argument. |
|
|
204 | |
|
|
205 | The return value will either be 0, in case something went really wrong |
|
|
206 | (which can basically only happen on very fatal errors, such as C<malloc> |
|
|
207 | returning 0, which is rather unlikely), or a pointer to the newly-created |
|
|
208 | and submitted C<eio_req *>. |
|
|
209 | |
|
|
210 | The callback will be called with an C<eio_req *> which contains the |
|
|
211 | results of the request. The members you can access inside that structure |
|
|
212 | vary from request to request, except for: |
|
|
213 | |
|
|
214 | =over 4 |
|
|
215 | |
|
|
216 | =item C<ssize_t result> |
|
|
217 | |
|
|
218 | This contains the result value from the call (usually the same as the |
|
|
219 | syscall of the same name). |
|
|
220 | |
|
|
221 | =item C<int errorno> |
|
|
222 | |
|
|
223 | This contains the value of C<errno> after the call. |
|
|
224 | |
|
|
225 | =item C<void *data> |
|
|
226 | |
|
|
227 | The C<void *data> member simply stores the value of the C<data> argument. |
|
|
228 | |
|
|
229 | =back |
|
|
230 | |
|
|
231 | The return value of the callback is normally C<0>, which tells libeio to |
|
|
232 | continue normally. If a callback returns a nonzero value, libeio will |
|
|
233 | stop processing results (in C<eio_poll>) and will return the value to its |
|
|
234 | caller. |
|
|
235 | |
|
|
236 | Memory areas passed to libeio must stay valid as long as a request |
|
|
237 | executes, with the exception of paths, which are being copied |
|
|
238 | internally. Any memory libeio itself allocates will be freed after the |
|
|
239 | finish callback has been called. If you want to manage all memory passed |
|
|
240 | to libeio yourself you can use the low-level API. |
|
|
241 | |
|
|
242 | For example, to open a file, you could do this: |
|
|
243 | |
|
|
244 | static int |
|
|
245 | file_open_done (eio_req *req) |
|
|
246 | { |
|
|
247 | if (req->result < 0) |
|
|
248 | { |
|
|
249 | /* open() returned -1 */ |
|
|
250 | errno = req->errorno; |
|
|
251 | perror ("open"); |
|
|
252 | } |
|
|
253 | else |
|
|
254 | { |
|
|
255 | int fd = req->result; |
|
|
256 | /* now we have the new fd in fd */ |
|
|
257 | } |
|
|
258 | |
|
|
259 | return 0; |
|
|
260 | } |
|
|
261 | |
|
|
262 | /* the first three arguments are passed to open(2) */ |
|
|
263 | /* the remaining are priority, callback and data */ |
|
|
264 | if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0)) |
|
|
265 | abort (); /* something went wrong, we will all die!!! */ |
|
|
266 | |
|
|
267 | Note that you additionally need to call C<eio_poll> when C<want_poll> |
|
|
268 | indicates that requests are ready to be processed. |
|
|
269 | |
|
|
270 | =head2 AVAILABLE REQUESTS |
|
|
271 | |
|
|
272 | The following request functions are available. I<All> of them return the |
|
|
273 | C<eio_req *> on success and C<0> on failure, and I<all> of them have the |
|
|
274 | same three trailing arguments: C<pri>, C<cb> and C<data>. The C<cb> is |
|
|
275 | mandatory, but in most cases, you pass in C<0> as C<pri> and C<0> or some |
|
|
276 | custom data value as C<data>. |
|
|
277 | |
|
|
278 | =head3 POSIX API WRAPPERS |
|
|
279 | |
|
|
280 | These requests simply wrap the POSIX call of the same name, with the same |
|
|
281 | arguments. If a function is not implemented by the OS and cannot be emulated |
|
|
282 | in some way, then all of these return C<-1> and set C<errorno> to C<ENOSYS>. |
|
|
283 | |
|
|
284 | =over 4 |
|
|
285 | |
|
|
286 | =item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
287 | |
|
|
288 | =item eio_truncate (const char *path, off_t offset, int pri, eio_cb cb, void *data) |
|
|
289 | |
|
|
290 | =item eio_chown (const char *path, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data) |
|
|
291 | |
|
|
292 | =item eio_chmod (const char *path, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
293 | |
|
|
294 | =item eio_mkdir (const char *path, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
295 | |
|
|
296 | =item eio_rmdir (const char *path, int pri, eio_cb cb, void *data) |
|
|
297 | |
|
|
298 | =item eio_unlink (const char *path, int pri, eio_cb cb, void *data) |
|
|
299 | |
|
|
300 | =item eio_utime (const char *path, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data) |
|
|
301 | |
|
|
302 | =item eio_mknod (const char *path, mode_t mode, dev_t dev, int pri, eio_cb cb, void *data) |
|
|
303 | |
|
|
304 | =item eio_link (const char *path, const char *new_path, int pri, eio_cb cb, void *data) |
|
|
305 | |
|
|
306 | =item eio_symlink (const char *path, const char *new_path, int pri, eio_cb cb, void *data) |
|
|
307 | |
|
|
308 | =item eio_rename (const char *path, const char *new_path, int pri, eio_cb cb, void *data) |
|
|
309 | |
|
|
310 | =item eio_mlock (void *addr, size_t length, int pri, eio_cb cb, void *data) |
|
|
311 | |
|
|
312 | =item eio_close (int fd, int pri, eio_cb cb, void *data) |
|
|
313 | |
|
|
314 | =item eio_sync (int pri, eio_cb cb, void *data) |
|
|
315 | |
|
|
316 | =item eio_fsync (int fd, int pri, eio_cb cb, void *data) |
|
|
317 | |
|
|
318 | =item eio_fdatasync (int fd, int pri, eio_cb cb, void *data) |
|
|
319 | |
|
|
320 | =item eio_futime (int fd, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data) |
|
|
321 | |
|
|
322 | =item eio_ftruncate (int fd, off_t offset, int pri, eio_cb cb, void *data) |
|
|
323 | |
|
|
324 | =item eio_fchmod (int fd, mode_t mode, int pri, eio_cb cb, void *data) |
|
|
325 | |
|
|
326 | =item eio_fchown (int fd, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data) |
|
|
327 | |
|
|
328 | =item eio_dup2 (int fd, int fd2, int pri, eio_cb cb, void *data) |
|
|
329 | |
|
|
330 | These have the same semantics as the syscall of the same name, their |
|
|
331 | return value is available as C<< req->result >> later. |
|
|
332 | |
|
|
333 | =item eio_read (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data) |
|
|
334 | |
|
|
335 | =item eio_write (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data) |
|
|
336 | |
|
|
337 | These two requests are called C<read> and C<write>, but actually wrap |
|
|
338 | C<pread> and C<pwrite>. On systems that lack these calls (such as cygwin), |
|
|
339 | libeio uses lseek/read_or_write/lseek and a mutex to serialise the |
|
|
340 | requests, so all these requests run serially and do not disturb each |
|
|
341 | other. However, they still disturb the file offset while they run, so it's |
|
|
342 | not safe to call these functions concurrently with non-libeio functions on |
|
|
343 | the same fd on these systems. |
|
|
344 | |
|
|
345 | Not surprisingly, pread and pwrite are not thread-safe on Darwin (OS/X), |
|
|
346 | so it is advised not to submit multiple requests on the same fd on this |
|
|
347 | horrible pile of garbage. |
|
|
348 | |
|
|
349 | =item eio_mlockall (int flags, int pri, eio_cb cb, void *data) |
|
|
350 | |
|
|
351 | Like C<mlockall>, but the flag value constants are called |
|
|
352 | C<EIO_MCL_CURRENT> and C<EIO_MCL_FUTURE>. |
|
|
353 | |
|
|
354 | =item eio_msync (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data) |
|
|
355 | |
|
|
356 | Just like msync, except that the flag values are called C<EIO_MS_ASYNC>, |
|
|
357 | C<EIO_MS_INVALIDATE> and C<EIO_MS_SYNC>. |
|
|
358 | |
|
|
359 | =item eio_readlink (const char *path, int pri, eio_cb cb, void *data) |
|
|
360 | |
|
|
361 | If successful, the path read by C<readlink(2)> can be accessed via C<< |
|
|
362 | req->ptr2 >> and is I<NOT> null-terminated, with the length specified as |
|
|
363 | C<< req->result >>. |
|
|
364 | |
|
|
365 | if (req->result >= 0) |
|
|
366 | { |
|
|
367 | char *target = strndup ((char *)req->ptr2, req->result); |
|
|
368 | |
|
|
369 | free (target); |
|
|
370 | } |
|
|
371 | |
|
|
372 | =item eio_realpath (const char *path, int pri, eio_cb cb, void *data) |
|
|
373 | |
|
|
374 | Similar to the realpath libc function, but unlike that one, result is |
|
|
375 | C<-1> on failure and the length of the returned path in C<ptr2> (which is |
|
|
376 | not 0-terminated) - this is similar to readlink. |
|
|
377 | |
|
|
378 | =item eio_stat (const char *path, int pri, eio_cb cb, void *data) |
|
|
379 | |
|
|
380 | =item eio_lstat (const char *path, int pri, eio_cb cb, void *data) |
|
|
381 | |
|
|
382 | =item eio_fstat (int fd, int pri, eio_cb cb, void *data) |
|
|
383 | |
|
|
384 | Stats a file - if C<< req->result >> indicates success, then you can |
|
|
385 | access the C<struct stat>-like structure via C<< req->ptr2 >>: |
|
|
386 | |
|
|
387 | EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2; |
|
|
388 | |
|
|
389 | =item eio_statvfs (const char *path, int pri, eio_cb cb, void *data) |
|
|
390 | |
|
|
391 | =item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data) |
|
|
392 | |
|
|
393 | Stats a filesystem - if C<< req->result >> indicates success, then you can |
|
|
394 | access the C<struct statvfs>-like structure via C<< req->ptr2 >>: |
|
|
395 | |
|
|
396 | EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2; |
|
|
397 | |
|
|
398 | =back |
|
|
399 | |
|
|
400 | =head3 READING DIRECTORIES |
|
|
401 | |
|
|
402 | Reading directories sounds simple, but can be rather demanding, especially |
|
|
403 | if you want to do stuff such as traversing a directory hierarchy or |
|
|
404 | processing all files in a directory. Libeio can assist these complex tasks |
|
|
405 | with its C<eio_readdir> call. |
|
|
406 | |
|
|
407 | =over 4 |
|
|
408 | |
|
|
409 | =item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data) |
|
|
410 | |
|
|
411 | This is a very complex call. It basically reads through a whole directory |
|
|
412 | (via the C<opendir>, C<readdir> and C<closedir> calls) and returns either |
|
|
413 | the names or an array of C<struct eio_dirent>, depending on the C<flags> |
|
|
414 | argument. |
|
|
415 | |
|
|
416 | The C<< req->result >> indicates either the number of files found, or |
|
|
417 | C<-1> on error. On success, null-terminated names can be found as C<< req->ptr2 >>, |
|
|
418 | and C<struct eio_dirent>s, if requested by C<flags>, can be found via C<< |
|
|
419 | req->ptr1 >>. |
|
|
420 | |
|
|
421 | Here is an example that prints all the names: |
|
|
422 | |
|
|
423 | int i; |
|
|
424 | char *names = (char *)req->ptr2; |
|
|
425 | |
|
|
426 | for (i = 0; i < req->result; ++i) |
|
|
427 | { |
|
|
428 | printf ("name #%d: %s\n", i, names); |
|
|
429 | |
|
|
430 | /* move to next name */ |
|
|
431 | names += strlen (names) + 1; |
|
|
432 | } |
|
|
433 | |
|
|
434 | Pseudo-entries such as F<.> and F<..> are never returned by C<eio_readdir>. |
|
|
435 | |
|
|
436 | C<flags> can be any combination of: |
|
|
437 | |
|
|
438 | =over 4 |
|
|
439 | |
|
|
440 | =item EIO_READDIR_DENTS |
|
|
441 | |
|
|
442 | If this flag is specified, then, in addition to the names in C<ptr2>, |
|
|
443 | also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct |
|
|
444 | eio_dirent> looks like this: |
|
|
445 | |
|
|
446 | struct eio_dirent |
|
|
447 | { |
|
|
448 | int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */ |
|
|
449 | unsigned short namelen; /* size of filename without trailing 0 */ |
|
|
450 | unsigned char type; /* one of EIO_DT_* */ |
|
|
451 | signed char score; /* internal use */ |
|
|
452 | ino_t inode; /* the inode number, if available, otherwise unspecified */ |
|
|
453 | }; |
|
|
454 | |
|
|
455 | The only members you normally would access are C<nameofs>, which is the |
|
|
456 | byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>. |
|
|
457 | |
|
|
458 | C<type> can be one of: |
|
|
459 | |
|
|
460 | C<EIO_DT_UNKNOWN> - if the type is not known (very common) and you have to C<stat> |
|
|
461 | the name yourself if you need to know, |
|
|
462 | one of the "standard" POSIX file types (C<EIO_DT_REG>, C<EIO_DT_DIR>, C<EIO_DT_LNK>, |
|
|
463 | C<EIO_DT_FIFO>, C<EIO_DT_SOCK>, C<EIO_DT_CHR>, C<EIO_DT_BLK>) |
|
|
464 | or some OS-specific type (currently |
|
|
465 | C<EIO_DT_MPC> - multiplexed char device (v7+coherent), |
|
|
466 | C<EIO_DT_NAM> - xenix special named file, |
|
|
467 | C<EIO_DT_MPB> - multiplexed block device (v7+coherent), |
|
|
468 | C<EIO_DT_NWK> - HP-UX network special, |
|
|
469 | C<EIO_DT_CMP> - VxFS compressed, |
|
|
470 | C<EIO_DT_DOOR> - solaris door, or |
|
|
471 | C<EIO_DT_WHT>). |
|
|
472 | |
|
|
473 | This example prints all names and their type: |
|
|
474 | |
|
|
475 | int i; |
|
|
476 | struct eio_dirent *ents = (struct eio_dirent *)req->ptr1; |
|
|
477 | char *names = (char *)req->ptr2; |
|
|
478 | |
|
|
479 | for (i = 0; i < req->result; ++i) |
|
|
480 | { |
|
|
481 | struct eio_dirent *ent = ents + i; |
|
|
482 | char *name = names + ent->nameofs; |
|
|
483 | |
|
|
484 | printf ("name #%d: %s (type %d)\n", i, name, ent->type); |
|
|
485 | } |
|
|
486 | |
|
|
487 | =item EIO_READDIR_DIRS_FIRST |
|
|
488 | |
|
|
489 | When this flag is specified, then the names will be returned in an order |
|
|
490 | where likely directories come first, in optimal C<stat> order. This is |
|
|
491 | useful when you need to quickly find directories, or you want to find all |
|
|
492 | directories without having to stat() each entry. |
|
|
493 | |
|
|
494 | If the system returns type information in readdir, then this is used |
|
|
495 | to find directories directly. Otherwise, likely directories are names |
|
|
496 | beginning with ".", or otherwise names with no dots, of which names with |
|
|
497 | short names are tried first. |
|
|
498 | |
|
|
499 | =item EIO_READDIR_STAT_ORDER |
|
|
500 | |
|
|
501 | When this flag is specified, then the names will be returned in an order |
|
|
502 | suitable for stat()'ing each one. That is, when you plan to stat() |
|
|
503 | all files in the given directory, then the returned order will likely |
|
|
504 | be fastest. |
|
|
505 | |
|
|
506 | If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then |
|
|
507 | the likely dirs come first, resulting in a less optimal stat order. |
|
|
508 | |
|
|
509 | =item EIO_READDIR_FOUND_UNKNOWN |
|
|
510 | |
|
|
511 | This flag should not be specified when calling C<eio_readdir>. Instead, |
|
|
512 | it is being set by C<eio_readdir> (you can access the C<flags> via C<< |
|
|
513 | req->int1 >>), when any of the C<type>s found were C<EIO_DT_UNKNOWN>. The |
|
|
514 | absence of this flag therefore indicates that all C<type>s are known, |
|
|
515 | which can be used to speed up some algorithms. |
|
|
516 | |
|
|
517 | A typical use case would be to identify all subdirectories within a |
|
|
518 | directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If |
|
|
519 | then this flag is I<NOT> set, then all the entries at the beginning of the |
|
|
520 | returned array of type C<EIO_DT_DIR> are the directories. Otherwise, you |
|
|
521 | should start C<stat()>'ing the entries starting at the beginning of the |
|
|
522 | array, stopping as soon as you found all directories (the count can be |
|
|
523 | deduced by the link count of the directory). |
|
|
524 | |
|
|
525 | =back |
|
|
526 | |
|
|
527 | =back |
|
|
528 | |
|
|
529 | =head3 OS-SPECIFIC CALL WRAPPERS |
|
|
530 | |
|
|
531 | These wrap OS-specific calls (usually Linux ones), and might or might not |
|
|
532 | be emulated on other operating systems. Calls that are not emulated will |
|
|
533 | return C<-1> and set C<errorno> to C<ENOSYS>. |
|
|
534 | |
|
|
535 | =over 4 |
|
|
536 | |
|
|
537 | =item eio_sendfile (int out_fd, int in_fd, off_t in_offset, size_t length, int pri, eio_cb cb, void *data) |
|
|
538 | |
|
|
539 | Wraps the C<sendfile> syscall. The arguments follow the Linux version, but |
|
|
540 | libeio supports and will use similar calls on FreeBSD, HP/UX, Solaris and |
|
|
541 | Darwin. |
|
|
542 | |
|
|
543 | If the OS doesn't support some sendfile-like call, or the call fails, |
|
|
544 | indicating a lack of support for the given file descriptor type (for example, |
|
|
545 | Linux's sendfile might not support file to file copies), then libeio will |
|
|
546 | emulate the call in userspace, so there are almost no limitations on its |
|
|
547 | use. |
|
|
548 | |
|
|
549 | =item eio_readahead (int fd, off_t offset, size_t length, int pri, eio_cb cb, void *data) |
|
|
550 | |
|
|
551 | Calls C<readahead(2)>. If the syscall is missing, then the call is |
|
|
552 | emulated by simply reading the data (currently in 64kiB chunks). |
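To make the emulation idea concrete, here is a minimal sketch of reading and discarding a range in 64kiB chunks so that the data ends up in the OS cache. It is not libeio's actual code: C<readahead_emulated> and C<CHUNK> are hypothetical names, and using C<pread> for the reads is an assumption made for this illustration.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 65536 /* 64kiB, the chunk size mentioned in the text */

/* hypothetical helper: read [offset, offset + length) and throw it away */
static int
readahead_emulated (int fd, off_t offset, size_t length)
{
  char buf[CHUNK];

  while (length > 0)
    {
      size_t  want = length < CHUNK ? length : CHUNK;
      ssize_t got  = pread (fd, buf, want, offset);

      if (got < 0)
        return -1;          /* errno is set by pread */

      if (got == 0)
        break;              /* end of file, nothing more to read */

      assert ((size_t)got <= want);

      offset += got;
      length -= (size_t)got;
    }

  return 0;
}
```

Because the data is copied into a scratch buffer and discarded, this costs more CPU than a real C<readahead(2)>, but it has the same effect on the cache.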
|
|
553 | |
|
|
554 | =item eio_sync_file_range (int fd, off_t offset, size_t nbytes, unsigned int flags, int pri, eio_cb cb, void *data) |
|
|
555 | |
|
|
556 | Calls C<sync_file_range>. If the syscall is missing, then this is the same |
|
|
557 | as calling C<fdatasync>. |
|
|
558 | |
|
|
559 | Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>, |
|
|
560 | C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>. |
|
|
561 | |
|
|
562 | =back |
|
|
563 | |
|
|
564 | =head3 LIBEIO-SPECIFIC REQUESTS |
|
|
565 | |
|
|
566 | These requests are specific to libeio and do not correspond to any OS call. |
|
|
567 | |
|
|
568 | =over 4 |
|
|
569 | |
|
|
570 | =item eio_mtouch (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data) |
|
|
571 | |
|
|
572 | Reads (C<flags == 0>) or modifies (C<flags == EIO_MT_MODIFY>) the given |
|
|
573 | memory area, page-wise, that is, it reads (or reads and writes back) the |
|
|
574 | first octet of every page that spans the memory area. |
|
|
575 | |
|
|
576 | This can be used to page in some mmapped file, or dirty some pages. Note |
|
|
577 | that dirtying is an unlocked read-write access, so races can ensue when |
|
|
578 | some other thread modifies the data stored in that memory area. |
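The page-wise access can be pictured with the following sketch, which rounds the start address down to a page boundary and then touches the first octet of each page up to the end of the area. It is an illustration of the description above, not libeio's implementation; C<touch_pages> and its C<modify> parameter are hypothetical, and the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* hypothetical helper: returns the number of pages touched */
static size_t
touch_pages (void *addr, size_t length, int modify)
{
  size_t    page = (size_t)sysconf (_SC_PAGESIZE);
  uintptr_t cur, end;
  size_t    touched = 0;

  assert (page > 0 && (page & (page - 1)) == 0);

  if (length == 0)
    return 0;

  cur = (uintptr_t)addr & ~(uintptr_t)(page - 1); /* start of first page */
  end = (uintptr_t)addr + length;

  for (; cur < end; cur += page)
    {
      volatile unsigned char *p = (volatile unsigned char *)cur;
      unsigned char octet = *p;   /* the read pages the memory in */

      if (modify)
        *p = octet;               /* unlocked write-back dirties the page */

      ++touched;
    }

  return touched;
}
```

The unlocked read followed by a write in the C<modify> case is exactly where the race mentioned above comes from: a concurrent store to the same octet between the two accesses would be overwritten.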
|
|
579 | |
|
|
580 | =item eio_custom (void (*)(eio_req *) execute, int pri, eio_cb cb, void *data) |
|
|
581 | |
|
|
582 | Executes a custom request, i.e., a user-specified callback. |
|
|
583 | |
|
|
584 | The callback gets the C<eio_req *> as parameter and is expected to read |
|
|
585 | and modify any request-specific members. Specifically, it should set C<< |
|
|
586 | req->result >> to the result value, just like other requests. |
|
|
587 | |
|
|
588 | Here is an example that simply calls C<open>, like C<eio_open>, but it |
|
|
589 | uses the C<data> member as filename and uses a hardcoded C<O_RDONLY>. If |
|
|
590 | you want to pass more/other parameters, you either need to pass some |
|
|
591 | struct or so via C<data> or provide your own wrapper using the low-level |
|
|
592 | API. |
|
|
593 | |
|
|
594 | static int |
|
|
595 | my_open_done (eio_req *req) |
|
|
596 | { |
|
|
597 | int fd = req->result; |
|
|
598 | |
|
|
599 | return 0; |
|
|
600 | } |
|
|
601 | |
|
|
602 | static void |
|
|
603 | my_open (eio_req *req) |
|
|
604 | { |
|
|
605 | req->result = open (req->data, O_RDONLY); |
|
|
606 | } |
|
|
607 | |
|
|
608 | eio_custom (my_open, 0, my_open_done, "/etc/passwd"); |
|
|
609 | |
|
|
610 | =item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data) |
|
|
611 | |
|
|
612 | This is a request that takes C<delay> seconds to execute, but otherwise |
|
|
613 | does nothing - it simply puts one of the worker threads to sleep for this |
|
|
614 | long. |
|
|
615 | |
|
|
616 | This request can be used to artificially increase load, e.g. for debugging |
|
|
617 | or benchmarking reasons. |
|
|
618 | |
|
|
619 | =item eio_nop (int pri, eio_cb cb, void *data) |
|
|
620 | |
|
|
621 | This request does nothing, except go through the whole request cycle. This |
|
|
622 | can be used to measure latency or in some cases to simplify code, but is |
|
|
623 | not really of much use. |
|
|
624 | |
|
|
625 | =back |
|
|
626 | |
|
|
627 | =head3 GROUPING AND LIMITING REQUESTS |
|
|
628 | |
|
|
629 | There is one more rather special request, C<eio_grp>. It is a very special |
|
|
630 | aio request: Instead of doing something, it is a container for other eio |
|
|
631 | requests. |
|
|
632 | |
|
|
633 | There are two primary use cases for this: a) bundle many requests into a |
|
|
634 | single, composite, request with a definite callback and the ability to |
|
|
635 | cancel the whole request with its subrequests and b) limiting the number |
|
|
636 | of "active" requests. |
|
|
637 | |
|
|
638 | Further below you will find more discussion of these topics - first follows |
|
|
639 | the reference section detailing the request generator and other methods. |
|
|
640 | |
|
|
641 | =over 4 |
|
|
642 | |
|
|
643 | =item eio_grp (eio_cb cb, void *data) |
|
|
644 | |
|
|
645 | Creates and submits a group request. |
|
|
646 | |
|
|
647 | =back |
|
|
648 | |
|
|
649 | |
|
|
650 | |
|
|
651 | #TODO |
|
|
652 | |
|
|
653 | /*****************************************************************************/ |
|
|
654 | /* groups */ |
|
|
655 | |
|
|
656 | eio_req *eio_grp (eio_cb cb, void *data); |
|
|
657 | void eio_grp_feed (eio_req *grp, void (*feed)(eio_req *req), int limit); |
|
|
658 | void eio_grp_limit (eio_req *grp, int limit); |
|
|
659 | void eio_grp_add (eio_req *grp, eio_req *req); |
|
|
660 | void eio_grp_cancel (eio_req *grp); /* cancels all sub requests but not the group */ |
|
|
661 | |
|
|
662 | |
|
|
663 | =back |
|
|
664 | |
|
|
665 | |
|
|
666 | =head1 LOW LEVEL REQUEST API |
|
|
667 | |
|
|
668 | #TODO |
|
|
669 | |
|
|
670 | |
|
|
671 | =head1 ANATOMY AND LIFETIME OF AN EIO REQUEST |
|
|
672 | |
|
|
673 | A request is represented by a structure of type C<eio_req>. To initialise |
|
|
674 | it, clear it to all zero bytes: |
|
|
675 | |
|
|
676 | eio_req req; |
|
|
677 | |
|
|
678 | memset (&req, 0, sizeof (req)); |
|
|
679 | |
|
|
680 | A more common way to initialise a new C<eio_req> is to use C<calloc>: |
|
|
681 | |
|
|
682 | eio_req *req = calloc (1, sizeof (*req)); |
|
|
683 | |
|
|
684 | In either case, libeio neither allocates, initialises nor frees the |
|
|
685 | C<eio_req> structure for you - it merely uses it. |
|
|
686 | |
|
|
687 | zero |
|
|
688 | |
|
|
689 | #TODO |
137 | |
690 | |
138 | =head2 CONFIGURATION |
691 | =head2 CONFIGURATION |
139 | |
692 | |
140 | The functions in this section can sometimes be useful, but the default |
693 | The functions in this section can sometimes be useful, but the default |
141 | configuration will do in most cases, so you should skip this section on |
694 | configuration will do in most cases, so you should skip this section on |
… | |
… | |
185 | =item eio_set_max_idle (unsigned int nthreads) |
738 | =item eio_set_max_idle (unsigned int nthreads) |
186 | |
739 | |
187 | Libeio uses threads internally to handle most requests, and will start and stop threads on demand. |
740 | Libeio uses threads internally to handle most requests, and will start and stop threads on demand. |
188 | |
741 | |
189 | This call can be used to limit the number of idle threads (threads without |
742 | This call can be used to limit the number of idle threads (threads without |
190 | work to do): libeio will keep some threads idle in preperation for more |
743 | work to do): libeio will keep some threads idle in preparation for more |
191 | requests, but never more than C<nthreads> threads. |
744 | requests, but never more than C<nthreads> threads. |
192 | |
745 | |
193 | In addition to this, libeio will also stop threads when they are idle for |
746 | In addition to this, libeio will also stop threads when they are idle for |
194 | a few seconds, regardless of this setting. |
747 | a few seconds, regardless of this setting. |
195 | |
748 | |
… | |
… | |
214 | executed and have results, but have not been finished yet by a call to |
767 | executed and have results, but have not been finished yet by a call to |
215 | C<eio_poll>). |
768 | C<eio_poll>). |
216 | |
769 | |
217 | =back |
770 | =back |
218 | |
771 | |
219 | |
|
|
220 | =head1 ANATOMY OF AN EIO REQUEST |
|
|
221 | |
|
|
222 | #TODO |
|
|
223 | |
|
|
224 | |
|
|
225 | =head1 HIGH LEVEL REQUEST API |
|
|
226 | |
|
|
227 | #TODO |
|
|
228 | |
|
|
229 | =back |
|
|
230 | |
|
|
231 | |
|
|
232 | =head1 LOW LEVEL REQUEST API |
|
|
233 | |
|
|
234 | #TODO |
|
|
235 | |
|
|
236 | =head1 EMBEDDING |
772 | =head1 EMBEDDING |
237 | |
773 | |
238 | Libeio can be embedded directly into programs. This functionality is not |
774 | Libeio can be embedded directly into programs. This functionality is not |
239 | documented and not (yet) officially supported. |
775 | documented and not (yet) officially supported. |
240 | |
776 | |
|
|
777 | Note that, when including C<libeio.m4>, you are responsible for defining |
|
|
778 | the compilation environment (C<_LARGEFILE_SOURCE>, C<_GNU_SOURCE> etc.). |
|
|
779 | |
241 | If you ened to know how, cehck the C<IO::AIO> perl module, which does |
780 | If you need to know how, check the C<IO::AIO> perl module, which does |
242 | exactly that. |
781 | exactly that. |
|
|
782 | |
|
|
783 | |
|
|
784 | =head1 COMPILETIME CONFIGURATION |
|
|
785 | |
|
|
786 | These symbols, if used, must be defined when compiling F<eio.c>. |
|
|
787 | |
|
|
788 | =over 4 |
|
|
789 | |
|
|
790 | =item EIO_STACKSIZE |
|
|
791 | |
|
|
792 | This symbol governs the stack size for each eio thread. Libeio itself |
|
|
793 | was written to use very little stack space, but when using C<EIO_CUSTOM> |
|
|
794 | requests, you might want to increase this. |
|
|
795 | |
|
|
796 | If this symbol is undefined (the default) then libeio will use its default |
|
|
797 | stack size (C<sizeof (long) * 4096> currently). If it is defined, but |
|
|
798 | C<0>, then the default operating system stack size will be used. In all |
|
|
799 | other cases, the value must be an expression that evaluates to the desired |
|
|
800 | stack size. |
|
|
801 | |
|
|
802 | =back |
243 | |
803 | |
244 | |
804 | |
245 | =head1 PORTABILITY REQUIREMENTS |
805 | =head1 PORTABILITY REQUIREMENTS |
246 | |
806 | |
247 | In addition to a working ISO-C implementation, libeio relies on a few |
807 | In addition to a working ISO-C implementation, libeio relies on a few |