=head1 NAME

libeio - truly asynchronous POSIX I/O

=head1 SYNOPSIS

   #include <eio.h>

=head1 DESCRIPTION

The newest version of this document is also available as an html-formatted
web page you might find easier to navigate when reading it for the first
time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>.

Note that this library is a by-product of the C<IO::AIO> perl
module, and many of the subtler points regarding request lifetime
and so on are only documented in its documentation at the
moment: L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>.

=head2 FEATURES

This library provides fully asynchronous versions of most POSIX functions
dealing with I/O. Unlike most asynchronous libraries, this not only
includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and
similar functions, as well as less commonly used ones such as C<mknod>,
C<futime> or C<readlink>.

It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and
FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with
emulation elsewhere).

The goal is to enable you to write fully non-blocking programs. For
example, in a game server, you would not want to freeze for a few seconds
just because the server is running a backup and you happen to call
C<readdir>.

=head2 TIME REPRESENTATION

Libeio represents time as a single floating point number, representing the
(fractional) number of seconds since the (POSIX) epoch (somewhere near
the beginning of 1970, details are complicated, don't ask). This type is
called C<eio_tstamp>, but it is guaranteed to be of type C<double> (or
better), so you can freely use C<double> yourself.

Unlike the name component C<stamp> might indicate, it is also used for
time differences throughout libeio.

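As a minimal sketch of this representation, the following converts a
C<struct timeval> into fractional seconds and computes a time difference
in the same type. The C<sketch_tstamp> typedef and both function names are
hypothetical stand-ins chosen for illustration; they are not part of
F<eio.h>.

```c
#include <sys/time.h>

/* stand-in for eio_tstamp, which the text guarantees to be a double */
typedef double sketch_tstamp;

/* convert a struct timeval into fractional seconds since the epoch */
sketch_tstamp
tstamp_from_timeval (const struct timeval *tv)
{
  return (sketch_tstamp)tv->tv_sec + (sketch_tstamp)tv->tv_usec / 1e6;
}

/* the same type also expresses a time difference, in seconds */
sketch_tstamp
tstamp_diff (sketch_tstamp later, sketch_tstamp earlier)
{
  return later - earlier;
}
```

Because both points in time and intervals share one type, a value such as
C<0.01> can be used directly wherever a duration in seconds is expected.
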
=head2 FORK SUPPORT

Calling C<fork ()> is fully supported by this module. It is implemented in these steps:

   1. wait till all requests in "execute" state have been handled
      (basically requests that are already handed over to the kernel).
   2. fork
   3. in the parent, continue business as usual, done
   4. in the child, destroy all ready and pending requests and free the
      memory used by the worker threads. This gives you a fully empty
      libeio queue.

Note, however, that since libeio does use threads, the above guarantee
doesn't cover your libc: for example, malloc and other libc functions are
not fork-safe, so there is very little you can do after a fork. In fact,
the above might crash, and thus might change.

=head1 INITIALISATION/INTEGRATION

Before you can call any eio functions you first have to initialise the
library. The library integrates into any event loop, but can also be used
without one, including in polling mode.

You have to provide the necessary glue yourself, however.

=over 4

=item int eio_init (void (*want_poll)(void), void (*done_poll)(void))

This function initialises the library. On success it returns C<0>, on
failure it returns C<-1> and sets C<errno> appropriately.

It accepts two function pointers specifying callbacks as arguments, both of
which can be C<0>, in which case the callback isn't called.

=item want_poll callback

The C<want_poll> callback is invoked whenever libeio wants attention (i.e.
it wants to be polled by calling C<eio_poll>). It is "edge-triggered",
that is, it will only be called once when eio wants attention, until all
pending requests have been handled.

This callback is called while locks are being held, so I<you must
not call any libeio functions inside this callback>. That includes
C<eio_poll>. What you should do is notify some other thread, or wake up
your event loop, and then call C<eio_poll>.

=item done_poll callback

This callback is invoked when libeio detects that all pending requests
have been handled. It is "edge-triggered", that is, it will only be
called once after C<want_poll>. To put it differently, C<want_poll> and
C<done_poll> are invoked in pairs: after C<want_poll> you have to call
C<eio_poll ()> until either C<eio_poll> indicates that everything has been
handled or C<done_poll> has been called, which signals the same.

Note that C<eio_poll> might return after C<done_poll> and C<want_poll>
have been called again, so watch out for races in your code.

As with C<want_poll>, this callback is called while locks are being held,
so you I<must not call any libeio functions from within this callback>.

=item int eio_poll ()

This function has to be called whenever there are pending requests that
need finishing. You usually call this after C<want_poll> has indicated
that you should do so, but you can also call this function regularly to
poll for new results.

If any request invocation returns a non-zero value, then C<eio_poll ()>
immediately returns with that value as return value.

Otherwise, if all requests could be handled, it returns C<0>. If for some
reason not all requests have been handled, i.e. some are still pending, it
returns C<-1>.

=back

For libev, you would typically use an C<ev_async> watcher: the
C<want_poll> callback would invoke C<ev_async_send> to wake up the event
loop. Inside the callback set for the watcher, one would call C<eio_poll
()> (followed by C<ev_async_send> again if C<eio_poll> indicates that not
all requests have been handled yet). The race is taken care of because
libev resets/rearms the async watcher before calling your callback,
and therefore, before calling C<eio_poll>. This might result in (some)
spurious wake-ups, but is generally harmless.

For most other event loops, you would typically use a pipe - the event
loop should be told to wait for read readiness on the read end. In
C<want_poll> you would write a single byte, in C<done_poll> you would try
to read that byte, and in the callback for the read end, you would call
C<eio_poll>. The race is avoided here because the event loop should invoke
your callback again and again until the byte has been read (as the pipe
read callback does not read it, only C<done_poll>).


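The pipe approach described above can be sketched with plain POSIX calls.
This is only an illustration of the glue, under the assumption of a single
pipe shared by both callbacks; the names C<wake_pipe>, C<wake_pipe_init>
and C<wake_pipe_readable> are hypothetical, and the last one merely stands
in for whatever your event loop uses to watch the read end.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };

/* create the pipe; both ends non-blocking so the callbacks never stall */
int
wake_pipe_init (void)
{
  if (pipe (wake_pipe) < 0)
    return -1;

  for (int i = 0; i < 2; ++i)
    {
      int fl = fcntl (wake_pipe[i], F_GETFL);

      if (fl < 0 || fcntl (wake_pipe[i], F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    }

  return 0;
}

/* want_poll: runs with locks held, so it only writes a single byte */
void
want_poll (void)
{
  char dummy = 0;

  if (write (wake_pipe[1], &dummy, 1) < 0)
    {
      /* nothing useful can be done here, so the error is ignored */
    }
}

/* done_poll: everything has been handled, so remove the byte again */
void
done_poll (void)
{
  char dummy;

  if (read (wake_pipe[0], &dummy, 1) < 0)
    {
      /* no byte to read; ignored for the same reason */
    }
}

/* stand-in for the event loop: is the read end readable right now? */
int
wake_pipe_readable (void)
{
  struct pollfd pfd = { .fd = wake_pipe[0], .events = POLLIN };

  return poll (&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) != 0;
}
```

With libeio, the two callbacks would be handed to C<eio_init (want_poll,
done_poll)>, and the event loop's handler for the read end would call
C<eio_poll ()> without reading from the pipe itself.
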
=head1 HIGH LEVEL REQUEST API

Libeio has both a high-level API, which consists of calling a request
function with a callback to be called on completion, and a low-level API
where you fill out request structures and submit them.

This section describes the high-level API.

=head2 REQUEST SUBMISSION AND RESULT PROCESSING

You submit a request by calling the relevant C<eio_TYPE> function with the
required parameters, a callback of type C<int (*eio_cb)(eio_req *req)>
(called C<eio_cb> below) and a freely usable C<void *data> argument.

The return value will either be the C<eio_req *> on success, or C<0> on
failure.

The callback will be called with an C<eio_req *> which contains the
results of the request. The members you can access inside that structure
vary from request to request, except for:

=over 4

=item C<ssize_t result>

This contains the result value from the call (usually the same as the
syscall of the same name).

=item C<int errorno>

This contains the value of C<errno> after the call.

=item C<void *data>

The C<void *data> member simply stores the value of the C<data> argument.

=back

The return value of the callback is normally C<0>, which tells libeio to
continue normally. If a callback returns a nonzero value, libeio will
stop processing results (in C<eio_poll>) and will return the value to its
caller.

Memory areas passed to libeio must stay valid as long as a request
executes, with the exception of paths, which are being copied
internally. Any memory libeio itself allocates will be freed after the
finish callback has been called. If you want to manage all memory passed
to libeio yourself you can use the low-level API.

For example, to open a file, you could do this:

   static int
   file_open_done (eio_req *req)
   {
     if (req->result < 0)
       {
         /* open() returned -1 */
         errno = req->errorno;
         perror ("open");
       }
     else
       {
         int fd = req->result;
         /* now we have the new fd in fd */
       }

     return 0;
   }

   /* the first three arguments are passed to open(2) */
   /* the remaining are priority, callback and data */
   if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
     abort (); /* something went wrong, we will all die!!! */

Note that you additionally need to call C<eio_poll> when the C<want_poll>
callback indicates that requests are ready to be processed.

=head2 AVAILABLE REQUESTS

The following request functions are available. I<All> of them return the
C<eio_req *> on success and C<0> on failure, and I<all> of them have the
same three trailing arguments: C<pri>, C<cb> and C<data>. The C<cb> is
mandatory, but in most cases, you pass in C<0> as C<pri> and C<0> or some
custom data value as C<data>.

=head3 POSIX API WRAPPERS

These requests simply wrap the POSIX call of the same name, with the same
arguments:

=over 4

=item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_utime (const char *path, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)

=item eio_truncate (const char *path, off_t offset, int pri, eio_cb cb, void *data)

=item eio_chown (const char *path, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)

=item eio_chmod (const char *path, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_mkdir (const char *path, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_rmdir (const char *path, int pri, eio_cb cb, void *data)

=item eio_unlink (const char *path, int pri, eio_cb cb, void *data)

=item eio_readlink (const char *path, int pri, eio_cb cb, void *data) /* result=ptr2 allocated dynamically */

=item eio_stat (const char *path, int pri, eio_cb cb, void *data) /* stat buffer=ptr2 allocated dynamically */

=item eio_lstat (const char *path, int pri, eio_cb cb, void *data) /* stat buffer=ptr2 allocated dynamically */

=item eio_statvfs (const char *path, int pri, eio_cb cb, void *data) /* stat buffer=ptr2 allocated dynamically */

=item eio_mknod (const char *path, mode_t mode, dev_t dev, int pri, eio_cb cb, void *data)

=item eio_link (const char *path, const char *new_path, int pri, eio_cb cb, void *data)

=item eio_symlink (const char *path, const char *new_path, int pri, eio_cb cb, void *data)

=item eio_rename (const char *path, const char *new_path, int pri, eio_cb cb, void *data)

=item eio_msync (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)

=item eio_mlock (void *addr, size_t length, int pri, eio_cb cb, void *data)

=item eio_mlockall (int flags, int pri, eio_cb cb, void *data)

=item eio_close (int fd, int pri, eio_cb cb, void *data)

=item eio_sync (int pri, eio_cb cb, void *data)

=item eio_fsync (int fd, int pri, eio_cb cb, void *data)

=item eio_fdatasync (int fd, int pri, eio_cb cb, void *data)

=item eio_futime (int fd, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)

=item eio_ftruncate (int fd, off_t offset, int pri, eio_cb cb, void *data)

=item eio_fchmod (int fd, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_fchown (int fd, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)

=item eio_dup2 (int fd, int fd2, int pri, eio_cb cb, void *data)

These have the same semantics as the syscall of the same name, their
return value is available as C<< req->result >> later.

=item eio_read (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)

=item eio_write (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)

These two requests are called C<read> and C<write>, but actually wrap
C<pread> and C<pwrite>. On systems that lack these calls (such as cygwin),
libeio uses lseek/read_or_write/lseek and a mutex to serialise the
requests, so all these requests run serially and do not disturb each
other. However, they still disturb the file offset while they run, so it's
not safe to call these functions concurrently with non-libeio functions on
the same fd on these systems.

Not surprisingly, pread and pwrite are not thread-safe on Darwin (OS/X),
so it is advised not to submit multiple requests on the same fd on this
horrible pile of garbage.

=item eio_fstat (int fd, int pri, eio_cb cb, void *data)

Stats a file - if C<< req->result >> indicates success, then you can
access the C<struct stat>-like structure via C<< req->ptr2 >>:

   EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;

=item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data) /* stat buffer=ptr2 allocated dynamically */

Stats a filesystem - if C<< req->result >> indicates success, then you can
access the C<struct statvfs>-like structure via C<< req->ptr2 >>:

   EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;

=back

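The lseek/read/lseek emulation mentioned for C<eio_read> on systems
without C<pread> can be sketched roughly as follows. This illustrates the
technique only and is not taken from libeio's source; the names
C<emulated_pread> and C<preadwrite_lock> are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* one global lock serialises all emulated positional reads */
static pthread_mutex_t preadwrite_lock = PTHREAD_MUTEX_INITIALIZER;

/* read count bytes at offset, then restore the previous file offset */
ssize_t
emulated_pread (int fd, void *buf, size_t count, off_t offset)
{
  ssize_t res;
  off_t saved;

  pthread_mutex_lock (&preadwrite_lock);

  saved = lseek (fd, 0, SEEK_CUR);

  if (saved == (off_t)-1 || lseek (fd, offset, SEEK_SET) == (off_t)-1)
    res = -1;
  else
    {
      int saved_errno;

      res = read (fd, buf, count);

      /* restoring the offset must not clobber errno from read */
      saved_errno = errno;
      lseek (fd, saved, SEEK_SET);
      errno = saved_errno;
    }

  pthread_mutex_unlock (&preadwrite_lock);

  return res;
}
```

The mutex keeps such calls from disturbing each other, but between the two
C<lseek> calls the file offset is visibly changed, which is why code
outside the lock that uses the same fd concurrently is not safe.
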
=head3 READING DIRECTORIES

Reading directories sounds simple, but can be rather demanding, especially
if you want to do stuff such as traversing a directory hierarchy or
processing all files in a directory. Libeio can assist with these complex
tasks with its C<eio_readdir> call.

=over 4

=item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data)

This is a very complex call. It basically reads through a whole directory
(via the C<opendir>, C<readdir> and C<closedir> calls) and returns either
the names or an array of C<struct eio_dirent>, depending on the C<flags>
argument.

The C<< req->result >> indicates either the number of files found, or
C<-1> on error. On success, zero-terminated names can be found via
C<< req->ptr2 >>, and an array of C<struct eio_dirent>, if requested by
C<flags>, can be found via C<< req->ptr1 >>.

Here is an example that prints all the names:

   int i;
   char *names = (char *)req->ptr2;

   for (i = 0; i < req->result; ++i)
     {
       printf ("name #%d: %s\n", i, names);

       /* move to next name */
       names += strlen (names) + 1;
     }

Pseudo-entries such as F<.> and F<..> are never returned by C<eio_readdir>.

C<flags> can be any combination of:

=over 4

=item EIO_READDIR_DENTS

If this flag is specified, then, in addition to the names in C<ptr2>,
an array of C<struct eio_dirent> is also returned, in C<ptr1>. A C<struct
eio_dirent> looks like this:

   struct eio_dirent
   {
     int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */
     unsigned short namelen; /* size of filename without trailing 0 */
     unsigned char type; /* one of EIO_DT_* */
     signed char score; /* internal use */
     ino_t inode; /* the inode number, if available, otherwise unspecified */
   };

The only members you normally would access are C<nameofs>, which is the
byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>.

C<type> can be one of:

C<EIO_DT_UNKNOWN> - if the type is not known (very common) and you have to C<stat>
the name yourself if you need to know,
one of the "standard" POSIX file types (C<EIO_DT_REG>, C<EIO_DT_DIR>, C<EIO_DT_LNK>,
C<EIO_DT_FIFO>, C<EIO_DT_SOCK>, C<EIO_DT_CHR>, C<EIO_DT_BLK>)
or some OS-specific type (currently
C<EIO_DT_MPC> - multiplexed char device (v7+coherent),
C<EIO_DT_NAM> - xenix special named file,
C<EIO_DT_MPB> - multiplexed block device (v7+coherent),
C<EIO_DT_NWK> - HP-UX network special,
C<EIO_DT_CMP> - VxFS compressed,
C<EIO_DT_DOOR> - solaris door, or
C<EIO_DT_WHT>).

This example prints all names and their type:

   int i;
   struct eio_dirent *ents = (struct eio_dirent *)req->ptr1;
   char *names = (char *)req->ptr2;

   for (i = 0; i < req->result; ++i)
     {
       struct eio_dirent *ent = ents + i;
       char *name = names + ent->nameofs;

       printf ("name #%d: %s (type %d)\n", i, name, ent->type);
     }

=item EIO_READDIR_DIRS_FIRST

When this flag is specified, then the names will be returned in an order
where likely directories come first, in optimal C<stat> order. This is
useful when you need to quickly find directories, or you want to find all
directories without having to stat() each entry.

If the system returns type information in readdir, then this is used
to find directories directly. Otherwise, likely directories are names
beginning with ".", or otherwise names with no dots, of which short
names are tried first.

=item EIO_READDIR_STAT_ORDER

When this flag is specified, then the names will be returned in an order
suitable for stat()'ing each one. That is, when you plan to stat()
all files in the given directory, then the returned order will likely
be fastest.

If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then
the likely dirs come first, resulting in a less optimal stat order.

=item EIO_READDIR_FOUND_UNKNOWN

This flag should not be specified when calling C<eio_readdir>. Instead,
it is being set by C<eio_readdir> (you can access the C<flags> via C<<
req->int1 >>) when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The
absence of this flag therefore indicates that all C<type>'s are known,
which can be used to speed up some algorithms.

A typical use case would be to identify all subdirectories within a
directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If
this flag is then I<NOT> set, then all the entries at the beginning of the
returned array of type C<EIO_DT_DIR> are the directories. Otherwise, you
should start C<stat()>'ing the entries starting at the beginning of the
array, stopping as soon as you found all directories (the count can be
deduced by the link count of the directory).

=back

=back

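The subdirectory use case described for C<EIO_READDIR_FOUND_UNKNOWN> can
be sketched as below. To stay self-contained, the block does not include
F<eio.h>: the C<sketch_dirent> structure copies only the documented
members, and the C<SKETCH_*> constants are stand-ins with arbitrarily
chosen values, so treat all of these names and numbers as assumptions.

```c
/* stand-ins for EIO_DT_* and the result flag; the values are arbitrary */
enum { SKETCH_DT_UNKNOWN = 0, SKETCH_DT_DIR = 4, SKETCH_DT_REG = 8 };
#define SKETCH_READDIR_FOUND_UNKNOWN 0x1000

/* reduced copy of the documented struct eio_dirent members */
struct sketch_dirent
{
  int nameofs;
  unsigned short namelen;
  unsigned char type;
};

/*
 * Returns the number of directories at the start of a DIRS_FIRST result,
 * or -1 when the result flags say some types are unknown, in which case
 * the caller has to fall back to stat()'ing the entries.
 */
int
count_leading_dirs (const struct sketch_dirent *ents, int count, int result_flags)
{
  int n = 0;

  if (result_flags & SKETCH_READDIR_FOUND_UNKNOWN)
    return -1;

  while (n < count && ents[n].type == SKETCH_DT_DIR)
    ++n;

  return n;
}
```

In a real callback, C<ents> would come from C<< req->ptr1 >>, C<count> from
C<< req->result >> and C<result_flags> from C<< req->int1 >>.
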
=head3 OS-SPECIFIC CALL WRAPPERS

These wrap OS-specific calls (usually Linux ones), and might or might not
be emulated on other operating systems. Calls that are not emulated will
return C<-1> and set C<errno> to C<ENOSYS>.

=over 4

=item eio_sendfile (int out_fd, int in_fd, off_t in_offset, size_t length, int pri, eio_cb cb, void *data)

Wraps the C<sendfile> syscall. The arguments follow the Linux version, but
libeio supports and will use similar calls on FreeBSD, HP-UX, Solaris and
Darwin.

If the OS doesn't support some sendfile-like call, or the call fails,
indicating lack of support for the given file descriptor type (for example,
Linux's sendfile might not support file to file copies), then libeio will
emulate the call in userspace, so there are almost no limitations on its
use.

=item eio_readahead (int fd, off_t offset, size_t length, int pri, eio_cb cb, void *data)

Calls C<readahead(2)>. If the syscall is missing, then the call is
emulated by simply reading the data (currently in 64KiB chunks).

=item eio_sync_file_range (int fd, off_t offset, size_t nbytes, unsigned int flags, int pri, eio_cb cb, void *data)

Calls C<sync_file_range>. If the syscall is missing, then this is the same
as calling C<fdatasync>.

=back

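The C<readahead> emulation described above, which simply reads the data in
chunks and throws it away, might look roughly like this. It is a sketch of
the idea, not libeio's code: C<emulated_readahead> and C<SKETCH_CHUNK> are
hypothetical names, and C<pread> is assumed to be available.

```c
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_CHUNK (64 * 1024)

/* pull length bytes starting at offset into the page cache; 0 or -1 */
int
emulated_readahead (int fd, off_t offset, size_t length)
{
  char scratch[SKETCH_CHUNK]; /* contents are discarded */

  while (length > 0)
    {
      size_t want = length < SKETCH_CHUNK ? length : SKETCH_CHUNK;
      ssize_t got = pread (fd, scratch, want, offset);

      if (got < 0)
        return -1;

      if (got == 0)
        break; /* end of file reached, nothing more to read */

      offset += got;
      length -= (size_t)got;
    }

  return 0;
}
```

Using C<pread> leaves the file offset untouched, so the emulation has no
visible effect apart from warming the cache.
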
=head3 LIBEIO-SPECIFIC REQUESTS

These requests are specific to libeio and do not correspond to any OS call.

=over 4

=item eio_mtouch (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)

=item eio_custom (void (*execute)(eio_req *), int pri, eio_cb cb, void *data)

Executes a custom request, i.e., a user-specified callback.

The callback gets the C<eio_req *> as parameter and is expected to read
and modify any request-specific members. Specifically, it should set C<<
req->result >> to the result value, just like other requests.

Here is an example that simply calls C<open>, like C<eio_open>, but it
uses the C<data> member as filename and uses a hardcoded C<O_RDONLY>. If
you want to pass more/other parameters, you either need to pass some
struct or so via C<data> or provide your own wrapper using the low-level
API.

   static int
   my_open_done (eio_req *req)
   {
     int fd = req->result;

     return 0;
   }

   static void
   my_open (eio_req *req)
   {
     req->result = open (req->data, O_RDONLY);
   }

   eio_custom (my_open, 0, my_open_done, "/etc/passwd");

=item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)

This is a request that takes C<delay> seconds to execute, but otherwise
does nothing - it simply puts one of the worker threads to sleep for this
long.

This request can be used to artificially increase load, e.g. for debugging
or benchmarking reasons.

=item eio_nop (int pri, eio_cb cb, void *data)

This request does nothing, except go through the whole request cycle. This
can be used to measure latency or in some cases to simplify code, but is
not really of much use.

=back

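The C<eio_custom> description suggests passing a struct via C<data> when a
custom request needs more parameters. A minimal sketch of that idea
follows. It deliberately avoids F<eio.h>: C<struct sketch_req> is a
stand-in carrying only the three members documented for every request, and
C<struct open_args> and C<my_open_execute> are hypothetical names. Storing
C<errno> inside the execute function is likewise an assumption made so the
sketch works without the library around it.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* stand-in with only the members documented for every request */
struct sketch_req
{
  ssize_t result;
  int errorno;
  void *data;
};

/* hypothetical parameter block handed over through the data pointer */
struct open_args
{
  const char *path;
  int flags;
  mode_t mode;
};

/* execute function: runs open(2) with the parameters found in req->data */
void
my_open_execute (struct sketch_req *req)
{
  const struct open_args *args = req->data;

  req->result = open (args->path, args->flags, args->mode);
  req->errorno = errno; /* only meaningful when result is -1 */
}
```

The parameter block has to stay valid until the request has executed,
in line with the rule that memory passed to libeio must remain valid
while a request runs.
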
=head3 GROUPING AND LIMITING REQUESTS

#TODO

   /*****************************************************************************/
   /* groups */

   eio_req *eio_grp (eio_cb cb, void *data);
   void eio_grp_feed (eio_req *grp, void (*feed)(eio_req *req), int limit);
   void eio_grp_limit (eio_req *grp, int limit);
   void eio_grp_add (eio_req *grp, eio_req *req);
   void eio_grp_cancel (eio_req *grp); /* cancels all sub requests but not the group */


=back


=head1 LOW LEVEL REQUEST API

#TODO


=head1 ANATOMY AND LIFETIME OF AN EIO REQUEST

A request is represented by a structure of type C<eio_req>. To initialise
it, clear it to all zero bytes:

   eio_req req;

   memset (&req, 0, sizeof (req));

A more common way to initialise a new C<eio_req> is to use C<calloc>:

   eio_req *req = calloc (1, sizeof (*req));

In either case, libeio neither allocates, initialises nor frees the
C<eio_req> structure for you - it merely uses it.

zero

#TODO

=head2 CONFIGURATION

The functions in this section can sometimes be useful, but the default
configuration will do in most cases, so you should skip this section on
first reading.

=over 4

=item eio_set_max_poll_time (eio_tstamp nseconds)

This causes C<eio_poll ()> to return after it has detected that it was
running for C<nseconds> seconds or longer (this number can be fractional).

This can be used to limit the amount of time spent handling eio requests,
for example, in interactive programs, you might want to limit this time to
C<0.01> seconds or so.

Note that:

a) libeio doesn't know how long your request callbacks take, so the time
spent in C<eio_poll> is up to one callback invocation longer than this
interval.

b) this is implemented by calling C<gettimeofday> after each request,
which can be costly.

c) at least one request will be handled.

=item eio_set_max_poll_reqs (unsigned int nreqs)

When C<nreqs> is non-zero, then C<eio_poll> will not handle more than
C<nreqs> requests per invocation. This is a less costly way to limit the
amount of work done by C<eio_poll> than setting a time limit.

If you know your callbacks are generally fast, you could use this to
encourage interactiveness in your programs by setting it to C<10>, C<100>
or even C<1000>.

=item eio_set_min_parallel (unsigned int nthreads)

Make sure libeio can handle at least this many requests in parallel. It
might be able to handle more.

=item eio_set_max_parallel (unsigned int nthreads)

Set the maximum number of threads that libeio will spawn.

=item eio_set_max_idle (unsigned int nthreads)

Libeio uses threads internally to handle most requests, and will start and stop threads on demand.

This call can be used to limit the number of idle threads (threads without
work to do): libeio will keep some threads idle in preparation for more
requests, but never more than C<nthreads> threads.

In addition to this, libeio will also stop threads when they are idle for
a few seconds, regardless of this setting.

=item unsigned int eio_nthreads ()

Return the number of worker threads currently running.

=item unsigned int eio_nreqs ()

Return the number of requests currently handled by libeio. This is the
total number of requests that have been submitted to libeio, but not yet
destroyed.

=item unsigned int eio_nready ()

Returns the number of ready requests, i.e. requests that have been
submitted but have not yet entered the execution phase.

=item unsigned int eio_npending ()

Returns the number of pending requests, i.e. requests that have been
executed and have results, but have not been finished yet by a call to
C<eio_poll>.

=back

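To make the interplay between the C<eio_poll> return values and the
C<eio_set_max_poll_reqs> limit more concrete, here is a toy model of the
documented contract. It is in no way libeio's implementation: every name
starting with C<toy_> is hypothetical, and the queue is a plain array of
already finished requests.

```c
#include <stddef.h>

/* stand-in for eio_cb: returns 0 to continue, nonzero to stop polling */
typedef int (*toy_cb) (void *data);

struct toy_req
{
  toy_cb cb;
  void *data;
};

/* finished requests awaiting their callback; next is the first unhandled */
struct toy_queue
{
  struct toy_req *reqs;
  size_t count;
  size_t next;
};

/*
 * Models the documented return values: a nonzero callback result is
 * returned immediately, 0 means everything was handled, and -1 means
 * requests remain because the max_reqs limit (0 = unlimited) was hit.
 */
int
toy_poll (struct toy_queue *q, unsigned int max_reqs)
{
  unsigned int handled = 0;

  while (q->next < q->count)
    {
      struct toy_req *req;
      int res;

      if (max_reqs && handled >= max_reqs)
        return -1;

      req = &q->reqs[q->next++];
      res = req->cb (req->data);
      ++handled;

      if (res)
        return res;
    }

  return 0;
}

/* two small callbacks for trying the model out */
int
toy_count_cb (void *data)
{
  ++*(int *)data;
  return 0;
}

int
toy_stop_cb (void *data)
{
  (void)data;
  return 42;
}
```

A caller would keep invoking the poll function while it returns C<-1>, and
treat any other nonzero value as a message from one of its own callbacks.
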
=head1 EMBEDDING

Libeio can be embedded directly into programs. This functionality is not
documented and not (yet) officially supported.

Note that, when including C<libeio.m4>, you are responsible for defining
the compilation environment (C<_LARGEFILE_SOURCE>, C<_GNU_SOURCE> etc.).

If you need to know how, check the C<IO::AIO> perl module, which does
exactly that.


=head1 COMPILETIME CONFIGURATION

These symbols, if used, must be defined when compiling F<eio.c>.

=over 4

=item EIO_STACKSIZE

This symbol governs the stack size for each eio thread. Libeio itself
was written to use very little stack space, but when using C<EIO_CUSTOM>
requests, you might want to increase this.

If this symbol is undefined (the default) then libeio will use its default
stack size (C<sizeof (long) * 4096> currently). If it is defined, but
C<0>, then the default operating system stack size will be used. In all
other cases, the value must be an expression that evaluates to the desired
stack size.

=back


=head1 PORTABILITY REQUIREMENTS

In addition to a working ISO-C implementation, libeio relies on a few
additional extensions:

=over 4

=item POSIX threads

To be portable, this module uses threads; specifically, the POSIX threads
library must be available (and working, which partially excludes many xBSD
systems, where C<fork ()> is buggy).

=item POSIX-compatible filesystem API

This is actually a harder portability requirement: The libeio API is quite
demanding regarding POSIX API calls (symlinks, user/group management
etc.).

=item C<double> must hold a time value in seconds with enough accuracy

The type C<double> is used to represent timestamps. It is required to
have at least 51 bits of mantissa (and 9 bits of exponent), which is good
enough for timestamps at least into the year 4000. This requirement is
fulfilled by implementations implementing IEEE 754 (basically all existing
ones).

=back

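The C<double> requirement can be checked with the constants from
F<float.h>. The following sketch tests the mantissa size and also computes
the smallest representable step for a timestamp of a given magnitude; both
function names are hypothetical, and the step calculation assumes a
radix-2 C<double> and a positive, finite argument.

```c
#include <float.h>

/* 1 if double has the required 51 or more binary mantissa bits */
int
double_is_suitable_tstamp (void)
{
  return FLT_RADIX == 2 && DBL_MANT_DIG >= 51;
}

/* spacing, in seconds, between adjacent doubles near t (t > 0, finite) */
double
tstamp_step (double t)
{
  double step = 1.;

  /* find the power of two with step <= t < 2 * step */
  while (step > t)
    step /= 2.;

  while (step * 2. <= t)
    step *= 2.;

  /* the lowest mantissa bit sits DBL_MANT_DIG - 1 binary places below */
  for (int i = 1; i < DBL_MANT_DIG; ++i)
    step /= 2.;

  return step;
}
```

On an IEEE 754 system, where C<DBL_MANT_DIG> is 53, a timestamp of
2147483648 seconds (early 2038) yields a step of 2**-21 seconds, which is
below one microsecond.
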
If you know of other additional requirements drop me a note.


=head1 AUTHOR

Marc Lehmann <libeio@schmorp.de>.
