--- IO-AIO/README 2011/03/27 10:26:08 1.46
+++ IO-AIO/README 2011/07/18 03:09:06 1.49
@@ -170,6 +170,7 @@
        aio_link $srcpath, $dstpath, $callback->($status)
        aio_symlink $srcpath, $dstpath, $callback->($status)
        aio_readlink $path, $callback->($link)
+       aio_realpath $path, $callback->($path)
        aio_rename $srcpath, $dstpath, $callback->($status)
        aio_mkdir $pathname, $mode, $callback->($status)
        aio_rmdir $pathname, $callback->($status)
@@ -308,6 +309,15 @@
               }
            };
 
+        In addition to all the common open modes/flags ("O_RDONLY",
+        "O_WRONLY", "O_RDWR", "O_CREAT", "O_TRUNC", "O_EXCL" and
+        "O_APPEND"), the following POSIX and non-POSIX constants are
+        available (missing ones on your system are, as usual, 0):
+
+        "O_ASYNC", "O_DIRECT", "O_NOATIME", "O_CLOEXEC", "O_NOCTTY",
+        "O_NOFOLLOW", "O_NONBLOCK", "O_EXEC", "O_SEARCH", "O_DIRECTORY",
+        "O_DSYNC", "O_RSYNC", "O_SYNC" and "O_TTY_INIT".
+
     aio_close $fh, $callback->($status)
         Asynchronously close a file and call the callback with the result
         code.
@@ -360,13 +370,15 @@
         reading at byte offset $in_offset, and starts writing at the current
         file offset of $out_fh. Because of that, it is not safe to issue more
         than one "aio_sendfile" per $out_fh, as they will interfere
-        with each other.
+        with each other. The same $in_fh works fine though, as this function
+        does not move or use the file offset of $in_fh.
 
         Please note that "aio_sendfile" can read more bytes from $in_fh than
-        are written, and there is no way to find out how many bytes have
-        been read from "aio_sendfile" alone, as "aio_sendfile" only provides
-        the number of bytes written to $out_fh. Only if the result value
-        equals $length one can assume that $length bytes have been read.
+        are written, and there is no way to find out how many more bytes
+        have been read from "aio_sendfile" alone, as "aio_sendfile" only
+        provides the number of bytes written to $out_fh. Only if the result
+        value equals $length one can assume that $length bytes have been
+        read.
 
         Unlike with other "aio_" functions, it makes a lot of sense to use
         "aio_sendfile" on non-blocking sockets, as long as one end
@@ -376,17 +388,25 @@
         some data with readahead, then fails to write all data, and when the
         socket is ready the next time, the data in the cache is already
         lost, forcing "aio_sendfile" to again hit the disk. Explicit
-        "aio_read" + "aio_write" let's you control resource usage much
-        better.
+        "aio_read" + "aio_write" lets you better control resource usage.
 
-        This call tries to make use of a native "sendfile" syscall to
+        This call tries to make use of a native "sendfile"-like syscall to
         provide zero-copy operation. For this to work, $out_fh should refer
         to a socket, and $in_fh should refer to an mmap'able file.
 
         If a native sendfile cannot be found or it fails with "ENOSYS",
-        "ENOTSUP", "EOPNOTSUPP", "EAFNOSUPPORT", "EPROTOTYPE" or "ENOTSOCK",
-        it will be emulated, so you can call "aio_sendfile" on any type of
-        filehandle regardless of the limitations of the operating system.
+        "EINVAL", "ENOTSUP", "EOPNOTSUPP", "EAFNOSUPPORT", "EPROTOTYPE" or
+        "ENOTSOCK", it will be emulated, so you can call "aio_sendfile" on
+        any type of filehandle regardless of the limitations of the
+        operating system.
+
+        As native sendfile syscalls (as practically any non-POSIX interface
+        hacked together in a hurry to improve benchmark numbers) tend to be
+        rather buggy on many systems, this implementation tries to work
+        around some known bugs in Linux and FreeBSD kernels (probably
+        others, too), but that might fail, so you really really should check
+        the return value of "aio_sendfile" - fewer bytes than expected might
+        have been transferred.
 
     aio_readahead $fh,$offset,$length, $callback->($retval)
         "aio_readahead" populates the page cache with data from a file so
@@ -540,6 +560,14 @@
         the callback. If an error occurs, nothing or undef gets passed to
         the callback.
 
+    aio_realpath $path, $callback->($path)
+        Asynchronously make the path absolute and resolve any symlinks in
+        $path. The resulting path only consists of directories (same as
+        Cwd::realpath).
+
+        This request can be used to get the absolute path of the current
+        working directory by passing it a path of . (a single dot).
+
     aio_rename $srcpath, $dstpath, $callback->($status)
         Asynchronously rename the object at $srcpath to $dstpath, just as
         rename(2) and call the callback with the result code.
@@ -571,9 +599,9 @@
         modified):
 
         IO::AIO::READDIR_DENTS
-            When this flag is off, then the callback gets an arrayref with
-            of names only (as with "aio_readdir"), otherwise it gets an
-            arrayref with "[$name, $type, $inode]" arrayrefs, each
+            When this flag is off, then the callback gets an arrayref
+            consisting of names only (as with "aio_readdir"), otherwise it
+            gets an arrayref with "[$name, $type, $inode]" arrayrefs, each
             describing a single directory entry in more detail.
 
             $name is the name of the entry.
@@ -596,14 +624,15 @@
 
         IO::AIO::READDIR_DIRS_FIRST
             When this flag is set, then the names will be returned in an
-            order where likely directories come first. This is useful when
-            you need to quickly find directories, or you want to find all
-            directories while avoiding to stat() each entry.
+            order where likely directories come first, in optimal stat
+            order. This is useful when you need to quickly find directories,
+            or you want to find all directories without having to stat()
+            each entry.
 
             If the system returns type information in readdir, then this is
             used to find directories directly. Otherwise, likely directories
-            are files beginning with ".", or otherwise files with no dots,
-            of which files with short names are tried first.
+            are names beginning with ".", or otherwise names with no dots,
+            of which short names are tried first.
 
         IO::AIO::READDIR_STAT_ORDER
             When this flag is set, then the names will be returned in an
@@ -1024,16 +1053,23 @@
 
     IO::AIO::poll_cb
         Process some outstanding events on the result pipe. You have to call
-        this regularly. Returns 0 if all events could be processed, or -1 if
-        it returned earlier for whatever reason. Returns immediately when no
-        events are outstanding. The amount of events processed depends on
-        the settings of "IO::AIO::max_poll_req" and
-        "IO::AIO::max_poll_time".
+        this regularly. Returns 0 if all events could be processed (or there
+        were no events to process), or -1 if it returned earlier for
+        whatever reason. Returns immediately when no events are outstanding.
+        The number of events processed depends on the settings of
+        "IO::AIO::max_poll_req" and "IO::AIO::max_poll_time".
 
         If not all requests were processed for whatever reason, the
         filehandle will still be ready when "poll_cb" returns, so normally
         you don't have to do anything special to have it called later.
 
+        Apart from calling "IO::AIO::poll_cb" when the event filehandle
+        becomes ready, it can be beneficial to call this function from loops
+        which submit a lot of requests, to make sure the results get
+        processed when they become available and not just when the loop is
+        finished and the event loop takes over again. This function returns
+        very fast when there are no outstanding requests.
+
         Example: Install an Event watcher that automatically calls
         IO::AIO::poll_cb with high priority (more examples can be found in
         the SYNOPSIS section, at the top of this document):
@@ -1155,22 +1191,39 @@
         threads are allowed to exit. SEe "IO::AIO::max_idle".
 
     IO::AIO::max_outstanding $maxreqs
+        Sets the maximum number of outstanding requests to $maxreqs. If you do
+        queue up more than this number of requests, the next call to
+        "IO::AIO::poll_cb" (and other functions calling "poll_cb", such as
+        "IO::AIO::flush" or "IO::AIO::poll") will block until the limit is
+        no longer exceeded.
+
+        In other words, this setting does not enforce a queue limit, but can
+        be used to make poll functions block if the limit is exceeded.
+
         This is a very bad function to use in interactive programs because
         it blocks, and a bad way to reduce concurrency because it is
         inexact: Better use an "aio_group" together with a feed callback.
 
-        Sets the maximum number of outstanding requests to $nreqs. If you do
-        queue up more than this number of requests, the next call to the
-        "poll_cb" (and "poll_some" and other functions calling "poll_cb")
-        function will block until the limit is no longer exceeded.
-
-        The default value is very large, so there is no practical limit on
-        the number of outstanding requests.
-
-        You can still queue as many requests as you want. Therefore,
-        "max_outstanding" is mainly useful in simple scripts (with low
-        values) or as a stop gap to shield against fatal memory overflow
-        (with large values).
+        Its main use is in scripts without an event loop - when you want to
+        stat a lot of files, you can write something like this:
+
+           IO::AIO::max_outstanding 32;
+
+           for my $path (...) {
+              aio_stat $path, ...;
+              IO::AIO::poll_cb;
+           }
+
+           IO::AIO::flush;
+
+        The call to "poll_cb" inside the loop will normally return
+        instantly, but as soon as more than 32 requests are in-flight, it
+        will block until some requests have been handled. This keeps the
+        loop from pushing a large number of "aio_stat" requests onto the
+        queue.
+
+        The default value for "max_outstanding" is very large, so there is
+        no practical limit on the number of outstanding requests.
 
   STATISTICAL INFORMATION
     IO::AIO::nreqs
@@ -1326,19 +1379,32 @@
            \&IO::AIO::poll_cb);
 
 FORK BEHAVIOUR
-    This module should do "the right thing" when the process using it forks:
-
-    Before the fork, IO::AIO enters a quiescent state where no requests can
-    be added in other threads and no results will be processed. After the
-    fork the parent simply leaves the quiescent state and continues
-    request/result processing, while the child frees the request/result
-    queue (so that the requests started before the fork will only be handled
-    in the parent). Threads will be started on demand until the limit set in
-    the parent process has been reached again.
-
-    In short: the parent will, after a short pause, continue as if fork had
-    not been called, while the child will act as if IO::AIO has not been
-    used yet.
+    Usage of pthreads in a program changes the semantics of fork
+    considerably. Specifically, only async-signal-safe functions can be called
+    after fork. Perl doesn't know about this, so in general, you cannot call
+    fork with defined behaviour in perl if pthreads are involved. IO::AIO
+    uses pthreads, so this applies, but many other extensions and (for
+    inexplicable reasons) perl itself is often linked against pthreads, so
+    this limitation applies to quite a lot of perls.
+
+    This module no longer tries to fight your OS, or POSIX. That means
+    IO::AIO only works in the process that loaded it. Forking is fully
+    supported, but using IO::AIO in the child is not.
+
+    You might get around this by not *using* IO::AIO before (or after) forking.
+    You could also try to call the IO::AIO::reinit function in the child:
+
+    IO::AIO::reinit
+        Abandons all current requests and I/O threads and simply
+        reinitialises all data structures. This is not an operation
+        supported by any standards, but happens to work on GNU/Linux and
+        some newer BSD systems.
+
+        The only reasonable use for this function is to call it after
+        forking, if "IO::AIO" was used in the parent. Calling it while
+        IO::AIO is active in the process will result in undefined behaviour.
+        Calling it at any time will also result in undefined (by POSIX)
+        behaviour.
 
 MEMORY USAGE
     Per-request usage: