--- IO-AIO/README 2005/08/30 15:45:10 1.13 +++ IO-AIO/README 2010/01/10 23:44:02 1.43 @@ -5,7 +5,8 @@ use IO::AIO; aio_open "/etc/passwd", O_RDONLY, 0, sub { - my ($fh) = @_; + my $fh = shift + or die "/etc/passwd: $!"; ... }; @@ -15,67 +16,264 @@ $_[0] > 0 or die "read error: $!"; }; - # Event - Event->io (fd => IO::AIO::poll_fileno, - poll => 'r', - cb => \&IO::AIO::poll_cb); + # version 2+ has request and group objects + use IO::AIO 2; - # Glib/Gtk2 - add_watch Glib::IO IO::AIO::poll_fileno, - in => sub { IO::AIO::poll_cb; 1 }; - - # Tk - Tk::Event::IO->fileevent (IO::AIO::poll_fileno, "", - readable => \&IO::AIO::poll_cb); + aioreq_pri 4; # give next request a very high priority + my $req = aio_unlink "/tmp/file", sub { }; + $req->cancel; # cancel request if still in queue - # Danga::Socket - Danga::Socket->AddOtherFds (IO::AIO::poll_fileno => - \&IO::AIO::poll_cb); + my $grp = aio_group sub { print "all stats done\n" }; + add $grp aio_stat "..." for ...; DESCRIPTION This module implements asynchronous I/O using whatever means your - operating system supports. + operating system supports. It is implemented as an interface to "libeio" + (). - Currently, a number of threads are started that execute your read/writes - and signal their completion. You don't need thread support in your libc - or perl, and the threads created by this module will not be visible to - the pthreads library. In the future, this module might make use of the - native aio functions available on many operating systems. However, they - are often not well-supported (Linux doesn't allow them on normal files - currently, for example), and they would only support aio_read and + Asynchronous means that operations that can normally block your program + (e.g. reading from disk) will be done asynchronously: the operation will + still block, but you can do something else in the meantime. 
This is + extremely useful for programs that need to stay interactive even when + doing heavy I/O (GUI programs, high performance network servers etc.), + but can also be used to easily do operations in parallel that are + normally done sequentially, e.g. stat'ing many files, which is much + faster on a RAID volume or over NFS when you do a number of stat + operations concurrently. + + While most of this works on all types of file descriptors (for example + sockets), using these functions on file descriptors that support + nonblocking operation (again, sockets, pipes etc.) is very inefficient. + Use an event loop for that (such as the EV module): IO::AIO will + naturally fit into such an event loop itself. + + In this version, a number of threads are started that execute your + requests and signal their completion. You don't need thread support in + perl, and the threads created by this module will not be visible to + perl. In the future, this module might make use of the native aio + functions available on many operating systems. However, they are often + not well-supported or restricted (GNU/Linux doesn't allow them on normal + files currently, for example), and they would only support aio_read and aio_write, so the remaining functionality would have to be implemented using threads anyway. - Although the module will work with in the presence of other threads, it - is currently not reentrant, so use appropriate locking yourself, always - call "poll_cb" from within the same thread, or never call "poll_cb" (or - other "aio_" functions) recursively. + Although the module will work in the presence of other (Perl-) threads, + it is currently not reentrant in any way, so use appropriate locking + yourself, always call "poll_cb" from within the same thread, or never + call "poll_cb" (or other "aio_" functions) recursively. 
+ + EXAMPLE + This is a simple example that uses the EV module and loads /etc/passwd + asynchronously: + + use Fcntl; + use EV; + use IO::AIO; + + # register the IO::AIO callback with EV + my $aio_w = EV::io IO::AIO::poll_fileno, EV::READ, \&IO::AIO::poll_cb; + + # queue the request to open /etc/passwd + aio_open "/etc/passwd", O_RDONLY, 0, sub { + my $fh = shift + or die "error while opening: $!"; + + # stat'ing filehandles is generally non-blocking + my $size = -s $fh; + + # queue a request to read the file + my $contents; + aio_read $fh, 0, $size, $contents, 0, sub { + $_[0] == $size + or die "short read: $!"; + + close $fh; + + # file contents now in $contents + print $contents; + + # exit event loop and program + EV::unloop; + }; + }; + + # possibly queue up other requests, or open GUI windows, + # check for sockets etc. etc. + + # process events as long as there are some: + EV::loop; + +REQUEST ANATOMY AND LIFETIME + Every "aio_*" function creates a request, which is a C data structure + not directly visible to Perl. + + If called in non-void context, every request function returns a Perl + object representing the request. In void context, nothing is returned, + which saves a bit of memory. + + The perl object is a fairly standard ref-to-hash object. The hash + contents are not used by IO::AIO so you are free to store anything you + like in it. + + During their existence, aio requests travel through the following + states, in order: + + ready + Immediately after a request is created it is put into the ready + state, waiting for a thread to execute it. + + execute + A thread has accepted the request for processing and is currently + executing it (e.g. blocking in read). + + pending + The request has been executed and is waiting for result processing. + + While request submission and execution are fully asynchronous, result + processing is not and relies on the perl interpreter calling + "poll_cb" (or another function with the same effect).
+ + result + The request results are processed synchronously by "poll_cb". + + The "poll_cb" function will process all outstanding aio requests by + calling their callbacks, freeing memory associated with them and + managing any groups they are contained in. + + done + Request has reached the end of its lifetime and holds no resources + anymore (except possibly for the Perl object, but its connection to + the actual aio request is severed and calling its methods will + either do nothing or result in a runtime error). FUNCTIONS - AIO FUNCTIONS + QUICK OVERVIEW + This section simply lists the prototypes of the most important functions + for quick reference. See the following sections for function-by-function + documentation. + + aio_open $pathname, $flags, $mode, $callback->($fh) + aio_close $fh, $callback->($status) + aio_read $fh,$offset,$length, $data,$dataoffset, $callback->($retval) + aio_write $fh,$offset,$length, $data,$dataoffset, $callback->($retval) + aio_sendfile $out_fh, $in_fh, $in_offset, $length, $callback->($retval) + aio_readahead $fh,$offset,$length, $callback->($retval) + aio_stat $fh_or_path, $callback->($status) + aio_lstat $fh, $callback->($status) + aio_statvfs $fh_or_path, $callback->($statvfs) + aio_utime $fh_or_path, $atime, $mtime, $callback->($status) + aio_chown $fh_or_path, $uid, $gid, $callback->($status) + aio_truncate $fh_or_path, $offset, $callback->($status) + aio_chmod $fh_or_path, $mode, $callback->($status) + aio_unlink $pathname, $callback->($status) + aio_mknod $path, $mode, $dev, $callback->($status) + aio_link $srcpath, $dstpath, $callback->($status) + aio_symlink $srcpath, $dstpath, $callback->($status) + aio_readlink $path, $callback->($link) + aio_rename $srcpath, $dstpath, $callback->($status) + aio_mkdir $pathname, $mode, $callback->($status) + aio_rmdir $pathname, $callback->($status) + aio_readdir $pathname, $callback->($entries) + aio_readdirx $pathname, $flags, $callback->($entries, $flags) + IO::AIO::READDIR_DENTS 
IO::AIO::READDIR_DIRS_FIRST + IO::AIO::READDIR_STAT_ORDER IO::AIO::READDIR_FOUND_UNKNOWN + aio_load $path, $data, $callback->($status) + aio_copy $srcpath, $dstpath, $callback->($status) + aio_move $srcpath, $dstpath, $callback->($status) + aio_scandir $path, $maxreq, $callback->($dirs, $nondirs) + aio_rmtree $path, $callback->($status) + aio_sync $callback->($status) + aio_fsync $fh, $callback->($status) + aio_fdatasync $fh, $callback->($status) + aio_sync_file_range $fh, $offset, $nbytes, $flags, $callback->($status) + aio_pathsync $path, $callback->($status) + aio_msync $scalar, $offset = 0, $length = undef, flags = 0, $callback->($status) + aio_mtouch $scalar, $offset = 0, $length = undef, flags = 0, $callback->($status) + aio_group $callback->(...) + aio_nop $callback->() + + $prev_pri = aioreq_pri [$pri] + aioreq_nice $pri_adjust + + IO::AIO::poll_wait + IO::AIO::poll_cb + IO::AIO::poll + IO::AIO::flush + IO::AIO::max_poll_reqs $nreqs + IO::AIO::max_poll_time $seconds + IO::AIO::min_parallel $nthreads + IO::AIO::max_parallel $nthreads + IO::AIO::max_idle $nthreads + IO::AIO::max_outstanding $maxreqs + IO::AIO::nreqs + IO::AIO::nready + IO::AIO::npending + + IO::AIO::sendfile $ofh, $ifh, $offset, $count + IO::AIO::fadvise $fh, $offset, $len, $advice + IO::AIO::mlockall $flags + IO::AIO::munlockall + + AIO REQUEST FUNCTIONS All the "aio_*" calls are more or less thin wrappers around the syscall with the same name (sans "aio_"). The arguments are similar or identical, and they all accept an additional (and optional) $callback argument which must be a code reference. This code reference will get called with the syscall return code (e.g. most syscalls return -1 on - error, unlike perl, which usually delivers "false") as it's sole - argument when the given syscall has been executed asynchronously. + error, unlike perl, which usually delivers "false") as its sole argument + after the given syscall has been executed asynchronously. 
All functions expecting a filehandle keep a copy of the filehandle internally until the request has finished. + All functions return request objects of type IO::AIO::REQ that allow + further manipulation of those requests while they are in-flight. + The pathnames you pass to these routines *must* be absolute and encoded - in byte form. The reason for the former is that at the time the request - is being executed, the current working directory could have changed. + as octets. The reason for the former is that at the time the request is + being executed, the current working directory could have changed. Alternatively, you can make sure that you never change the current - working directory. + working directory anywhere in the program and then use relative paths. + + To encode pathnames as octets, make sure you either: a) always + pass in filenames you got from outside (command line, readdir etc.) + without tinkering, b) use filenames that are ASCII or ISO 8859-1, c) use the Encode module + and encode your pathnames to the locale (or other) encoding in effect in + the user environment, d) use Glib::filename_from_unicode on unicode + filenames or e) use something else to ensure your scalar has the correct + contents. + + This works, by the way, independently of the internal UTF-8 bit, which IO::AIO + handles correctly whether it is set or not. + + $prev_pri = aioreq_pri [$pri] + Returns the priority value that would be used for the next request + and, if $pri is given, sets the priority for the next aio request. + + The default priority is 0, the minimum and maximum priorities are -4 + and 4, respectively. Requests with higher priority will be serviced + first. + + The priority will be reset to 0 after each call to one of the + "aio_*" functions.
+ + Example: open a file with low priority, then read something from it + with higher priority so the read request is serviced before other + low priority open requests (potentially spamming the cache): + + aioreq_pri -3; + aio_open ..., sub { + return unless $_[0]; + + aioreq_pri -2; + aio_read $_[0], ..., sub { + ... + }; + }; - To encode pathnames to byte form, either make sure you either: a) always - pass in filenames you got from outside (command line, readdir etc.), b) - are ASCII or ISO 8859-1, c) use the Encode module and encode your - pathnames to the locale (or other) encoding in effect in the user - environment, d) use Glib::filename_from_unicode on unicode filenames or - e) use something else. + aioreq_nice $pri_adjust + Similar to "aioreq_pri", but subtracts the given value from the + current priority, so the effect is cumulative. aio_open $pathname, $flags, $mode, $callback->($fh) Asynchronously open or create a file and call the callback with a @@ -90,7 +288,9 @@ Likewise, $mode specifies the mode of the newly created file, if it didn't exist and "O_CREAT" has been given, just like perl's "sysopen", except that it is mandatory (i.e. use 0 if you don't - create new files, and 0666 or 0777 if you do). + create new files, and 0666 or 0777 if you do). Note that the $mode + will be modified by the umask in effect when the request is being + executed, so it is best to never change the umask. Example: @@ -105,24 +305,42 @@ aio_close $fh, $callback->($status) Asynchronously close a file and call the callback with the result - code. *WARNING:* although accepted, you should not pass in a perl - filehandle here, as perl will likely close the file descriptor - another time when the filehandle is destroyed. Normally, you can - safely call perls "close" or just let filehandles go out of scope. - This is supposed to be a bug in the API, so that might change. It's - therefore best to avoid this function. + code. + Unfortunately, you can't do this to perl.
Perl *insists* very + strongly on closing the file descriptor associated with the + filehandle itself. + + Therefore, "aio_close" will not close the filehandle - instead it + will use dup2 to overwrite the file descriptor with the write-end of + a pipe (the pipe fd will be created on demand and will be cached). + + Or in other words: the file descriptor will be closed, but it will + not be free for reuse until the perl filehandle is closed. aio_read $fh,$offset,$length, $data,$dataoffset, $callback->($retval) aio_write $fh,$offset,$length, $data,$dataoffset, $callback->($retval) - Reads or writes "length" bytes from the specified "fh" and "offset" - into the scalar given by "data" and offset "dataoffset" and calls - the callback without the actual number of bytes read (or -1 on + Reads or writes $length bytes from or to the specified $fh and + $offset into the scalar given by $data and offset $dataoffset and + calls the callback with the actual number of bytes read or written (or -1 on error, just like the syscall). + "aio_read" will, like "sysread", shrink or grow the $data scalar to + offset plus the actual number of bytes read. + + If $offset is undefined, then the current file descriptor offset + will be used (and updated), otherwise the file descriptor offset + will not be changed by these calls. + + If $length is undefined in "aio_write", use the remaining length of + $data. + + If $dataoffset is less than zero, it will be counted from the end of + $data. + The $data scalar *MUST NOT* be modified in any way while the request - is outstanding. Modifying it can result in segfaults or WW3 (if the - necessary/optional hardware is installed). + is outstanding. Modifying it can result in segfaults or World War + III (if the necessary/optional hardware is installed). Example: Read 15 bytes at offset 7 into scalar $buffer, starting at offset 0 within the scalar: @@ -141,11 +359,12 @@ This call tries to make use of a native "sendfile" syscall to provide zero-copy operation.
For this to work, $out_fh should refer - to a socket, and $in_fh should refer to mmap'able file. + to a socket, and $in_fh should refer to an mmap'able file. - If the native sendfile call fails or is not implemented, it will be - emulated, so you can call "aio_sendfile" on any type of filehandle - regardless of the limitations of the operating system. + If a native sendfile cannot be found or it fails with "ENOSYS", + "ENOTSUP", "EOPNOTSUPP", "EAFNOSUPPORT", "EPROTOTYPE" or "ENOTSOCK", + it will be emulated, so you can call "aio_sendfile" on any type of + filehandle regardless of the limitations of the operating system. Please note, however, that "aio_sendfile" can read more bytes from $in_fh than are written, and there is no way to find out how many @@ -190,32 +409,230 @@ print "size is ", -s _, "\n"; }; + aio_statvfs $fh_or_path, $callback->($statvfs) + Works like the POSIX "statvfs" or "fstatvfs" syscalls, depending on + whether a file handle or path was passed. + + On success, the callback is passed a hash reference with the + following members: "bsize", "frsize", "blocks", "bfree", "bavail", + "files", "ffree", "favail", "fsid", "flag" and "namemax". On + failure, "undef" is passed. + + The following POSIX IO::AIO::ST_* constants are defined: "ST_RDONLY" + and "ST_NOSUID". + + The following non-POSIX IO::AIO::ST_* flag masks are defined to + their correct value when available, or to 0 on systems that do not + support them: "ST_NODEV", "ST_NOEXEC", "ST_SYNCHRONOUS", + "ST_MANDLOCK", "ST_WRITE", "ST_APPEND", "ST_IMMUTABLE", + "ST_NOATIME", "ST_NODIRATIME" and "ST_RELATIME". + + Example: stat "/wd" and dump out the data if successful. 
+ + aio_statvfs "/wd", sub { + my $f = $_[0] + or die "statvfs: $!"; + + use Data::Dumper; + say Dumper $f; + }; + + # result: + { + bsize => 1024, + bfree => 4333064312, + blocks => 10253828096, + files => 2050765568, + flag => 4096, + favail => 2042092649, + bavail => 4333064312, + ffree => 2042092649, + namemax => 255, + frsize => 1024, + fsid => 1810 + } + + aio_utime $fh_or_path, $atime, $mtime, $callback->($status) + Works like perl's "utime" function (including the special case of + $atime and $mtime being undef). Fractional times are supported if + the underlying syscalls support them. + + When called with a pathname, uses utimes(2) if available, otherwise + utime(2). If called on a file descriptor, uses futimes(2) if + available, otherwise returns ENOSYS, so this is not portable. + + Examples: + + # set atime and mtime to current time (basically touch(1)): + aio_utime "path", undef, undef; + # set atime to current time and mtime to beginning of the epoch: + aio_utime "path", time, undef; # undef==0 + + aio_chown $fh_or_path, $uid, $gid, $callback->($status) + Works like perl's "chown" function, except that "undef" for either + $uid or $gid is being interpreted as "do not change" (but -1 can + also be used). + + Examples: + + # same as "chown root path" in the shell: + aio_chown "path", 0, -1; + # same as above: + aio_chown "path", 0, undef; + + aio_truncate $fh_or_path, $offset, $callback->($status) + Works like truncate(2) or ftruncate(2). + + aio_chmod $fh_or_path, $mode, $callback->($status) + Works like perl's "chmod" function. + aio_unlink $pathname, $callback->($status) Asynchronously unlink (delete) a file and call the callback with the result code. + aio_mknod $path, $mode, $dev, $callback->($status) + [EXPERIMENTAL] + + Asynchronously create a device node (or fifo). See mknod(2). + + The only (POSIX-) portable way of calling this function is: + + aio_mknod $path, IO::AIO::S_IFIFO | $mode, 0, sub { ... 
+ + aio_link $srcpath, $dstpath, $callback->($status) + Asynchronously create a new link to the existing object at $srcpath + at the path $dstpath and call the callback with the result code. + + aio_symlink $srcpath, $dstpath, $callback->($status) + Asynchronously create a new symbolic link to the existing object at + $srcpath at the path $dstpath and call the callback with the result + code. + + aio_readlink $path, $callback->($link) + Asynchronously read the symlink specified by $path and pass it to + the callback. If an error occurs, nothing or undef gets passed to + the callback. + + aio_rename $srcpath, $dstpath, $callback->($status) + Asynchronously rename the object at $srcpath to $dstpath, just as + rename(2) does, and call the callback with the result code. + + aio_mkdir $pathname, $mode, $callback->($status) + Asynchronously mkdir (create) a directory and call the callback with + the result code. $mode will be modified by the umask at the time the + request is executed, so do not change your umask. + aio_rmdir $pathname, $callback->($status) Asynchronously rmdir (delete) a directory and call the callback with the result code. - aio_readdir $pathname $callback->($entries) + aio_readdir $pathname, $callback->($entries) Unlike the POSIX call of the same name, "aio_readdir" reads an entire directory (i.e. opendir + readdir + closedir). The entries will not be sorted, and will NOT include the "." and ".." entries. - The callback a single argument which is either "undef" or an - array-ref with the filenames. + The callback is passed a single argument which is either "undef" or + an array-ref with the filenames. + + aio_readdirx $pathname, $flags, $callback->($entries, $flags) + Quite similar to "aio_readdir", but the $flags argument lets you + tune behaviour and output format. In case of an error, $entries will + be "undef".
+ + The flags are a combination of the following constants, ORed + together (the flags will also be passed to the callback, possibly + modified): + + IO::AIO::READDIR_DENTS + When this flag is off, then the callback gets an arrayref + of names only (as with "aio_readdir"), otherwise it gets an + arrayref with "[$name, $type, $inode]" arrayrefs, each + describing a single directory entry in more detail. + + $name is the name of the entry. + + $type is one of the "IO::AIO::DT_xxx" constants: + + "IO::AIO::DT_UNKNOWN", "IO::AIO::DT_FIFO", "IO::AIO::DT_CHR", + "IO::AIO::DT_DIR", "IO::AIO::DT_BLK", "IO::AIO::DT_REG", + "IO::AIO::DT_LNK", "IO::AIO::DT_SOCK", "IO::AIO::DT_WHT". + + "IO::AIO::DT_UNKNOWN" means just that: readdir does not know. If + you need to know, you have to run stat yourself. Also, for speed + reasons, the $type scalars are read-only: you cannot modify + them. + + $inode is the inode number (which might not be exact on systems + with 64 bit inode numbers and 32 bit perls). This field has + unspecified content on systems that do not deliver the inode + information. + + IO::AIO::READDIR_DIRS_FIRST + When this flag is set, then the names will be returned in an + order where likely directories come first. This is useful when + you need to quickly find directories, or you want to find all + directories while avoiding having to stat() each entry. + + If the system returns type information in readdir, then this is + used to find directories directly. Otherwise, likely directories + are files beginning with ".", or otherwise files with no dots, + of which files with short names are tried first. + + IO::AIO::READDIR_STAT_ORDER + When this flag is set, then the names will be returned in an + order suitable for stat()'ing each one. That is, when you plan + to stat() all files in the given directory, then the returned + order will likely be fastest.
+ + If both this flag and "IO::AIO::READDIR_DIRS_FIRST" are + specified, then the likely dirs come first, resulting in a less + optimal stat order. + + IO::AIO::READDIR_FOUND_UNKNOWN + This flag should not be set when calling "aio_readdirx". + Instead, it is set by "aio_readdirx", when any of the + $type's found were "IO::AIO::DT_UNKNOWN". The absence of this + flag therefore indicates that all $type's are known, which can + be used to speed up some algorithms. + + aio_load $path, $data, $callback->($status) + This is a composite request that tries to fully load the given file + into memory. Status is the same as with aio_read. + + aio_copy $srcpath, $dstpath, $callback->($status) + Try to copy the *file* (directories not supported as either source + or destination) from $srcpath to $dstpath and call the callback with + a status of 0 (ok) or -1 (error, see $!). + + This is a composite request that creates the destination file with + mode 0200 and copies the contents of the source file into it using + "aio_sendfile", followed by restoring atime, mtime, access mode and + uid/gid, in that order. + + If an error occurs, the partial destination file will be unlinked, + if possible, except when setting atime, mtime, access mode and + uid/gid, where errors are ignored. + + aio_move $srcpath, $dstpath, $callback->($status) + Try to move the *file* (directories not supported as either source + or destination) from $srcpath to $dstpath and call the callback with + a status of 0 (ok) or -1 (error, see $!). + + This is a composite request that tries to rename(2) the file first; + if rename fails with "EXDEV", it copies the file with "aio_copy" + and, if that is successful, unlinks the $srcpath.
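The rename-then-copy fallback described for "aio_move" can be sketched with a group request. This is only an illustration under stated assumptions, not the module's actual source: "my_move" is a hypothetical helper name, IO::AIO 2 or later is assumed to be installed, and both paths are assumed to be absolute.

```perl
use strict;
use warnings;
use Errno qw(EXDEV);
use IO::AIO;

# my_move is a hypothetical name for illustration; it is not part of IO::AIO
sub my_move {
    my ($src, $dst, $cb) = @_;

    # the group callback fires once all requests added to it are done
    my $grp = aio_group $cb;

    add $grp aio_rename $src, $dst, sub {
        if ($_[0] && $! == EXDEV) {
            # rename failed because $dst is on another filesystem:
            # copy the file, then remove the source if the copy succeeded
            add $grp aio_copy $src, $dst, sub {
                $grp->result ($_[0]);

                unless ($_[0]) {
                    add $grp aio_unlink $src;
                }
            };
        } else {
            # plain rename result (0 on success, -1 on any other error)
            $grp->result ($_[0]);
        }
    };

    $grp
}
```

A caller would use it like any other request, for example my_move "/tmp/a", "/tmp/b", sub { ... }, and then either run an event loop or call IO::AIO::flush to wait for completion.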
aio_scandir $path, $maxreq, $callback->($dirs, $nondirs) - Scans a directory (similar to "aio_readdir") and tries to separate - the entries of directory $path into two sets of names, ones you can - recurse into (directories), and ones you cannot recurse into - (everything else). - - "aio_scandir" is a composite request that consists of many - aio-primitives. $maxreq specifies the maximum number of outstanding - aio requests that this function generates. If it is "<= 0", then a - suitable default will be chosen (currently 8). + Scans a directory (similar to "aio_readdir") but additionally tries + to efficiently separate the entries of directory $path into two sets + of names, ones you can recurse into (directories), and ones + you cannot recurse into (everything else, including symlinks to + directories). + + "aio_scandir" is a composite request that consists of many sub + requests. $maxreq specifies the maximum number of outstanding aio + requests that this function generates. If it is "<= 0", then a + suitable default will be chosen (currently 4). On error, the callback is called without arguments, otherwise it receives two array-refs with path-relative entry names. @@ -233,23 +650,45 @@ The "aio_readdir" cannot be avoided, but "stat()"'ing every entry can. - After reading the directory, the modification time, size etc. of the - directory before and after the readdir is checked, and if they - match, the link count will be used to decide how many entries are - directories (if >= 2). Otherwise, no knowledge of the number of - subdirectories will be assumed. - - Then entires will be sorted into likely directories (everything - without a non-initial dot) and likely non-directories (everything - else). Then every entry + "/." will be "stat"'ed, likely directories - first. This is often faster because filesystems might detect the - type of the entry without reading the inode data (e.g. ext2s - filetype feature).
If that succeeds, it assumes that the entry is a - directory or a symlink to directory (which will be checked - seperately). + If readdir returns file type information, then this is used directly + to find directories. - If the known number of directories has been reached, the rest of the - entries is assumed to be non-directories. + Otherwise, after reading the directory, the modification time, size + etc. of the directory before and after the readdir is checked, and + if they match (and the modification time isn't the current time), the link count will be + used to decide how many entries are directories (if >= 2). + Otherwise, no knowledge of the number of subdirectories will be + assumed. + + Then entries will be sorted into likely directories (everything without a non-initial + dot currently) and likely non-directories (see "aio_readdirx"). Then + every entry plus an appended "/." will be "stat"'ed, likely + directories first, in order of their inode numbers. If that + succeeds, it assumes that the entry is a directory or a symlink to + directory (which will be checked separately). This is often faster + than stat'ing the entry itself because filesystems might detect the + type of the entry without reading the inode data (e.g. ext2fs + filetype feature), even on systems that cannot return the filetype + information on readdir. + + If the known number of directories (link count - 2) has been + reached, the rest of the entries is assumed to be non-directories. + + This only works with certainty on POSIX (= UNIX) filesystems, which + fortunately are the vast majority of filesystems around. + + It will also likely work on non-POSIX filesystems with reduced + efficiency as those tend to return 0 or 1 as link counts, which + disables the directory counting heuristic. + + aio_rmtree $path, $callback->($status) + Delete a directory tree starting at (and including) $path, return the + status of the final "rmdir" only.
This is a composite request that + uses "aio_scandir" to recurse into and rmdir directories, and unlink + everything else. + + aio_sync $callback->($status) + Asynchronously call sync and call the callback when finished. aio_fsync $fh, $callback->($status) Asynchronously call fsync on the given filehandle and call the @@ -262,42 +701,295 @@ If this call isn't available because your OS lacks it or it couldn't be detected, it will be emulated by calling "fsync" instead. + aio_sync_file_range $fh, $offset, $nbytes, $flags, $callback->($status) + Sync the data portion of the file specified by $offset and $nbytes + to disk (but NOT the metadata), by calling the Linux-specific + sync_file_range call. If sync_file_range is not available or it + returns ENOSYS, then fdatasync or fsync is substituted. + + $flags can be a combination of + "IO::AIO::SYNC_FILE_RANGE_WAIT_BEFORE", + "IO::AIO::SYNC_FILE_RANGE_WRITE" and + "IO::AIO::SYNC_FILE_RANGE_WAIT_AFTER": refer to the sync_file_range + manpage for details. + + aio_pathsync $path, $callback->($status) + This request tries to open, fsync and close the given path. This is + a composite request intended to sync directories after directory + operations (e.g. rename). This might not work on all operating + systems or have any specific effect, but usually it makes sure that + directory changes get written to disk. It works for anything that + can be opened read-only, not just directories. + + Future versions of this function might fall back to other methods + when "fsync" on the directory fails (such as calling "sync"). + + Passes 0 when everything went ok, and -1 on error.
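The open, fsync and close sequence described for "aio_pathsync" could be composed from the primitive requests roughly as follows. This is a hedged sketch, not how the module itself implements the request: "my_pathsync" is a hypothetical helper name, IO::AIO 2 or later is assumed to be installed, and the path is assumed to be absolute.

```perl
use strict;
use warnings;
use Fcntl;
use IO::AIO;

# my_pathsync is a hypothetical name for illustration only;
# IO::AIO provides its own aio_pathsync request
sub my_pathsync {
    my ($path, $cb) = @_;

    # the group callback receives whatever is set via $grp->result
    my $grp = aio_group $cb;

    add $grp aio_open $path, O_RDONLY, 0, sub {
        # open failed: report -1, result() records the current $!
        my $fh = shift
            or return $grp->result (-1);

        add $grp aio_fsync $fh, sub {
            # pass on the fsync status (0 on success, -1 on error)
            $grp->result ($_[0]);

            # synchronous close; its result is ignored in this sketch
            close $fh;
        };
    };

    $grp
}
```

After a directory operation such as a rename, a caller might queue my_pathsync "/some/dir", sub { ... } and check that the status passed to the callback is 0.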
+ + aio_msync $scalar, $offset = 0, $length = undef, flags = 0, + $callback->($status) + This is a rather advanced IO::AIO call, which only works on + mmap(2)ed scalars (see the "IO::AIO::mmap" function, although it + also works on data scalars managed by the Sys::Mmap or Mmap modules; + note that the scalar must only be modified in-place while an aio + operation is pending on it). + + It calls the "msync" function of your OS, if available, with the + memory area starting at $offset in the string and ending $length + bytes later. If $length is negative, it counts from the end, and if + $length is "undef", then it goes till the end of the string. The + flags can be a combination of "IO::AIO::MS_ASYNC", + "IO::AIO::MS_INVALIDATE" and "IO::AIO::MS_SYNC". + + aio_mtouch $scalar, $offset = 0, $length = undef, flags = 0, + $callback->($status) + This is a rather advanced IO::AIO call, which works best on + mmap(2)ed scalars. + + It touches (reads or writes) all memory pages in the specified range + inside the scalar. All caveats and parameters are the same as for + "aio_msync", above, except for flags, which must be either 0 (which + reads all pages and ensures they are instantiated) or + "IO::AIO::MT_MODIFY", which modifies the memory pages (by reading + and writing an octet from it, which dirties the page). + + aio_group $callback->(...) + This is a very special aio request: Instead of doing something, it + is a container for other aio requests, which is useful if you want + to bundle many requests into a single, composite, request with a + definite callback and the ability to cancel the whole request with + its subrequests. + + Returns an object of class IO::AIO::GRP. See its documentation below + for more info.
+ + Example: + + my $grp = aio_group sub { + print "all stats done\n"; + }; + + add $grp + (aio_stat ...), + (aio_stat ...), + ...; + + aio_nop $callback->() + This is a special request - it does nothing in itself and is only + used for side effects, such as when you want to add a dummy request + to a group so that finishing the requests in the group depends on + executing the given code. + + While this request does nothing, it still goes through the execution + phase and still requires a worker thread. Thus, the callback will + not be executed immediately but only after other requests in the + queue have entered their execution phase. This can be used to + measure request latency. + + IO::AIO::aio_busy $fractional_seconds, $callback->() *NOT EXPORTED* + Mainly used for debugging and benchmarking, this aio request puts + one of the request workers to sleep for the given time. + + While it is theoretically handy to have simple I/O scheduling + requests like sleep and file handle readable/writable, the overhead + this creates is immense (it blocks a thread for a long time) so do + not use this function except to put your application under + artificial I/O pressure. + + IO::AIO::REQ CLASS + All non-aggregate "aio_*" functions return an object of this class when + called in non-void context. + + cancel $req + Cancels the request, if possible. Has the effect of skipping + execution when entering the execute state and skipping calling the + callback when entering the result state, but will leave the + request otherwise untouched (with the exception of readdir). That + means that requests that currently execute will not be stopped and + resources held by the request will not be freed prematurely. + + cb $req $callback->(...) + Replace (or simply set) the callback registered to the request. + + IO::AIO::GRP CLASS + This class is a subclass of IO::AIO::REQ, so all its methods apply to + objects of this class, too.
+ + An IO::AIO::GRP object is a special request that can contain multiple + other aio requests. + + You create one by calling the "aio_group" constructing function with a + callback that will be called when all contained requests have entered + the "done" state: + + my $grp = aio_group sub { + print "all requests are done\n"; + }; + + You add requests by calling the "add" method with one or more + "IO::AIO::REQ" objects: + + $grp->add (aio_unlink "..."); + + add $grp aio_stat "...", sub { + $_[0] or return $grp->result ("error"); + + # add another request dynamically, if first succeeded + add $grp aio_open "...", sub { + $grp->result ("ok"); + }; + }; + + This makes it very easy to create composite requests (see the source of + "aio_move" for an application) that work and feel like simple requests. + + * The IO::AIO::GRP objects will be cleaned up during calls to + "IO::AIO::poll_cb", just like any other request. + + * They can be canceled like any other request. Canceling will cancel + not only the request itself, but also all requests it contains. + + * They can also be added to other IO::AIO::GRP objects. + + * You must not add requests to a group from within the group callback + (or any later time). + + Their lifetime, simplified, looks like this: when they are empty, they + will finish very quickly. If they contain only requests that are in the + "done" state, they will also finish. Otherwise they will continue to + exist. + + That means after creating a group you have some time to add requests + (precisely before the callback has been invoked, which is only done + within the "poll_cb"). And in the callbacks of those requests, you can + add further requests to the group. And only when all those requests have + finished will the group itself finish. + + add $grp ... + $grp->add (...) + Add one or more requests to the group. Any type of IO::AIO::REQ can + be added, including other groups, as long as you do not create + circular dependencies.
+ + Returns all its arguments. + + $grp->cancel_subs + Cancels all subrequests and clears any feeder, but not the group + request itself. Useful when you queued a lot of events but got a + result early. + + The group request will finish normally (you cannot add requests to + the group). + + $grp->result (...) + Set the result value(s) that will be passed to the group callback + when all subrequests have finished and set the group's errno to the + current value of errno (just like calling "errno" without an error + number). By default, no argument will be passed and errno is zero. + + $grp->errno ([$errno]) + Sets the group errno value to $errno, or the current value of errno + when the argument is missing. + + Every aio request has an associated errno value that is restored + when the callback is invoked. This method lets you change this value + from its default (0). + + Calling "result" will also set errno, so make sure you either set $! + before the call to "result", or call "errno" after it. + + feed $grp $callback->($grp) + Sets a feeder/generator on this group: every group can have an + attached generator that generates requests if idle. The idea behind + this is that, although you could just queue as many requests as you + want in a group, this might starve other requests for a potentially + long time. For example, "aio_scandir" might generate hundreds of + thousands of "aio_stat" requests, delaying any later requests for a + long time. + + To avoid this, and allow incremental generation of requests, you can + instead create a group and set a feeder on it that generates those + requests. The feed callback will be called whenever there are few + enough (see "limit", below) requests active in the group itself and + is expected to queue more requests. + + The feed callback can queue as many requests as it likes (i.e. "add" + does not impose any limits). + + If the feed does not queue more requests when called, it will be + automatically removed from the group.
+ + If the feed limit is 0 when this method is called, it will be set to + 2 automatically. + + Example: + + # stat all files in @files, but only ever use four aio requests concurrently: + + my $grp = aio_group sub { print "finished\n" }; + limit $grp 4; + feed $grp sub { + my $file = pop @files + or return; + + add $grp aio_stat $file, sub { ... }; + }; + + limit $grp $num + Sets the feeder limit for the group: The feeder will be called + whenever the group contains fewer than this many requests. + + Setting the limit to 0 will pause the feeding process. + + The default value for the limit is 0, but note that setting a feeder + automatically bumps it up to 2. + SUPPORT FUNCTIONS + EVENT PROCESSING AND EVENT LOOP INTEGRATION $fileno = IO::AIO::poll_fileno Return the *request result pipe file descriptor*. This filehandle must be polled for reading by some mechanism outside this module - (e.g. Event or select, see below or the SYNOPSIS). If the pipe - becomes readable you have to call "poll_cb" to check the results. + (e.g. EV, Glib, select and so on, see below or the SYNOPSIS). If the + pipe becomes readable you have to call "poll_cb" to check the + results. See "poll_cb" for an example. IO::AIO::poll_cb - Process all outstanding events on the result pipe. You have to call - this regularly. Returns the number of events processed. Returns - immediately when no events are outstanding. + Process some outstanding events on the result pipe. You have to call + this regularly. Returns 0 if all events could be processed, or -1 if + it returned earlier for whatever reason. Returns immediately when no + events are outstanding. The number of events processed depends on + the settings of "IO::AIO::max_poll_reqs" and + "IO::AIO::max_poll_time". + + If not all requests were processed for whatever reason, the + filehandle will still be ready when "poll_cb" returns, so normally + you don't have to do anything special to have it called later.
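A minimal sketch of the polling mechanism described above, using a plain "select" on the result pipe instead of an event loop. The stat request on "/" and the one-second timeout are assumptions made only for illustration:

```perl
use strict;
use warnings;
use IO::AIO;

# Hypothetical request: the callback clears the flag once it has run.
my $pending = 1;
aio_stat "/", sub { $pending = 0 };

# Build a select() read set that contains only the result pipe.
my $fd  = IO::AIO::poll_fileno;
my $rin = "";
vec ($rin, $fd, 1) = 1;

while ($pending) {
    # Wait (at most one second) until the result pipe becomes readable ...
    my $nfound = select (my $rout = $rin, undef, undef, 1);

    # ... then let IO::AIO invoke the callbacks of finished requests.
    IO::AIO::poll_cb if $nfound > 0;
}

print "request finished\n";
```

In a real program the "select" call would normally be replaced by whatever event loop is already in use.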
Example: Install an Event watcher that automatically calls - IO::AIO::poll_cb with high priority: + IO::AIO::poll_cb with high priority (more examples can be found in + the SYNOPSIS section, at the top of this document): Event->io (fd => IO::AIO::poll_fileno, poll => 'r', async => 1, cb => \&IO::AIO::poll_cb); IO::AIO::poll_wait - Wait till the result filehandle becomes ready for reading (simply - does a "select" on the filehandle. This is useful if you want to - synchronously wait for some requests to finish). + If there are any outstanding requests and none of them in the result + phase, wait till the result filehandle becomes ready for reading + (simply does a "select" on the filehandle). This is useful if you + want to synchronously wait for some requests to finish. See "nreqs" for an example. - IO::AIO::nreqs - Returns the number of requests currently outstanding (i.e. for which - their callback has not been invoked yet). + IO::AIO::poll + Waits until some requests have been handled. - Example: wait till there are no outstanding requests anymore: + Returns the number of requests processed, but is otherwise strictly + equivalent to: IO::AIO::poll_wait, IO::AIO::poll_cb - while IO::AIO::nreqs; IO::AIO::flush Wait till all outstanding AIO requests have been handled. @@ -307,27 +999,55 @@ IO::AIO::poll_wait, IO::AIO::poll_cb while IO::AIO::nreqs; - IO::AIO::poll - Waits until some requests have been handled. + IO::AIO::max_poll_reqs $nreqs + IO::AIO::max_poll_time $seconds + These set the maximum number of requests (default 0, meaning + infinity) that are being processed by "IO::AIO::poll_cb" in one + call, respectively the maximum amount of time (default 0, meaning + infinity) spent in "IO::AIO::poll_cb" to process requests (more + correctly the minimum amount of time "poll_cb" is allowed to use).
+ + Setting "max_poll_time" to a non-zero value creates an overhead of + one syscall per request processed, which is not normally a problem + unless your callbacks are really really fast or your OS is really + really slow (I am not mentioning Solaris here). Using + "max_poll_reqs" incurs no overhead. + + Setting these is useful if you want to ensure some level of + interactivity when perl is not fast enough to process all requests + in time. - Strictly equivalent to: + For interactive programs, values such as 0.01 to 0.1 should be fine. - IO::AIO::poll_wait, IO::AIO::poll_cb - if IO::AIO::nreqs; + Example: Install an Event watcher that automatically calls + IO::AIO::poll_cb with low priority, to ensure that other parts of + the program get the CPU sometimes even under high AIO load. + + # try not to spend much more than 0.1s in poll_cb + IO::AIO::max_poll_time 0.1; + + # use a low priority so other tasks have priority + Event->io (fd => IO::AIO::poll_fileno, + poll => 'r', nice => 1, + cb => \&IO::AIO::poll_cb); + CONTROLLING THE NUMBER OF THREADS IO::AIO::min_parallel $nthreads Set the minimum number of AIO threads to $nthreads. The current - default is 4, which means four asynchronous operations can be done - at one time (the number of outstanding operations, however, is - unlimited). + default is 8, which means eight asynchronous operations can execute + concurrently at any one time (the number of outstanding requests, + however, is unlimited). IO::AIO starts threads only on demand, when an AIO request is queued - and no free thread exists.
Please note that queueing up a hundred + requests can create demand for a hundred threads, even if it turns + out that everything is in the cache and could have been processed + faster by a single thread. + + It is recommended to keep the number of threads relatively low, as + some Linux kernel versions will scale negatively with the number of + threads (higher parallelity => MUCH higher latency). With current + Linux 2.6 versions, 4-32 threads should be fine. Under most circumstances you don't need to call this function, as the module selects a default that is suitable for low to moderate @@ -347,28 +1067,215 @@ Under normal circumstances you don't need to call this function. - $oldnreqs = IO::AIO::max_outstanding $nreqs - Sets the maximum number of outstanding requests to $nreqs. If you - try to queue up more than this number of requests, the caller will - block until some requests have been handled. - - The default is very large, so normally there is no practical limit. - If you queue up many requests in a loop it often improves speed if - you set this to a relatively low number, such as 100. + IO::AIO::max_idle $nthreads + Limit the number of threads (default: 4) that are allowed to idle + (i.e., threads that did not get a request to process within 10 + seconds). That means if a thread becomes idle while $nthreads other + threads are also idle, it will free its resources and exit. + + This is useful when you allow a large number of threads (e.g. 100 or + 1000) to allow for extremely high load situations, but want to free + resources under normal circumstances (1000 threads can easily + consume 30MB of RAM). + + The default is probably ok in most situations, especially if thread + creation is fast. If thread creation is very slow on your system you + might want to use larger values. 
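As a rough sketch of how the thread controls described above might be combined: the limits of 16 and 2 and the list of paths are assumptions chosen only for illustration, not recommendations taken from this document.

```perl
use strict;
use warnings;
use IO::AIO;

# Assumed values, chosen only for illustration.
IO::AIO::min_parallel 16;   # allow at least 16 requests to execute concurrently
IO::AIO::max_idle 2;        # let all but 2 idle threads exit again

# Hypothetical workload: stat a handful of paths in parallel.
my @paths = ("/", "/tmp", "/etc");
my $done  = 0;

# The callback runs for every request, whether the stat succeeded or not.
aio_stat $_, sub { $done++ } for @paths;

# Threads are started on demand by the requests queued above;
# flush blocks until every outstanding request has been handled.
IO::AIO::flush;

print "completed $done of ", scalar @paths, " stat requests\n";
```

Because threads are only started when requests are queued, calling "min_parallel" by itself does not create any threads.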
+ + IO::AIO::max_outstanding $maxreqs + This is a very bad function to use in interactive programs because + it blocks, and a bad way to reduce concurrency because it is + inexact: Better use an "aio_group" together with a feed callback. + + Sets the maximum number of outstanding requests to $maxreqs. If you do + queue up more than this number of requests, the next call to the + "poll_cb" (and "poll_some" and other functions calling "poll_cb") + function will block until the limit is no longer exceeded. + + The default value is very large, so there is no practical limit on + the number of outstanding requests. + + You can still queue as many requests as you want. Therefore, + "max_outstanding" is mainly useful in simple scripts (with low + values) or as a stopgap to shield against fatal memory overflow + (with large values). - Under normal circumstances you don't need to call this function. + STATISTICAL INFORMATION + IO::AIO::nreqs + Returns the number of requests currently in the ready, execute or + pending states (i.e. those whose callback has not been invoked + yet). + + Example: wait till there are no outstanding requests anymore: + + IO::AIO::poll_wait, IO::AIO::poll_cb + while IO::AIO::nreqs; + + IO::AIO::nready + Returns the number of requests currently in the ready state (not yet + executed). + + IO::AIO::npending + Returns the number of requests currently in the pending state + (executed, but not yet processed by poll_cb). + + MISCELLANEOUS FUNCTIONS + IO::AIO implements some functions that might be useful, but are not + asynchronous. + + IO::AIO::sendfile $ofh, $ifh, $offset, $count + Calls the "eio_sendfile_sync" function, which is like + "aio_sendfile", but is blocking (this makes most sense if you know + the input data is likely cached already and the output filehandle is + set to non-blocking operations). + + Returns the number of bytes copied, or -1 on error.
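The blocking "IO::AIO::sendfile" call described above could be used roughly as follows. This is only a sketch: the temporary files and the 64KB payload are hypothetical, and the loop assumes that a single call may copy fewer bytes than requested.

```perl
use strict;
use warnings;
use IO::AIO;
use File::Temp qw(tempfile);

# Hypothetical input and output files, created only for this sketch.
my ($ifh, $src) = tempfile (UNLINK => 1);
my ($ofh, $dst) = tempfile (UNLINK => 1);

# Write some data, then reopen the input file for reading.
print $ifh "x" x 65536;
close $ifh or die "$src: $!";
open $ifh, "<", $src or die "$src: $!";

my $size   = -s $ifh;
my $offset = 0;

# Keep calling sendfile until everything has been copied.
while ($offset < $size) {
    my $copied = IO::AIO::sendfile $ofh, $ifh, $offset, $size - $offset;
    die "sendfile: $!" if $copied < 0;   # -1 signals an error
    last unless $copied;                 # unexpected end of input
    $offset += $copied;
}

print "copied $offset of $size bytes\n";
```

Since the call blocks, it fits simple scripts better than programs that must stay responsive; those would normally use "aio_sendfile" instead.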
+ + IO::AIO::fadvise $fh, $offset, $len, $advice + Simply calls the "posix_fadvise" function (see its manpage for + details). The following advice constants are available: + "IO::AIO::FADV_NORMAL", "IO::AIO::FADV_SEQUENTIAL", + "IO::AIO::FADV_RANDOM", "IO::AIO::FADV_NOREUSE", + "IO::AIO::FADV_WILLNEED", "IO::AIO::FADV_DONTNEED". + + On systems that do not implement "posix_fadvise", this function + returns ENOSYS, otherwise the return value of "posix_fadvise". + + IO::AIO::mmap $scalar, $length, $prot, $flags, $fh[, $offset] + Memory-maps a file (or anonymous memory range) and attaches it to + the given $scalar, which will act like a string scalar. + + The only operations allowed on the scalar are "substr"/"vec" that + don't change the string length, and most read-only operations such + as copying it or searching it with regexes and so on. + + Anything else is unsafe and will, at best, result in memory leaks. + + The memory map associated with the $scalar is automatically removed + when the $scalar is destroyed, or when the "IO::AIO::mmap" or + "IO::AIO::munmap" functions are called. + + This calls the "mmap"(2) function internally. See your system's + manual page for details on the $length, $prot and $flags parameters. + + The $length must be larger than zero and smaller than the actual + filesize. + + $prot is a combination of "IO::AIO::PROT_NONE", + "IO::AIO::PROT_EXEC", "IO::AIO::PROT_READ" and/or + "IO::AIO::PROT_WRITE". + + $flags can be a combination of "IO::AIO::MAP_SHARED" or + "IO::AIO::MAP_PRIVATE", or a number of system-specific flags (when + not available, they are defined as 0): "IO::AIO::MAP_ANONYMOUS" + (which is set to "MAP_ANON" if your system only provides this + constant), "IO::AIO::MAP_HUGETLB", "IO::AIO::MAP_LOCKED", + "IO::AIO::MAP_NORESERVE", "IO::AIO::MAP_POPULATE" or + "IO::AIO::MAP_NONBLOCK". + + If $fh is "undef", then a file descriptor of -1 is passed.
+ + $offset is the offset from the start of the file - it generally must + be a multiple of "IO::AIO::PAGESIZE" and defaults to 0. + + Example: + + use Digest::MD5; + use IO::AIO; + + open my $fh, "io (fd => IO::AIO::poll_fileno, + poll => 'r', + cb => \&IO::AIO::poll_cb); + + # Glib/Gtk2 integration + add_watch Glib::IO IO::AIO::poll_fileno, + in => sub { IO::AIO::poll_cb; 1 }; + + # Tk integration + Tk::Event::IO->fileevent (IO::AIO::poll_fileno, "", + readable => \&IO::AIO::poll_cb); + + # Danga::Socket integration + Danga::Socket->AddOtherFds (IO::AIO::poll_fileno => + \&IO::AIO::poll_cb); FORK BEHAVIOUR + This module should do "the right thing" when the process using it forks: + Before the fork, IO::AIO enters a quiescent state where no requests can be added in other threads and no results will be processed. After the fork the parent simply leaves the quiescent state and continues - request/result processing, while the child clears the request/result - queue (so the requests started before the fork will only be handled in - the parent). Threats will be started on demand until the limit ste in + request/result processing, while the child frees the request/result + queue (so that the requests started before the fork will only be handled + in the parent). Threads will be started on demand until the limit set in the parent process has been reached again. + In short: the parent will, after a short pause, continue as if fork had + not been called, while the child will act as if IO::AIO has not been + used yet. + + MEMORY USAGE + Per-request usage: + + Each aio request uses - depending on your architecture - around 100-200 + bytes of memory. In addition, stat requests need a stat buffer (possibly + a few hundred bytes), readdir requires a result buffer and so on. Perl + scalars and other data passed into aio requests will also be locked and + will consume memory till the request has entered the done state. 
+ + This is not awfully much, so queuing lots of requests is not usually a + problem. + + Per-thread usage: + + In the execution phase, some aio requests require more memory for + temporary buffers, and each thread requires a stack and other data + structures (usually around 16k-128k, depending on the OS). + +KNOWN BUGS + Known bugs will be fixed in the next release. + SEE ALSO - Coro, Linux::AIO. + AnyEvent::AIO for easy integration into event loops, Coro::AIO for a + more natural syntax. AUTHOR Marc Lehmann