--- libev/ev.pod 2008/10/29 14:12:34 1.209
+++ libev/ev.pod 2008/11/05 02:48:45 1.213
@@ -388,27 +388,29 @@
 like O(total_fds) where n is the total number of fds (or the highest fd),
 epoll scales either O(1) or O(active_fds).
 
-The epoll syscalls are the most misdesigned of the more advanced event
-mechanisms: problems include silently dropping fds, requiring a system
-call per change per fd (and unnecessary guessing of parameters), problems
-with dup and so on. The biggest issue is fork races, however - if a
-program forks then I<both> parent and child process have to recreate the
-epoll set, which can take considerable time (one syscall per fd) and is of
-course hard to detect.
+The epoll mechanism deserves honorable mention as the most misdesigned
+of the more advanced event mechanisms: mere annoyances include silently
+dropping file descriptors, requiring a system call per change per file
+descriptor (and unnecessary guessing of parameters), problems with dup and
+so on. The biggest issue is fork races, however - if a program forks then
+I<both> parent and child process have to recreate the epoll set, which can
+take considerable time (one syscall per file descriptor) and is of course
+hard to detect.
 
-Epoll is also notoriously buggy - embedding epoll fds should work, but
-of course doesn't, and epoll just loves to report events for totally
+Epoll is also notoriously buggy - embedding epoll fds I<should> work, but
+of course I<doesn't>, and epoll just loves to report events for totally
 I<different> file descriptors (even already closed ones, so one cannot
 even remove them from the set) than registered in the set (especially
 on SMP systems). Libev tries to counter these spurious notifications by
 employing an additional generation counter and comparing that against the
-events to filter out spurious ones.
+events to filter out spurious ones, recreating the set when required.
 
 While stopping, setting and starting an I/O watcher in the same iteration
-will result in some caching, there is still a system call per such incident
-(because the fd could point to a different file description now), so its
-best to avoid that. Also, C<dup ()>'ed file descriptors might not work
-very well if you register events for both fds.
+will result in some caching, there is still a system call per such
+incident (because the same I<file descriptor> could point to a different
+I<file description> now), so its best to avoid that. Also, C<dup ()>'ed
+file descriptors might not work very well if you register events for both
+file descriptors.
 
 Best performance from this backend is achieved by not unregistering all
 watchers for a file descriptor until it has been closed, if possible,
@@ -418,6 +420,9 @@
 as in libev having to destroy and recreate the epoll object, which can
 take considerable time and thus should be avoided.
 
+All this means that, in practise, C<EVBACKEND_SELECT> is as fast or faster
+then epoll for maybe up to a hundred file descriptors. So sad.
+
 While nominally embeddable in other event loops, this feature is broken in
 all kernel versions tested so far.
 
@@ -426,12 +431,15 @@
 
 =item C<EVBACKEND_KQUEUE> (value 8, most BSD clones)
 
-Kqueue deserves special mention, as at the time of this writing, it was
-broken on all BSDs except NetBSD (usually it doesn't work reliably with
-anything but sockets and pipes, except on Darwin, where of course it's
-completely useless). For this reason it's not being "auto-detected" unless
-you explicitly specify it in the flags (i.e. using C<EVBACKEND_KQUEUE>) or
-libev was compiled on a known-to-be-good (-enough) system like NetBSD.
+Kqueue deserves special mention, as at the time of this writing, it
+was broken on all BSDs except NetBSD (usually it doesn't work reliably
+with anything but sockets and pipes, except on Darwin, where of course
+it's completely useless). Unlike epoll, however, whose brokenness
+is by design, these kqueue bugs can (and eventually will) be fixed
+without API changes to existing programs.
For this reason it's not being
+"auto-detected" unless you explicitly specify it in the flags (i.e. using
+C<EVBACKEND_KQUEUE>) or libev was compiled on a known-to-be-good (-enough)
+system like NetBSD.
 
 You still can embed kqueue into a normal poll or select backend and use it
 only for sockets (after having made sure that sockets work with kqueue on
@@ -1929,10 +1937,11 @@
 it did.
 
 The path does not need to exist: changing from "path exists" to "path does
-not exist" is a status change like any other. The condition "path does
-not exist" is signified by the C<st_nlink> field being zero (which is
-otherwise always forced to be at least one) and all the other fields of
-the stat buffer having unspecified contents.
+not exist" is a status change like any other. The condition "path does not
+exist" (or more correctly "path cannot be stat'ed") is signified by the
+C<st_nlink> field being zero (which is otherwise always forced to be at
+least one) and all the other fields of the stat buffer having unspecified
+contents.
 
 The path I<must not> end in a slash or contain special components such as
 C<.> or C<..>. The path I<should> be absolute: If it is relative and
@@ -1952,9 +1961,9 @@
 resource-intensive.
 
 At the time of this writing, the only OS-specific interface implemented
-is the Linux inotify interface (implementing kqueue support is left as
-an exercise for the reader. Note, however, that the author sees no way
-of implementing C<ev_stat> semantics with kqueue).
+is the Linux inotify interface (implementing kqueue support is left as an
+exercise for the reader. Note, however, that the author sees no way of
+implementing C<ev_stat> semantics with kqueue, except as a hint).
 
 =head3 ABI Issues (Largefile Support)
 
@@ -1975,23 +1984,43 @@
 
 =head3 Inotify and Kqueue
 
-When C<inotify (7)> support has been compiled into libev (generally
-only available with Linux 2.6.25 or above due to bugs in earlier
-implementations) and present at runtime, it will be used to speed up
-change detection where possible.
The inotify descriptor will be created
-lazily when the first C<ev_stat> watcher is being started.
+When C<inotify (7)> support has been compiled into libev and present at
+runtime, it will be used to speed up change detection where possible. The
+inotify descriptor will be created lazily when the first C<ev_stat>
+watcher is being started.
 
 Inotify presence does not change the semantics of C<ev_stat> watchers
 except that changes might be detected earlier, and in some cases, to avoid
 making regular C<stat> calls. Even in the presence of inotify support
 there are many cases where libev has to resort to regular C<stat> polling,
-but as long as the path exists, libev usually gets away without polling.
+but as long as kernel 2.6.25 or newer is used (2.6.24 and older have too
+many bugs), the path exists (i.e. stat succeeds), and the path resides on
+a local filesystem (libev currently assumes only ext2/3, jfs, reiserfs and
+xfs are fully working) libev usually gets away without polling.
 
 There is no support for kqueue, as apparently it cannot be used to
 implement this functionality, due to the requirement of having a file
 descriptor open on the object at all times, and detecting renames, unlinks
 etc. is difficult.
 
+=head3 C<stat ()> is a synchronous operation
+
+Libev doesn't normally do any kind of I/O itself, and so is not blocking
+the process. The exception are C<ev_stat> watchers - those call C<stat ()>,
+which is a synchronous operation.
+
+For local paths, this usually doesn't matter: unless the system is very
+busy or the intervals between stat's are large, a stat call will be fast,
+as the path data is suually in memory already (except when starting the
+watcher).
+
+For networked file systems, calling C<stat ()> can block an indefinite
+time due to network issues, and even under good conditions, a stat call
+often takes multiple milliseconds.
+
+Therefore, it is best to avoid using C<ev_stat> watchers on networked
+paths, although this is fully supported by libev.
+
 =head3 The special problem of stat time resolution
 
 The C<stat ()> system call only supports full-second resolution portably,