…
For few fds, this backend is a little slower than poll and select, but it
scales phenomenally better. While poll and select usually scale like
O(total_fds), where total_fds is the total number of fds (or the highest
fd), epoll scales either O(1) or O(active_fds).

The epoll mechanism deserves honorable mention as the most misdesigned of
the more advanced event mechanisms: mere annoyances include silently
dropping file descriptors, requiring a system call per change per file
descriptor (and unnecessary guessing of parameters), problems with dup and
so on. The biggest issue is fork races, however - if a program forks then
I<both> parent and child process have to recreate the epoll set, which can
take considerable time (one syscall per file descriptor) and is of course
hard to detect.

Epoll is also notoriously buggy - embedding epoll fds I<should> work, but
of course I<doesn't>, and epoll just loves to report events for totally
I<different> file descriptors (even already closed ones, so one cannot
even remove them from the set) than registered in the set (especially
on SMP systems). Libev tries to counter these spurious notifications by
employing an additional generation counter and comparing that against the
events to filter out spurious ones, recreating the set when required.

While stopping, setting and starting an I/O watcher in the same iteration
will result in some caching, there is still a system call per such
incident (because the same I<file descriptor> could point to a different
I<file description> now), so it's best to avoid that. Also, C<dup ()>'ed
file descriptors might not work very well if you register events for both
file descriptors.

Best performance from this backend is achieved by not unregistering all
watchers for a file descriptor until it has been closed, if possible,
i.e. keep at least one watcher active per fd at all times. Stopping and
starting a watcher (without re-setting it) also usually doesn't cause
…
This backend maps C<EV_READ> and C<EV_WRITE> in the same way as
C<EVBACKEND_POLL>.

=item C<EVBACKEND_KQUEUE> (value 8, most BSD clones)

Kqueue deserves special mention, as at the time of this writing, it
was broken on all BSDs except NetBSD (usually it doesn't work reliably
with anything but sockets and pipes, except on Darwin, where of course
it's completely useless). Unlike epoll, however, whose brokenness
is by design, these kqueue bugs can (and eventually will) be fixed
without API changes to existing programs. For this reason it's not being
"auto-detected" unless you explicitly specify it in the flags (i.e. using
C<EVBACKEND_KQUEUE>) or libev was compiled on a known-to-be-good (-enough)
system like NetBSD.

You still can embed kqueue into a normal poll or select backend and use it
only for sockets (after having made sure that sockets work with kqueue on
the target platform). See C<ev_embed> watchers for more info.

…
C<stat> on that path in regular intervals (or when the OS says it changed)
and sees if it changed compared to the last time, invoking the callback if
it did.

The path does not need to exist: changing from "path exists" to "path does
not exist" is a status change like any other. The condition "path does not
exist" (or more correctly "path cannot be stat'ed") is signified by the
C<st_nlink> field being zero (which is otherwise always forced to be at
least one) and all the other fields of the stat buffer having unspecified
contents.

The path I<must not> end in a slash or contain special components such as
C<.> or C<..>. The path I<should> be absolute: If it is relative and
your working directory changes, then the behaviour is undefined.

…
This watcher type is not meant for massive numbers of stat watchers,
as even with OS-supported change notifications, this can be
resource-intensive.

At the time of this writing, the only OS-specific interface implemented
is the Linux inotify interface (implementing kqueue support is left as an
exercise for the reader. Note, however, that the author sees no way of
implementing C<ev_stat> semantics with kqueue, except as a hint).

=head3 ABI Issues (Largefile Support)

Libev by default (unless the user overrides this) uses the default
compilation environment, which means that on systems with large file
…
to exchange stat structures with application programs compiled using the
default compilation environment.

=head3 Inotify and Kqueue

When C<inotify (7)> support has been compiled into libev and present at
runtime, it will be used to speed up change detection where possible. The
inotify descriptor will be created lazily when the first C<ev_stat>
watcher is being started.
Inotify presence does not change the semantics of C<ev_stat> watchers
except that changes might be detected earlier, and in some cases, to avoid
making regular C<stat> calls. Even in the presence of inotify support
there are many cases where libev has to resort to regular C<stat> polling,
but as long as kernel 2.6.25 or newer is used (2.6.24 and older have too
many bugs), the path exists (i.e. stat succeeds), and the path resides on
a local filesystem (libev currently assumes only ext2/3, jfs, reiserfs and
xfs are fully working) libev usually gets away without polling.

There is no support for kqueue, as apparently it cannot be used to
implement this functionality, due to the requirement of having a file
descriptor open on the object at all times, and detecting renames, unlinks
etc. is difficult.

=head3 C<stat ()> is a synchronous operation

Libev doesn't normally do any kind of I/O itself, and so is not blocking
the process. The exceptions are C<ev_stat> watchers - those call C<stat
()>, which is a synchronous operation.

For local paths, this usually doesn't matter: unless the system is very
busy or the intervals between stat's are large, a stat call will be fast,
as the path data is usually in memory already (except when starting the
watcher).

For networked file systems, calling C<stat ()> can block for an indefinite
time due to network issues, and even under good conditions, a stat call
often takes multiple milliseconds.

Therefore, it is best to avoid using C<ev_stat> watchers on networked
paths, although this is fully supported by libev.

=head3 The special problem of stat time resolution

The C<stat ()> system call only supports full-second resolution portably,
and even on systems where the resolution is higher, most file systems