--- libev/ev.pod 2007/11/27 20:38:07 1.55 +++ libev/ev.pod 2008/01/15 04:07:37 1.119 @@ -6,7 +6,7 @@ #include -=head1 EXAMPLE PROGRAM +=head2 EXAMPLE PROGRAM #include @@ -50,8 +50,12 @@ =head1 DESCRIPTION +The newest version of this document is also available as a html-formatted +web page you might find easier to navigate when reading it for the first +time: L. + Libev is an event loop: you register interest in certain events (such as a -file descriptor being readable or a timeout occuring), and it will manage +file descriptor being readable or a timeout occurring), and it will manage these event sources and provide your program with events. To do this, it must take more or less complete control over your process @@ -63,14 +67,15 @@ details of the event, and then hand it over to libev by I the watcher. -=head1 FEATURES +=head2 FEATURES -Libev supports C, C, the Linux-specific C, the +BSD-specific C and the Solaris-specific event port mechanisms +for file descriptor events (C), the Linux C interface +(for C), relative timers (C), absolute timers +with customised rescheduling (C), synchronous signals +(C), process status change events (C), and event +watchers dealing with the event loop mechanism itself (C, C, C and C watchers) as well as file watchers (C) and even limited support for fork events (C). @@ -79,7 +84,7 @@ L comparing it to libevent for example). -=head1 CONVENTIONS +=head2 CONVENTIONS Libev is very configurable. In this manual the default configuration will be described, which supports multiple event loops. For more info about @@ -88,14 +93,16 @@ loops, then all functions taking an initial argument of name C (which is always of type C) will not have this argument. -=head1 TIME REPRESENTATION +=head2 TIME REPRESENTATION Libev represents time as a single floating point number, representing the (fractional) number of seconds since the (POSIX) epoch (somewhere near the beginning of 1970, details are complicated, don't ask). 
This type is called C, which is what you should use too. It usually aliases to the C type in C, and when you need to do any calculations on -it, you should treat it as such. +it, you should treat it as some floating point value. Unlike the name +component C might indicate, it is also used for time differences +throughout libev. =head1 GLOBAL FUNCTIONS @@ -110,18 +117,27 @@ C function is usually faster and also often returns the timestamp you actually want to know. +=item ev_sleep (ev_tstamp interval) + +Sleep for the given interval: The current thread will be blocked until +either it is interrupted or the given time interval has passed. Basically +this is a subsecond-resolution C. + =item int ev_version_major () =item int ev_version_minor () -You can find out the major and minor version numbers of the library +You can find out the major and minor ABI version numbers of the library you linked against by calling the functions C and C. If you want, you can compare against the global symbols C and C, which specify the version of the library your program was compiled against. +These version numbers refer to the ABI version of the library, not the +release version. + Usually, it's a good idea to terminate if the major versions mismatch, -as this indicates an incompatible change. Minor versions are usually +as this indicates an incompatible change. Minor versions are usually compatible with older versions, so a larger minor version alone is usually not a problem. @@ -164,13 +180,14 @@ See the description of C watchers for more info. -=item ev_set_allocator (void *(*cb)(void *ptr, size_t size)) +=item ev_set_allocator (void *(*cb)(void *ptr, long size)) -Sets the allocation function to use (the prototype and semantics are -identical to the realloc C function). It is used to allocate and free -memory (no surprises here). If it returns zero when memory needs to be -allocated, the library might abort or take some potentially destructive -action.
The default is your system realloc function. +Sets the allocation function to use (the prototype is similar - the +semantics are identical - to the realloc C function). It is used to +allocate and free memory (no surprises here). If it returns zero when +memory needs to be allocated, the library might abort or take some +potentially destructive action. The default is your system realloc +function. You could override this function in high-availability programs to, say, free some memory if it cannot allocate memory, to use a special allocator, @@ -245,6 +262,13 @@ If you don't know what event loop to use, use the one returned from this function. +The default loop is the only loop that can handle C and +C watchers, and to do this, it always registers a handler +for C. If this is a problem for your app you can either +create a dynamic loop with C that doesn't do that, or you +can simply overwrite the C signal handler I calling +C. + The flags argument can be used to specify special behaviour or specific backends to use, and is usually specified as C<0> (or C). @@ -266,78 +290,145 @@ useful to try out specific backends to test their performance, or to work around bugs. +=item C + +Instead of calling C or C manually after +a fork, you can also make libev check for a fork in each iteration by +enabling this flag. + +This works by calling C on every iteration of the loop, +and thus this might slow down your event loop if you do a lot of loop +iterations and little real work, but is usually not noticeable (on my +Linux system for example, C is actually a simple 5-insn sequence +without a syscall and thus I fast, but my Linux system also has +C which is even faster). + +The big advantage of this flag is that you can forget about fork (and +forget about forgetting to tell libev about forking) when you use this +flag. + +This flag setting cannot be overridden or specified in the C +environment variable.
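The per-iteration fork check described in this item can be illustrated without libev. The following is a minimal sketch, not libev's actual code: it assumes the call whose name was lost above is C<getpid>, and the names C<loop_state>, C<loop_state_init> and C<loop_check_fork> are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical loop state: only the fields needed for the fork check. */
struct loop_state {
    pid_t curpid;       /* pid seen at creation time or at the last check */
    int   fork_pending; /* nonzero once a pid change has been noticed */
};

/* Remember the pid of the process that created the loop. */
static void loop_state_init(struct loop_state *ls)
{
    assert(ls != NULL);
    ls->curpid = getpid();
    ls->fork_pending = 0;
}

/* Meant to run at the top of every loop iteration.
 * Returns 1 if the pid differs from the remembered one (we are in a
 * forked child), records the new pid and flags the loop; returns 0
 * otherwise. */
static int loop_check_fork(struct loop_state *ls)
{
    pid_t now;

    assert(ls != NULL);
    now = getpid();
    if (now != ls->curpid) {
        ls->curpid = now;
        ls->fork_pending = 1;
        return 1;
    }
    return 0;
}
```

A loop built this way would call loop_check_fork once per iteration and rebuild its kernel state whenever fork_pending is set, so the only recurring cost is one pid lookup per iteration.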
+ =item C (value 1, portable select backend) This is your standard select(2) backend. Not I standard, as libev tries to roll its own fd_set with no limits on the number of fds, but if that fails, expect a fairly low limit on the number of fds when -using this backend. It doesn't scale too well (O(highest_fd)), but its usually -the fastest backend for a low number of fds. +using this backend. It doesn't scale too well (O(highest_fd)), but it's +usually the fastest backend for a low number of (low-numbered :) fds. + +To get good performance out of this backend you need a high degree of +parallelism (most of the file descriptors should be busy). If you are +writing a server, you should C in a loop to accept as many +connections as possible during one iteration. You might also want to have +a look at C to increase the number of +readiness notifications you get per iteration. =item C (value 2, poll backend, available everywhere except on windows) -And this is your standard poll(2) backend. It's more complicated than -select, but handles sparse fds better and has no artificial limit on the -number of fds you can use (except it will slow down considerably with a -lot of inactive fds). It scales similarly to select, i.e. O(total_fds). +And this is your standard poll(2) backend. It's more complicated +than select, but handles sparse fds better and has no artificial +limit on the number of fds you can use (except it will slow down +considerably with a lot of inactive fds). It scales similarly to select, +i.e. O(total_fds). See the entry for C, above, for +performance tips. =item C (value 4, Linux) For few fds, this backend is a little bit slower than poll and select, -but it scales phenomenally better. While poll and select usually scale like -O(total_fds) where n is the total number of fds (or the highest fd), epoll scales -either O(1) or O(active_fds). +but it scales phenomenally better.
While poll and select usually scale +like O(total_fds), where total_fds is the total number of fds (or the highest fd), +epoll scales either O(1) or O(active_fds). The epoll design has a number +of shortcomings, such as silently dropping events in some hard-to-detect +cases and requiring a syscall per fd change, no fork support and bad +support for dup. -While stopping and starting an I/O watcher in the same iteration will -result in some caching, there is still a syscall per such incident +While stopping, setting and starting an I/O watcher in the same iteration +will result in some caching, there is still a syscall per such incident (because the fd could point to a different file description now), so it's -best to avoid that. Also, dup()ed file descriptors might not work very -well if you register events for both fds. +best to avoid that. Also, C'ed file descriptors might not work +very well if you register events for both fds. Please note that epoll sometimes generates spurious notifications, so you need to use non-blocking I/O or other means to avoid blocking when no data (or space) is available. +Best performance from this backend is achieved by not unregistering all +watchers for a file descriptor until it has been closed, if possible, i.e. +keep at least one watcher active per fd at all times. + +While nominally embeddable in other event loops, this feature is broken in +all kernel versions tested so far. + =item C (value 8, most BSD clones) Kqueue deserves special mention, as at the time of this writing, it -was broken on all BSDs except NetBSD (usually it doesn't work with -anything but sockets and pipes, except on Darwin, where of course its -completely useless). For this reason its not being "autodetected" +was broken on all BSDs except NetBSD (usually it doesn't work reliably +with anything but sockets and pipes, except on Darwin, where of course +it's completely useless).
For this reason it's not being "autodetected" unless you specify it explicitly in the flags (i.e. using -C). +C) or libev was compiled on a known-to-be-good (-enough) +system like NetBSD. + +You still can embed kqueue into a normal poll or select backend and use it +only for sockets (after having made sure that sockets work with kqueue on +the target platform). See C watchers for more info. It scales in the same way as the epoll backend, but the interface to the kernel is more efficient (which says nothing about its actual speed, of -course). While starting and stopping an I/O watcher does not cause an -extra syscall as with epoll, it still adds up to four event changes per -incident, so its best to avoid that. +course). While stopping, setting and starting an I/O watcher never +causes an extra syscall as with C, it still adds up to +two event changes per incident. Support for C is very bad, and it +drops fds silently in similarly hard-to-detect cases. + +This backend usually performs well under most conditions. + +While nominally embeddable in other event loops, this doesn't work +everywhere, so you might need to test for this. And since it is broken +almost everywhere, you should only use it when you have a lot of sockets +(for which it usually works), by embedding it into another event loop +(e.g. C or C) and using it only for +sockets. =item C (value 16, Solaris 8) -This is not implemented yet (and might never be). +This is not implemented yet (and might never be, unless you send me an +implementation). According to reports, C only supports sockets +and is not embeddable, which would limit the usefulness of this backend +immensely. =item C (value 32, Solaris 10) -This uses the Solaris 10 port mechanism. As with everything on Solaris, +This uses the Solaris 10 event port mechanism. As with everything on Solaris, it's really slow, but it still scales very well (O(active_fds)).
-Please note that solaris ports can result in a lot of spurious +Please note that Solaris event ports can deliver a lot of spurious notifications, so you need to use non-blocking I/O or other means to avoid blocking when no data (or space) is available. +While this backend scales well, it requires one system call per active +file descriptor per loop iteration. For small and medium numbers of file +descriptors a "slow" C or C backend +might perform better. + +On the positive side, ignoring the spurious readiness notifications, this +backend actually performed to specification in all tests and is fully +embeddable, which is a rare feat among the OS-specific backends. + =item C Try all backends (even potentially broken ones that wouldn't be tried with C). Since this is a mask, you can do stuff such as C. +It is definitely not recommended to use this flag. + =back If one or more of these are ored into the flags value, then only these -backends will be tried (in the reverse order as given here). If none are -specified, most compiled-in backend will be tried, usually in reverse -order of their flag values :) +backends will be tried (in the reverse order as listed here). If none are +specified, all backends in C will be tried. The most typical usage is like this: @@ -375,9 +466,18 @@ sense, so e.g. C might still return true. It is your responsibility to either stop all watchers cleanly yourself I calling this function, or cope with the fact afterwards (which is usually -the easiest thing, youc na just ignore the watchers and/or C them +the easiest thing, you can just ignore the watchers and/or C them for example). +Note that certain global state, such as signal state, will not be freed by +this function, and related watchers (such as signal and child watchers) +would need to be stopped manually. + +In general it is not advisable to call this function except in the +rare occasion where you really need to free e.g. the signal handling +pipe fds.
If you need dynamically allocated loops it is better to use +C and C). + =item ev_loop_destroy (loop) Like C, but destroys an event loop created by an @@ -385,14 +485,16 @@ =item ev_default_fork () -This function reinitialises the kernel state for backends that have -one. Despite the name, you can call it anytime, but it makes most sense -after forking, in either the parent or child process (or both, but that -again makes little sense). - -You I call this function in the child process after forking if and -only if you want to use the event library in both processes. If you just -fork+exec, you don't have to call it. +This function sets a flag that causes subsequent C iterations +to reinitialise the kernel state for backends that have one. Despite the +name, you can call it anytime, but it makes most sense after forking, in +the child process (or both child and parent, but that again makes little +sense). You I call it in the child before using any of the libev +functions, and it will only take effect at the next C iteration. + +On the other hand, you only need to call this function in the child +process if and only if you want to use the event library in the child. If +you just fork+exec, you don't have to call it at all. The function itself is quite fast and it's usually not a problem to call it just in case after a fork. To make this easy, the function will fit in @@ -400,16 +502,22 @@ pthread_atfork (0, 0, ev_default_fork); -At the moment, C and C are safe to use -without calling this function, so if you force one of those backends you -do not need to care. - =item ev_loop_fork (loop) Like C, but acts on an event loop created by C. Yes, you have to call this on every allocated event loop after fork, and how you do this is entirely your own problem. +=item unsigned int ev_loop_count (loop) + +Returns the count of loop iterations for the loop, which is identical to +the number of times libev did poll for new events. 
It starts at C<0> and +happily wraps around with enough iterations. + +This value can sometimes be useful as a generation counter of sorts (it +"ticks" the number of loop iterations), as it roughly corresponds with +C and C calls. + =item unsigned int ev_backend (loop) Returns one of the C flags indicating the event backend in @@ -421,7 +529,7 @@ received events and started processing them. This timestamp does not change as long as callbacks are being processed, and this is also the base time used for relative timers. You can treat it as the timestamp of the -event occuring (or more correctly, libev finding out about it). +event occurring (or more correctly, libev finding out about it). =item ev_loop (loop, int flags) @@ -452,12 +560,17 @@ Here are the gory details of what C does: - * If there are no active watchers (reference count is zero), return. - - Queue prepare watchers and then call all outstanding watchers. + - Before the first iteration, call any pending watchers. + * If EVFLAG_FORKCHECK was used, check for a fork. + - If a fork was detected, queue and call all fork watchers. + - Queue and call all prepare watchers. - If we have been forked, recreate the kernel state. - Update the kernel state with all outstanding changes. - Update the "event loop time". - - Calculate for how long to block. + - Calculate for how long to sleep or block, if at all + (active idle watchers, EVLOOP_NONBLOCK or not having + any active watchers at all will result in not sleeping). + - Sleep if the I/O and timer collect interval say so. - Block the process, waiting for any events. - Queue all outstanding I/O (fd) events. - Update the "event loop time" and do time jump handling. @@ -468,10 +581,11 @@ - Call all queued watchers in reverse order (i.e. check watchers first). Signals and child watchers are implemented as I/O watchers, and will be handled here by queueing them when their watcher gets executed. 
- - If ev_unloop has been called or EVLOOP_ONESHOT or EVLOOP_NONBLOCK - were used, return, otherwise continue with step *. + - If ev_unloop has been called, or EVLOOP_ONESHOT or EVLOOP_NONBLOCK + were used, or there are no active watchers, return, otherwise + continue with step *. -Example: Queue some jobs and then loop until no events are outsanding +Example: Queue some jobs and then loop until no events are outstanding anymore. ... queue jobs here, make sure they register event watchers as long @@ -486,6 +600,8 @@ C, which will make the innermost C call return, or C, which will make all nested C calls return. +This "unloop state" will be cleared when entering C again. + =item ev_ref (loop) =item ev_unref (loop) @@ -499,7 +615,9 @@ visible to the libev user and should not keep C from exiting if no event watchers registered by it are active. It is also an excellent way to do this for generic recurring timers or from within third-party -libraries. Just remember to I and I. +libraries. Just remember to I and I +(but only if the watcher wasn't active before, or was active before, +respectively). Example: Create a signal watcher, but keep it from keeping C running when nothing else is active. @@ -514,6 +632,42 @@ ev_ref (loop); ev_signal_stop (loop, &exitsig); +=item ev_set_io_collect_interval (loop, ev_tstamp interval) + +=item ev_set_timeout_collect_interval (loop, ev_tstamp interval) + +These advanced functions influence the time that libev will spend waiting +for events. Both are by default C<0>, meaning that libev will try to +invoke timer/periodic callbacks and I/O callbacks with minimum latency. + +Setting these to a higher value (the C I be >= C<0>) +allows libev to delay invocation of I/O and timer/periodic callbacks to +increase efficiency of loop iterations. + +The background is that sometimes your program runs just fast enough to +handle one (or very few) event(s) per loop iteration. 
While this makes +the program responsive, it also wastes a lot of CPU time to poll for new +events, especially with backends like C. =item EV_USE_SELECT @@ -1980,6 +2510,14 @@ it is assumed that all these functions actually work on fds, even on win32. Should not be defined on non-win32 platforms. +=item EV_FD_TO_WIN32_HANDLE + +If C is enabled, then libev needs a way to map +file descriptors to socket handles. When not defining this symbol (the +default), then libev will call C<_get_osfhandle>, which is usually +correct. In some cases, programs use their own file descriptor management, +in which case they can provide this function to map fds to socket handles. + =item EV_USE_POLL If defined to be C<1>, libev will compile in support for the C(2) @@ -2016,11 +2554,17 @@ reserved for future expansion, works like the USE symbols above. +=item EV_USE_INOTIFY + +If defined to be C<1>, libev will compile in support for the Linux inotify +interface to speed up C watchers. Its actual availability will +be detected at runtime. + =item EV_H The name of the F header file used to include it. The default if -undefined is C<< >> in F and C<"ev.h"> in F. This -can be used to virtually rename the F header file in case of conflicts. +undefined is C<"ev.h"> in F, F and F. This can be +used to virtually rename the F header file in case of conflicts. =item EV_CONFIG_H @@ -2031,7 +2575,7 @@ =item EV_EVENT_H Similarly to C, this macro can be used to override F's idea -of how the F header can be found. +of how the F header can be found, the default is C<"event.h">. =item EV_PROTOTYPES @@ -2048,12 +2592,35 @@ for multiple event loops and there is no first event loop pointer argument. Instead, all functions act on the single default loop. +=item EV_MINPRI + +=item EV_MAXPRI + +The range of allowed priorities. C must be smaller or equal to +C, but otherwise there are no non-obvious limitations. 
You can +provide for more priorities by overriding those symbols (usually defined +to be C<-2> and C<2>, respectively). + +When doing priority-based operations, libev usually has to linearly search +all the priorities, so having many of them (hundreds) uses a lot of space +and time, so using the defaults of five priorities (-2 .. +2) is usually +fine. + +If your embedding app does not need any priorities, defining these both to +C<0> will save some memory and cpu. + =item EV_PERIODIC_ENABLE If undefined or defined to be C<1>, then periodic timers are supported. If defined to be C<0>, then they are not. Disabling them saves a few kB of code. +=item EV_IDLE_ENABLE + +If undefined or defined to be C<1>, then idle watchers are supported. If +defined to be C<0>, then they are not. Disabling them saves a few kB of +code. + =item EV_EMBED_ENABLE If undefined or defined to be C<1>, then embed watchers are supported. If @@ -2080,7 +2647,15 @@ C watchers use a small hash table to distribute workload by pid. The default size is C<16> (or C<1> with C), usually more than enough. If you need to manage thousands of children you might want to -increase this value. +increase this value (I be a power of two). + +=item EV_INOTIFY_HASHSIZE + +C watchers use a small hash table to distribute workload by +inotify watch id. The default size is C<16> (or C<1> with C), +usually more than enough. If you need to manage thousands of C +watchers you might want to increase this value (I be a power of +two). =item EV_COMMON @@ -2103,11 +2678,36 @@ Can be used to change the callback member declaration in each watcher, and the way callbacks are invoked and set. Must expand to a struct member -definition and a statement, respectively. See the F header file for +definition and a statement, respectively. See the F header file for their default definitions. 
One possible use for overriding these is to avoid the C as first argument in all cases, or to use method calls instead of plain function calls in C++. +=head2 EXPORTED API SYMBOLS + +If you need to re-export the API (e.g. via a dll) and you need a list of +exported symbols, you can use the provided F files which list +all public symbols, one per line: + + Symbols.ev for libev proper + Symbols.event for the libevent emulation + +This can also be used to rename all public symbols to avoid clashes with +multiple versions of libev linked together (which is obviously bad in +itself, but sometimes it is inconvenient to avoid this). + +A sed command like this will create wrapper C<#define>'s that you need to +include before including F: + + wrap.h + +This would create a file F which essentially looks like this: + + #define ev_backend myprefix_ev_backend + #define ev_check_start myprefix_ev_check_start + #define ev_check_stop myprefix_ev_check_stop + ... + =head2 EXAMPLES For a real-world example of a program that includes libev @@ -2119,12 +2719,17 @@ file. The usage in rxvt-unicode is simpler. It has a F header file -that everybody includes and which overrides some autoconf choices: +that everybody includes and which overrides some configure choices: + #define EV_MINIMAL 1 #define EV_USE_POLL 0 #define EV_MULTIPLICITY 0 - #define EV_PERIODICS 0 + #define EV_PERIODIC_ENABLE 0 + #define EV_STAT_ENABLE 0 + #define EV_FORK_ENABLE 0 #define EV_CONFIG_H + #define EV_MINPRI 0 + #define EV_MAXPRI 0 #include "ev++.h" @@ -2140,23 +2745,123 @@ libev will be explained. For complexity discussions about backends see the documentation for C.
+All of the following are about amortised time: If an array needs to be +extended, libev needs to realloc and move the whole array, but this +happens asymptotically never with a higher number of elements, so O(1) might +mean that it does a lengthy realloc operation in rare cases, but on average +it is much faster and asymptotically approaches constant time. + =over 4 =item Starting and stopping timer/periodic watchers: O(log skipped_other_timers) -=item Changing timer/periodic watchers (by autorepeat, again): O(log skipped_other_timers) +This means that, when you have a watcher that triggers in one hour and +there are 100 watchers that would trigger before that, then inserting will +have to skip roughly seven (C) of these watchers. + +=item Changing timer/periodic watchers (by autorepeat or calling again): O(log skipped_other_timers) + +That means that changing a timer costs less than removing/adding it +as only the relative motion in the event queue has to be paid for. =item Starting io/check/prepare/idle/signal/child watchers: O(1) +These just add the watcher into an array or at the head of a list. + =item Stopping check/prepare/idle watchers: O(1) -=item Stopping an io/signal/child watcher: O(number_of_watchers_for_this_(fd/signal/pid % 16)) +=item Stopping an io/signal/child watcher: O(number_of_watchers_for_this_(fd/signal/pid % EV_PID_HASHSIZE)) -=item Finding the next timer per loop iteration: O(1) +These watchers are stored in lists that then need to be walked to find the +correct watcher to remove. The lists are usually short (you don't usually +have many watchers waiting for the same fd or signal). + +=item Finding the next timer in each loop iteration: O(1) + +By virtue of using a binary heap, the next timer is always found at the +beginning of the storage array.
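To make the heap and amortised-growth claims above concrete, here is a minimal sketch of a binary min-heap of expiry times stored in a growable array. It is an illustration under stated assumptions, not libev's implementation: the names C<timer_heap>, C<heap_push>, C<heap_next> and C<heap_free> are hypothetical, the capacity-doubling policy is an assumption, and real timers would carry a watcher pointer rather than a bare timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical timer heap: expiry times kept as a binary min-heap in a
 * growable array, so the earliest expiry always sits at index 0. */
struct timer_heap {
    double *at;  /* expiry timestamps in heap order */
    size_t  len; /* number of timers stored */
    size_t  cap; /* number of allocated slots */
};

/* Insert one expiry time. Returns 0 on success, -1 if realloc fails.
 * Growing by doubling is an assumption made for this sketch; it keeps
 * the realloc cost amortised O(1) per insertion. */
static int heap_push(struct timer_heap *h, double when)
{
    size_t i;

    assert(h != NULL);
    if (h->len == h->cap) {
        size_t ncap = h->cap ? h->cap * 2 : 4;
        double *p = realloc(h->at, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        h->at = p;
        h->cap = ncap;
    }

    /* Sift up: move larger parents down until the new time fits. */
    i = h->len++;
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->at[parent] <= when)
            break;
        h->at[i] = h->at[parent];
        i = parent;
    }
    h->at[i] = when;
    return 0;
}

/* Earliest expiry time: a single array read, hence O(1). */
static double heap_next(const struct timer_heap *h)
{
    assert(h != NULL && h->len > 0);
    return h->at[0];
}

/* Release the storage and reset the heap to its empty state. */
static void heap_free(struct timer_heap *h)
{
    assert(h != NULL);
    free(h->at);
    h->at = NULL;
    h->len = 0;
    h->cap = 0;
}
```

Starting from an empty heap (all fields zero), pushing five timers triggers two reallocations, the first to 4 slots and the second to 8, while every lookup of the next timer stays a single read of slot 0.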
=item Each change on a file descriptor per loop iteration: O(number_of_watchers_for_this_fd) -=item Activating one watcher: O(1) +A change means an I/O watcher gets started or stopped, which requires +libev to recalculate its status (and possibly tell the kernel, depending +on backend and whether C was used). + +=item Activating one watcher (putting it into the pending state): O(1) + +=item Priority handling: O(number_of_priorities) + +Priorities are implemented by allocating some space for each +priority. When doing priority-based operations, libev usually has to +linearly search all the priorities, but starting/stopping and activating +watchers becomes O(1) w.r.t. priority handling. + +=back + + +=head1 Win32 platform limitations and workarounds + +Win32 doesn't support any of the standards (e.g. POSIX) that libev +requires, and its I/O model is fundamentally incompatible with the POSIX +model. Libev still offers limited functionality on this platform in +the form of the C backend, and only supports socket +descriptors. This only applies when using Win32 natively, not when using +e.g. cygwin. + +There is no supported compilation method available on windows except +embedding it into other applications. + +Due to the many, low, and arbitrary limits on the win32 platform and the +abysmal performance of winsockets, using a large number of sockets is not +recommended (and not reasonable). If your program needs to use more than +a hundred or so sockets, then likely it needs to use a totally different +implementation for windows, as libev offers the POSIX model, which cannot +be implemented efficiently on windows (microsoft monopoly games). + +=over 4 + +=item The winsocket select function + +The winsocket C