--- libev/ev.pod 2007/11/12 08:45:49 1.14 +++ libev/ev.pod 2020/01/22 01:50:42 1.459 @@ -1,15 +1,99 @@ +=encoding utf-8 + =head1 NAME libev - a high performance full-featured event loop written in C =head1 SYNOPSIS - #include + #include + +=head2 EXAMPLE PROGRAM + + // a single header file is required + #include + + #include // for puts + + // every watcher type has its own typedef'd struct + // with the name ev_TYPE + ev_io stdin_watcher; + ev_timer timeout_watcher; + + // all watcher callbacks have a similar signature + // this callback is called when data is readable on stdin + static void + stdin_cb (EV_P_ ev_io *w, int revents) + { + puts ("stdin ready"); + // for one-shot events, one must manually stop the watcher + // with its corresponding stop function. + ev_io_stop (EV_A_ w); + + // this causes all nested ev_run's to stop iterating + ev_break (EV_A_ EVBREAK_ALL); + } + + // another callback, this time for a time-out + static void + timeout_cb (EV_P_ ev_timer *w, int revents) + { + puts ("timeout"); + // this causes the innermost ev_run to stop iterating + ev_break (EV_A_ EVBREAK_ONE); + } + + int + main (void) + { + // use the default event loop unless you have special needs + struct ev_loop *loop = EV_DEFAULT; + + // initialise an io watcher, then start it + // this one will watch for stdin to become readable + ev_io_init (&stdin_watcher, stdin_cb, /*STDIN_FILENO*/ 0, EV_READ); + ev_io_start (loop, &stdin_watcher); + + // initialise a timer watcher, then start it + // simple non-repeating 5.5 second timeout + ev_timer_init (&timeout_watcher, timeout_cb, 5.5, 0.); + ev_timer_start (loop, &timeout_watcher); + + // now wait for events to arrive + ev_run (loop, 0); + + // break was called, so exit + return 0; + } + +=head1 ABOUT THIS DOCUMENT -=head1 DESCRIPTION +This document documents the libev software package. + +The newest version of this document is also available as an html-formatted +web page you might find easier to navigate when reading it for the first +time: L. + +While this document tries to be as complete as possible in documenting +libev, its usage and the rationale behind its design, it is not a tutorial +on event-based programming, nor will it introduce event-based programming +with libev. + +Familiarity with event based programming techniques in general is assumed +throughout this document. + +=head1 WHAT TO READ WHEN IN A HURRY + +This manual tries to be very detailed, but unfortunately, this also makes +it very long. If you just want to know the basics of libev, I suggest +reading L, then the L above and +look up the missing functions in L and the C and +C sections in L. + +=head1 ABOUT LIBEV Libev is an event loop: you register interest in certain events (such as a -file descriptor being readable or a timeout occuring), and it will manage +file descriptor being readable or a timeout occurring), and it will manage these event sources and provide your program with events. To do this, it must take more or less complete control over your process @@ -21,108 +105,311 @@ details of the event, and then hand it over to libev by I the watcher. -=head1 FEATURES +=head2 FEATURES + +Libev supports C (files, many character devices...). + +Epoll is truly the train wreck among event poll mechanisms, a frankenpoll, +cobbled together in a hurry, no thought to design or interaction with +others. Oh, the pain, will it ever stop... + +While stopping, setting and starting an I/O watcher in the same iteration +will result in some caching, there is still a system call per such +incident (because the same I could point to a different +I now), so its best to avoid that. Also, C'ed +file descriptors might not work very well if you register events for both +file descriptors. + +Best performance from this backend is achieved by not unregistering all +watchers for a file descriptor until it has been closed, if possible, +i.e. keep at least one watcher active per fd at all times. Stopping and +starting a watcher (without re-setting it) also usually doesn't cause +extra overhead. A fork can both result in spurious notifications as well +as in libev having to destroy and recreate the epoll object, which can +take considerable time and thus should be avoided. + +All this means that, in practice, C can be as fast or +faster than epoll for maybe up to a hundred file descriptors, depending on +the usage. So sad. + +While nominally embeddable in other event loops, this feature is broken in +a lot of kernel revisions, but probably(!) works in current versions. + +This backend maps C and C in the same way as +C. + +=item C (value 64, Linux) + +Use the Linux-specific Linux AIO (I C<< aio(7) >> but C<< +io_submit(2) >>) event interface available in post-4.18 kernels (but libev +only tries to use it in 4.19+). + +This is another Linux train wreck of an event interface. + +If this backend works for you (as of this writing, it was very +experimental), it is the best event interface available on Linux and might +be well worth enabling it - if it isn't available in your kernel this will +be detected and this backend will be skipped. + +This backend can batch oneshot requests and supports a user-space ring +buffer to receive events. It also doesn't suffer from most of the design +problems of epoll (such as not being able to remove event sources from +the epoll set), and generally sounds too good to be true. Because, this +being the Linux kernel, of course it suffers from a whole new set of +limitations, forcing you to fall back to epoll, inheriting all its design +issues. + +For one, it is not easily embeddable (but probably could be done using +an event fd at some extra overhead). It also is subject to a system wide +limit that can be configured in F. If no AIO +requests are left, this backend will be skipped during initialisation, and +will switch to epoll when the loop is active. + +Most problematic in practice, however, is that not all file descriptors +work with it. For example, in Linux 5.1, TCP sockets, pipes, event fds, +files, F and many others are supported, but ttys do not work +properly (a known bug that the kernel developers don't care about, see +L), so this is not +(yet?) a generic event polling interface. + +Overall, it seems the Linux developers just don't want it to have a +generic event handling mechanism other than C which have a high +overhead for the actual polling but can deliver many events at once. + +By setting a higher I you allow libev to spend more +time collecting I/O events, so you can handle more events per iteration, +at the cost of increasing latency. Timeouts (both C and +C) will not be affected. Setting this to a non-null value will +introduce an additional C call into most loop iterations. The +sleep time ensures that libev will not poll for I/O events more often then +once per this interval, on average (as long as the host time resolution is +good enough). + +Likewise, by setting a higher I you allow libev +to spend more time collecting timeouts, at the expense of increased +latency/jitter/inexactness (the watcher callback will be called +later). C watchers will not be affected. Setting this to a non-null +value will not introduce any overhead in libev. + +Many (busy) programs can usually benefit by setting the I/O collect +interval to a value near C<0.1> or so, which is often enough for +interactive servers (of course not for games), likewise for timeouts. It +usually doesn't make much sense to set it to a lower value than C<0.01>, +as this approaches the timing granularity of most systems. Note that if +you do transactions with the outside world and you can't increase the +parallelity, then this setting will limit your transaction rate (if you +need to poll once per transaction and the I/O collect interval is 0.01, +then you can't do more than 100 transactions per second). + +Setting the I can improve the opportunity for +saving power, as the program will "bundle" timer callback invocations that +are "near" in time together, by delaying some, thus reducing the number of +times the process sleeps and wakes up again. Another useful technique to +reduce iterations/wake-ups is to use C watchers and make sure +they fire on, say, one-second boundaries only. + +Example: we only need 0.1s timeout granularity, and we wish not to poll +more often than 100 times per second: + + ev_set_timeout_collect_interval (EV_DEFAULT_UC_ 0.1); + ev_set_io_collect_interval (EV_DEFAULT_UC_ 0.01); + +=item ev_invoke_pending (loop) + +This call will simply invoke all pending watchers while resetting their +pending state. Normally, C does this automatically when required, +but when overriding the invoke callback this call comes handy. This +function can be invoked from a watcher - this can be useful for example +when you want to do some lengthy calculation and want to pass further +event handling to another thread (you still have to make sure only one +thread executes within C or C of course). + +=item int ev_pending_count (loop) + +Returns the number of pending watchers - zero indicates that no watchers +are pending. + +=item ev_set_invoke_pending_cb (loop, void (*invoke_pending_cb)(EV_P)) + +This overrides the invoke pending functionality of the loop: Instead of +invoking all pending watchers when there are any, C will call +this callback instead. This is useful, for example, when you want to +invoke the actual watchers inside another context (another thread etc.). + +If you want to reset the callback, use C as new +callback. + +=item ev_set_loop_release_cb (loop, void (*release)(EV_P) throw (), void (*acquire)(EV_P) throw ()) + +Sometimes you want to share the same loop between multiple threads. This +can be done relatively simply by putting mutex_lock/unlock calls around +each call to a libev function. + +However, C can run an indefinite time, so it is not feasible +to wait for it to return. One way around this is to wake up the event +loop via C and C, another way is to set these +I and I callbacks on the loop. + +When set, then C will be called just before the thread is +suspended waiting for new events, and C is called just +afterwards. + +Ideally, C will just call your mutex_unlock function, and +C will just call the mutex_lock function again. + +While event loop modifications are allowed between invocations of +C and C (that's their only purpose after all), no +modifications done will affect the event loop, i.e. adding watchers will +have no effect on the set of file descriptors being watched, or the time +waited. Use an C watcher to wake up C when you want it +to take note of any changes you made. + +In theory, threads executing C will be async-cancel safe between +invocations of C and C. + +See also the locking example in the C section later in this +document. + +=item ev_set_userdata (loop, void *data) + +=item void *ev_userdata (loop) + +Set and retrieve a single C associated with a loop. When +C has never been called, then C returns +C<0>. + +These two functions can be used to associate arbitrary data with a loop, +and are intended solely for the C, C and +C callbacks described above, but of course can be (ab-)used for +any other purpose as well. + +=item ev_verify (loop) + +This function only does something when C support has been +compiled in, which is the default for non-minimal builds. It tries to go +through all internal structures and checks them for validity. If anything +is found to be inconsistent, it will print an error message to standard +error and call C. + +This can be used to catch bugs inside libev itself: under normal +circumstances, this function will never abort as of course libev keeps its +data structures consistent. =back + =head1 ANATOMY OF A WATCHER -A watcher is a structure that you create and register to record your -interest in some event. For instance, if you want to wait for STDIN to -become readable, you would create an C watcher for that: +In the following description, uppercase C in names stands for the +watcher type, e.g. C can mean C for timer +watchers and C for I/O watchers. + +A watcher is an opaque structure that you allocate and register to record +your interest in some event. To make a concrete example, imagine you want +to wait for STDIN to become readable, you would create an C watcher +for that: - static void my_cb (struct ev_loop *loop, struct ev_io *w, int revents) - { - ev_io_stop (w); - ev_unloop (loop, EVUNLOOP_ALL); - } + static void my_cb (struct ev_loop *loop, ev_io *w, int revents) + { + ev_io_stop (w); + ev_break (loop, EVBREAK_ALL); + } + + struct ev_loop *loop = ev_default_loop (0); - struct ev_loop *loop = ev_default_loop (0); - struct ev_io stdin_watcher; - ev_init (&stdin_watcher, my_cb); - ev_io_set (&stdin_watcher, STDIN_FILENO, EV_READ); - ev_io_start (loop, &stdin_watcher); - ev_loop (loop, 0); + ev_io stdin_watcher; + + ev_init (&stdin_watcher, my_cb); + ev_io_set (&stdin_watcher, STDIN_FILENO, EV_READ); + ev_io_start (loop, &stdin_watcher); + + ev_run (loop, 0); As you can see, you are responsible for allocating the memory for your -watcher structures (and it is usually a bad idea to do this on the stack, -although this can sometimes be quite valid). +watcher structures (and it is I a bad idea to do this on the +stack). + +Each watcher has an associated watcher structure (called C +or simply C, as typedefs are provided for all watcher structs). -Each watcher structure must be initialised by a call to C, which expects a callback to be provided. This -callback gets invoked each time the event occurs (or, in the case of io -watchers, each time the event loop detects that the file descriptor given -is readable and/or writable). - -Each watcher type has its own C<< ev__set (watcher *, ...) >> macro -with arguments specific to this watcher type. There is also a macro -to combine initialisation and setting in one call: C<< ev__init -(watcher *, callback, ...) >>. +Each watcher structure must be initialised by a call to C, which expects a callback to be provided. This callback is +invoked each time the event occurs (or, in the case of I/O watchers, each +time the event loop detects that the file descriptor given is readable +and/or writable). + +Each watcher type further has its own C<< ev_TYPE_set (watcher *, ...) >> +macro to configure it, with arguments specific to the watcher type. There +is also a macro to combine initialisation and setting in one call: C<< +ev_TYPE_init (watcher *, callback, ...) >>. To make the watcher actually watch out for events, you have to start it -with a watcher-specific start function (C<< ev__start (loop, watcher +with a watcher-specific start function (C<< ev_TYPE_start (loop, watcher *) >>), and you can stop watching for events at any time by calling the -corresponding stop function (C<< ev__stop (loop, watcher *) >>. +corresponding stop function (C<< ev_TYPE_stop (loop, watcher *) >>. As long as your watcher is active (has been started but not stopped) you must not touch the values stored in it. Most specifically you must never -reinitialise it or call its set method. - -You can check whether an event is active by calling the C macro. To see whether an event is outstanding (but the -callback for it has not been called yet) you can use the C macro. +reinitialise it or call its C macro. Each and every callback receives the event loop pointer as first, the registered watcher structure as second, and a bitset of received events as @@ -323,7 +1240,7 @@ The file descriptor in the C watcher has become readable and/or writable. -=item C +=item C The C watcher has timed out. @@ -339,6 +1256,10 @@ The pid specified in the C watcher has received a status change. +=item C + +The path specified in the C watcher changed its attributes somehow. + =item C The C watcher has determined that you have nothing better to do. @@ -347,86 +1268,560 @@ =item C -All C watchers are invoked just I C starts -to gather new events, and all C watchers are invoked just after -C has gathered them, but before it invokes any callbacks for any -received events. Callbacks of both watcher types can start and stop as -many watchers as they want, and all of them will be taken into account -(for example, a C watcher might start an idle watcher to keep -C from blocking). +All C watchers are invoked just I C starts to +gather new events, and all C watchers are queued (not invoked) +just after C has gathered them, but before it queues any callbacks +for any received events. That means C watchers are the last +watchers invoked before the event loop sleeps or polls for new events, and +C watchers will be invoked before any other watchers of the same +or lower priority within an event loop iteration. + +Callbacks of both watcher types can start and stop as many watchers as +they want, and all of them will be taken into account (for example, a +C watcher might start an idle watcher to keep C from +blocking). + +=item C + +The embedded event loop specified in the C watcher needs attention. + +=item C + +The event loop has been resumed in the child process after fork (see +C). + +=item C + +The event loop is about to be destroyed (see C). + +=item C + +The given async watcher has been asynchronously notified (see C). + +=item C + +Not ever sent (or otherwise used) by libev itself, but can be freely used +by libev users to signal watchers (e.g. via C). =item C -An unspecified error has occured, the watcher has been stopped. This might +An unspecified error has occurred, the watcher has been stopped. This might happen because the watcher could not be properly started because libev ran out of memory, a file descriptor was found to be closed or any other -problem. You best act on it by reporting the problem and somehow coping -with the watcher being stopped. +problem. Libev considers these application bugs. -Libev will usually signal a few "dummy" events together with an error, -for example it might indicate that a fd is readable or writable, and if -your callbacks is well-written it can just attempt the operation and cope -with the error from read() or write(). This will not work in multithreaded -programs, though, so beware. +You best act on it by reporting the problem and somehow coping with the +watcher being stopped. Note that well-written programs should not receive +an error ever, so when your watcher receives it, this usually indicates a +bug in your program. + +Libev will usually signal a few "dummy" events together with an error, for +example it might indicate that a fd is readable or writable, and if your +callbacks is well-written it can just attempt the operation and cope with +the error from read() or write(). This will not work in multi-threaded +programs, though, as the fd could already be closed and reused for another +thing, so beware. =back -=head2 ASSOCIATING CUSTOM DATA WITH A WATCHER +=head2 GENERIC WATCHER FUNCTIONS -Each watcher has, by default, a member C that you can change -and read at any time, libev will completely ignore it. This can be used -to associate arbitrary data with your watcher. If you need more data and -don't want to allocate memory and store a pointer to it in that data -member, you can also "subclass" the watcher type and provide your own -data: +=over 4 - struct my_io - { - struct ev_io io; - int otherfd; - void *somedata; - struct whatever *mostinteresting; - } +=item C (ev_TYPE *watcher, callback) -And since your callback will be called with a pointer to the watcher, you -can cast it back to your own type: +This macro initialises the generic portion of a watcher. The contents +of the watcher object can be arbitrary (so C will do). Only +the generic parts of the watcher are initialised, you I to call +the type-specific C macro afterwards to initialise the +type-specific parts. For each type there is also a C macro +which rolls both calls into one. - static void my_cb (struct ev_loop *loop, struct ev_io *w_, int revents) - { - struct my_io *w = (struct my_io *)w_; - ... - } +You can reinitialise a watcher at any time as long as it has been stopped +(or never started) and there are no pending events outstanding. + +The callback is always of type C. + +Example: Initialise an C watcher in two steps. + + ev_io w; + ev_init (&w, my_cb); + ev_io_set (&w, STDIN_FILENO, EV_READ); + +=item C (ev_TYPE *watcher, [args]) + +This macro initialises the type-specific parts of a watcher. You need to +call C at least once before you call this macro, but you can +call C any number of times. You must not, however, call this +macro on a watcher that is active (it can be pending, however, which is a +difference to the C macro). + +Although some watcher types do not have type-specific arguments +(e.g. C) you still need to call its C macro. + +See C, above, for an example. + +=item C (ev_TYPE *watcher, callback, [args]) + +This convenience macro rolls both C and C macro +calls into a single call. This is the most convenient method to initialise +a watcher. The same limitations apply, of course. + +Example: Initialise and set an C watcher in one step. + + ev_io_init (&w, my_cb, STDIN_FILENO, EV_READ); + +=item C (loop, ev_TYPE *watcher) + +Starts (activates) the given watcher. Only active watchers will receive +events. If the watcher is already active nothing will happen. + +Example: Start the C watcher that is being abused as example in this +whole section. + + ev_io_start (EV_DEFAULT_UC, &w); + +=item C (loop, ev_TYPE *watcher) + +Stops the given watcher if active, and clears the pending status (whether +the watcher was active or not). + +It is possible that stopped watchers are pending - for example, +non-repeating timers are being stopped when they become pending - but +calling C ensures that the watcher is neither active nor +pending. If you want to free or reuse the memory used by the watcher it is +therefore a good idea to always call its C function. -More interesting and less C-conformant ways of catsing your callback type -have been omitted.... +=item bool ev_is_active (ev_TYPE *watcher) + +Returns a true value iff the watcher is active (i.e. it has been started +and not yet been stopped). As long as a watcher is active you must not modify +it. + +=item bool ev_is_pending (ev_TYPE *watcher) + +Returns a true value iff the watcher is pending, (i.e. it has outstanding +events but its callback has not yet been invoked). As long as a watcher +is pending (but not active) you must not call an init function on it (but +C is safe), you must not change its priority, and you must +make sure the watcher is available to libev (e.g. you cannot C +it). + +=item callback ev_cb (ev_TYPE *watcher) + +Returns the callback currently set on the watcher. + +=item ev_set_cb (ev_TYPE *watcher, callback) + +Change the callback. You can change the callback at virtually any time +(modulo threads). + +=item ev_set_priority (ev_TYPE *watcher, int priority) + +=item int ev_priority (ev_TYPE *watcher) + +Set and query the priority of the watcher. The priority is a small +integer between C (default: C<2>) and C +(default: C<-2>). Pending watchers with higher priority will be invoked +before watchers with lower priority, but priority will not keep watchers +from being executed (except for C watchers). + +If you need to suppress invocation when higher priority events are pending +you need to look at C watchers, which provide this functionality. + +You I change the priority of a watcher as long as it is active or +pending. + +Setting a priority outside the range of C to C is +fine, as long as you do not mind that the priority value you query might +or might not have been clamped to the valid range. + +The default priority used by watchers when no priority has been set is +always C<0>, which is supposed to not be too high and not be too low :). + +See L, below, for a more thorough treatment of +priorities. + +=item ev_invoke (loop, ev_TYPE *watcher, int revents) + +Invoke the C with the given C and C. Neither +C nor C need to be valid as long as the watcher callback +can deal with that fact, as both are simply passed through to the +callback. + +=item int ev_clear_pending (loop, ev_TYPE *watcher) + +If the watcher is pending, this function clears its pending status and +returns its C bitset (as if its callback was invoked). If the +watcher isn't pending it does nothing and returns C<0>. + +Sometimes it can be useful to "poll" a watcher instead of waiting for its +callback to be invoked, which can be accomplished with this function. + +=item ev_feed_event (loop, ev_TYPE *watcher, int revents) + +Feeds the given event set into the event loop, as if the specified event +had happened for the specified watcher (which must be a pointer to an +initialised but not necessarily started event watcher). Obviously you must +not free the watcher as long as it has pending events. + +Stopping the watcher, letting libev invoke it, or calling +C will clear the pending event, even if the watcher was +not started in the first place. + +See also C and C for related +functions that do not need a watcher. + +=back + +See also the L and L idioms. + +=head2 WATCHER STATES + +There are various watcher states mentioned throughout this manual - +active, pending and so on. In this section these states and the rules to +transition between them will be described in more detail - and while these +rules might look complicated, they usually do "the right thing". + +=over 4 + +=item initialised + +Before a watcher can be registered with the event loop it has to be +initialised. This can be done with a call to C, or calls to +C followed by the watcher-specific C function. + +In this state it is simply some block of memory that is suitable for +use in an event loop. It can be moved around, freed, reused etc. at +will - as long as you either keep the memory contents intact, or call +C again. + +=item started/running/active + +Once a watcher has been started with a call to C it becomes +property of the event loop, and is actively waiting for events. While in +this state it cannot be accessed (except in a few documented ways), moved, +freed or anything else - the only legal thing is to keep a pointer to it, +and call libev functions on it that are documented to work on active watchers. + +=item pending + +If a watcher is active and libev determines that an event it is interested +in has occurred (such as a timer expiring), it will become pending. It will +stay in this pending state until either it is stopped or its callback is +about to be invoked, so it is not normally pending inside the watcher +callback. + +The watcher might or might not be active while it is pending (for example, +an expired non-repeating timer can be pending but no longer active). If it +is stopped, it can be freely accessed (e.g. by calling C), +but it is still property of the event loop at this time, so cannot be +moved, freed or reused. And if it is active the rules described in the +previous item still apply. + +It is also possible to feed an event on a watcher that is not active (e.g. +via C), in which case it becomes pending without being +active. + +=item stopped + +A watcher can be stopped implicitly by libev (in which case it might still +be pending), or explicitly by calling its C function. The +latter will clear any pending state the watcher might be in, regardless +of whether it was active or not, so stopping a watcher explicitly before +freeing it is often a good idea. + +While stopped (and not pending) the watcher is essentially in the +initialised state, that is, it can be reused, moved, modified in any way +you wish (but when you trash the memory block, you need to C +it again). + +=back + +=head2 WATCHER PRIORITY MODELS + +Many event loops support I, which are usually small +integers that influence the ordering of event callback invocation +between watchers in some way, all else being equal. + +In libev, watcher priorities can be set using C. See its +description for the more technical details such as the actual priority +range. + +There are two common ways how these these priorities are being interpreted +by event loops: + +In the more common lock-out model, higher priorities "lock out" invocation +of lower priority watchers, which means as long as higher priority +watchers receive events, lower priority watchers are not being invoked. + +The less common only-for-ordering model uses priorities solely to order +callback invocation within a single event loop iteration: Higher priority +watchers are invoked before lower priority ones, but they all get invoked +before polling for new events. + +Libev uses the second (only-for-ordering) model for all its watchers +except for idle watchers (which use the lock-out model). + +The rationale behind this is that implementing the lock-out model for +watchers is not well supported by most kernel interfaces, and most event +libraries will just poll for the same events again and again as long as +their callbacks have not been executed, which is very inefficient in the +common case of one high-priority watcher locking out a mass of lower +priority ones. + +Static (ordering) priorities are most useful when you have two or more +watchers handling the same resource: a typical usage example is having an +C watcher to receive data, and an associated C to handle +timeouts. Under load, data might be received while the program handles +other jobs, but since timers normally get invoked first, the timeout +handler will be executed before checking for data. In that case, giving +the timer a lower priority than the I/O watcher ensures that I/O will be +handled first even under adverse conditions (which is usually, but not +always, what you want). + +Since idle watchers use the "lock-out" model, meaning that idle watchers +will only be executed when no same or higher priority watchers have +received events, they can be used to implement the "lock-out" model when +required. + +For example, to emulate how many other event libraries handle priorities, +you can associate an C watcher to each such watcher, and in +the normal watcher callback, you just start the idle watcher. The real +processing is done in the idle watcher callback. This causes libev to +continuously poll and process kernel event data for the watcher, but when +the lock-out case is known to be rare (which in turn is rare :), this is +workable. + +Usually, however, the lock-out model implemented that way will perform +miserably under the type of load it was designed to handle. In that case, +it might be preferable to stop the real watcher before starting the +idle watcher, so the kernel will not have to process the event in case +the actual processing will be delayed for considerable time. + +Here is an example of an I/O watcher that should run at a strictly lower +priority than the default, and which should only process data when no +other events are pending: + + ev_idle idle; // actual processing watcher + ev_io io; // actual event watcher + + static void + io_cb (EV_P_ ev_io *w, int revents) + { + // stop the I/O watcher, we received the event, but + // are not yet ready to handle it. + ev_io_stop (EV_A_ w); + + // start the idle watcher to handle the actual event. + // it will not be executed as long as other watchers + // with the default priority are receiving events. + ev_idle_start (EV_A_ &idle); + } + + static void + idle_cb (EV_P_ ev_idle *w, int revents) + { + // actual processing + read (STDIN_FILENO, ...); + + // have to start the I/O watcher again, as + // we have handled the event + ev_io_start (EV_P_ &io); + } + + // initialisation + ev_idle_init (&idle, idle_cb); + ev_io_init (&io, io_cb, STDIN_FILENO, EV_READ); + ev_io_start (EV_DEFAULT_ &io); + +In the "real" world, it might also be beneficial to start a timer, so that +low-priority connections can not be locked out forever under load. This +enables your program to keep a lower latency for important connections +during short periods of high load, while not completely locking out less +important ones. =head1 WATCHER TYPES This section describes each watcher in detail, but will not repeat -information given in the last section. +information given in the last section. Any initialisation/set macros, +functions and members specific to the watcher type are explained. + +Most members are additionally marked with either I<[read-only]>, meaning +that, while the watcher is active, you can look at the member and expect +some sensible content, but you must not modify it (you can modify it while +the watcher is stopped to your hearts content), or I<[read-write]>, which +means you can expect it to have some sensible content while the watcher +is active, but you can also modify it. Modifying it may not do something +sensible or take immediate effect (or do anything at all), but libev will +not crash or malfunction in any way. -=head2 C - is this file descriptor readable or writable +In any case, the documentation for each member will explain what the +effects are, and if there are any additional access restrictions. + +=head2 C - is this file descriptor readable or writable? I/O watchers check whether a file descriptor is readable or writable -in each iteration of the event loop (This behaviour is called -level-triggering because you keep receiving events as long as the -condition persists. Remember you can stop the watcher if you don't want to -act on the event and neither want to receive future events). +in each iteration of the event loop, or, more precisely, when reading +would not block the process and writing would at least be able to write +some data. This behaviour is called level-triggering because you keep +receiving events as long as the condition persists. Remember you can stop +the watcher if you don't want to act on the event and neither want to +receive future events. -In general you can register as many read and/or write event watchers oer +In general you can register as many read and/or write event watchers per fd as you want (as long as you don't confuse yourself). Setting all file descriptors to non-blocking mode is also usually a good idea (but not required if you know what you are doing). -You have to be careful with dup'ed file descriptors, though. Some backends -(the linux epoll backend is a notable example) cannot handle dup'ed file -descriptors correctly if you register interest in two or more fds pointing -to the same file/socket etc. description. - -If you must do this, then force the use of a known-to-be-good backend -(at the time of this writing, this includes only EVMETHOD_SELECT and -EVMETHOD_POLL). +Another thing you have to watch out for is that it is quite easy to +receive "spurious" readiness notifications, that is, your callback might +be called with C but a subsequent C(2) will actually block +because there is no data. It is very easy to get into this situation even +with a relatively standard program structure. Thus it is best to always +use non-blocking I/O: An extra C(2) returning C is far +preferable to a program hanging until some data arrives. + +If you cannot run the fd in non-blocking mode (for example you should +not play around with an Xlib connection), then you have to separately +re-test whether a file descriptor is really ready with a known-to-be good +interface such as poll (fortunately in the case of Xlib, it already does +this on its own, so its quite safe to use). Some people additionally +use C and an interval timer, just to be sure you won't block +indefinitely. + +But really, best use non-blocking mode. + +=head3 The special problem of disappearing file descriptors + +Some backends (e.g. kqueue, epoll, linuxaio) need to be told about closing +a file descriptor (either due to calling C explicitly or any other +means, such as C). The reason is that you register interest in some +file descriptor, but when it goes away, the operating system will silently +drop this interest. If another file descriptor with the same number then +is registered with libev, there is no efficient way to see that this is, +in fact, a different file descriptor. + +To avoid having to explicitly tell libev about such cases, libev follows +the following policy: Each time C is being called, libev +will assume that this is potentially a new file descriptor, otherwise +it is assumed that the file descriptor stays the same. That means that +you I to call C (or C) when you change the +descriptor even if the file descriptor number itself did not change. + +This is how one would do it normally anyway, the important point is that +the libev application should not optimise around libev but should leave +optimisations to libev. + +=head3 The special problem of dup'ed file descriptors + +Some backends (e.g. epoll), cannot register events for file descriptors, +but only events for the underlying file descriptions. That means when you +have C'ed file descriptors or weirder constellations, and register +events for them, only one file descriptor might actually receive events. + +There is no workaround possible except not registering events +for potentially C'ed file descriptors, or to resort to +C or C. + +=head3 The special problem of files + +Many people try to use C. + +=item EV_USE_EVENTFD + +If defined to be C<1>, then libev will assume that C is +available and will probe for kernel support at runtime. This will improve +C and C performance and reduce resource consumption. +If undefined, it will be enabled if the headers indicate GNU/Linux + Glibc +2.7 or newer, otherwise disabled. + +=item EV_USE_SIGNALFD + +If defined to be C<1>, then libev will assume that C is +available and will probe for kernel support at runtime. This enables +the use of EVFLAG_SIGNALFD for faster and simpler signal handling. If +undefined, it will be enabled if the headers indicate GNU/Linux + Glibc +2.7 or newer, otherwise disabled. + +=item EV_USE_TIMERFD + +If defined to be C<1>, then libev will assume that C is +available and will probe for kernel support at runtime. This allows +libev to detect time jumps accurately. If undefined, it will be enabled +if the headers indicate GNU/Linux + Glibc 2.8 or newer and define +C, otherwise disabled. + +=item EV_USE_EVENTFD + +If defined to be C<1>, then libev will assume that C is +available and will probe for kernel support at runtime. This will improve +C and C performance and reduce resource consumption. +If undefined, it will be enabled if the headers indicate GNU/Linux + Glibc +2.7 or newer, otherwise disabled. + +=item EV_USE_SELECT + +If undefined or defined to be C<1>, libev will compile in support for the +C is buggy + +All that's left is C actively limits the number of file +descriptors you can pass in to 1024 - your program suddenly crashes when +you use more. + +There is an undocumented "workaround" for this - defining +C<_DARWIN_UNLIMITED_SELECT>, which libev tries to use, so select I +work on OS/X. + +=head2 SOLARIS PROBLEMS AND WORKAROUNDS + +=head3 C reentrancy + +The default compile environment on Solaris is unfortunately so +thread-unsafe that you can't even use components/libraries compiled +without C<-D_REENTRANT> in a threaded program, which, of course, isn't +defined by default. A valid, if stupid, implementation choice. + +If you want to use libev in threaded environments you have to make sure +it's compiled with C<_REENTRANT> defined. + +=head3 Event port backend + +The scalable event interface for Solaris is called "event +ports". Unfortunately, this mechanism is very buggy in all major +releases. If you run into high CPU usage, your program freezes or you get +a large number of spurious wakeups, make sure you have all the relevant +and latest kernel patches applied. No, I don't know which ones, but there +are multiple ones to apply, and afterwards, event ports actually work +great. + +If you can't get it to work, you can try running the program by setting +the environment variable C to only allow C and +C works fine +with large bitsets on AIX, and AIX is dead anyway. + +=head2 WIN32 PLATFORM LIMITATIONS AND WORKAROUNDS + +=head3 General issues + +Win32 doesn't support any of the standards (e.g. POSIX) that libev +requires, and its I/O model is fundamentally incompatible with the POSIX +model. Libev still offers limited functionality on this platform in +the form of the C backend, and only supports socket +descriptors. This only applies when using Win32 natively, not when using +e.g. cygwin. Actually, it only applies to the microsofts own compilers, +as every compiler comes with a slightly differently broken/incompatible +environment. + +Lifting these limitations would basically require the full +re-implementation of the I/O system. If you are into this kind of thing, +then note that glib does exactly that for you in a very portable way (note +also that glib is the slowest event library known to man). + +There is no supported compilation method available on windows except +embedding it into other applications. + +Sensible signal handling is officially unsupported by Microsoft - libev +tries its best, but under most conditions, signals will simply not work. + +Not a libev limitation but worth mentioning: windows apparently doesn't +accept large writes: instead of resulting in a partial write, windows will +either accept everything or return C if the buffer is too large, +so make sure you only write small amounts into your sockets (less than a +megabyte seems safe, but this apparently depends on the amount of memory +available). + +Due to the many, low, and arbitrary limits on the win32 platform and +the abysmal performance of winsockets, using a large number of sockets +is not recommended (and not reasonable). If your program needs to use +more than a hundred or so sockets, then likely it needs to use a totally +different implementation for windows, as libev offers the POSIX readiness +notification model, which cannot be implemented efficiently on windows +(due to Microsoft monopoly games). + +A typical way to use libev under windows is to embed it (see the embedding +section for details) and use the following F header file instead +of F: + + #define EV_STANDALONE /* keeps ev from requiring config.h */ + #define EV_SELECT_IS_WINSOCKET 1 /* configure libev for windows select */ + + #include "ev.h" + +And compile the following F file into your project (make sure +you do I compile the F or any other embedded source files!): + + #include "evwrap.h" + #include "ev.c" + +=head3 The winsocket C function doesn't follow POSIX in that it +requires socket I and not socket I (it is +also extremely buggy). This makes select very inefficient, and also +requires a mapping from file descriptors to socket handles (the Microsoft +C runtime provides the function C<_open_osfhandle> for this). See the +discussion of the C, C and +C preprocessor symbols for more info. + +The configuration for a "naked" win32 using the Microsoft runtime +libraries and raw winsocket select is: + + #define EV_USE_SELECT 1 + #define EV_SELECT_IS_WINSOCKET 1 /* forces EV_SELECT_USE_FD_SET, too */ + +Note that winsockets handling of fd sets is O(n), so you can easily get a +complexity in the O(n²) range when using win32. + +=head3 Limited number of file descriptors + +Windows has numerous arbitrary (and low) limits on things. + +Early versions of winsocket's select only supported waiting for a maximum +of C<64> handles (probably owning to the fact that all windows kernels +can only wait for C<64> things at the same time internally; Microsoft +recommends spawning a chain of threads and wait for 63 handles and the +previous thread in each. Sounds great!). + +Newer versions support more handles, but you need to define C +to some high number (e.g. C<2048>) before compiling the winsocket select +call (which might be in libev or elsewhere, for example, perl and many +other interpreters do their own select emulation on windows). + +Another limit is the number of file descriptors in the Microsoft runtime +libraries, which by default is C<64> (there must be a hidden I<64> +fetish or something like this inside Microsoft). You can increase this +by calling C<_setmaxstdio>, which can increase this limit to C<2048> +(another arbitrary limit), but is broken in many versions of the Microsoft +runtime libraries. This might get you to about C<512> or C<2048> sockets +(depending on windows version and/or the phase of the moon). To get more, +you need to wrap all I/O functions and provide your own fd management, but +the cost of calling select (O(n²)) will likely make this unworkable. + +=head2 PORTABILITY REQUIREMENTS + +In addition to a working ISO-C implementation and of course the +backend-specific APIs, libev relies on a few additional extensions: + +=over 4 + +=item C must have compatible +calling conventions regardless of C. + +Libev assumes not only that all watcher pointers have the same internal +structure (guaranteed by POSIX but not by ISO C for example), but it also +assumes that the same (machine) code can be used to call any watcher +callback: The watcher callbacks have different type signatures, but libev +calls them using an C internally. + +=item null pointers and integer zero are represented by 0 bytes + +Libev uses C to initialise structs and arrays to C<0> bytes, and +relies on this setting pointers and integers to null. + +=item pointer accesses must be thread-atomic + +Accessing a pointer value must be atomic, it must both be readable and +writable in one piece - this is the case on all current architectures. + +=item C must be thread-atomic as well + +The type C (or whatever is defined as +C) must be atomic with respect to accesses from different +threads. This is not part of the specification for C, but is +believed to be sufficiently portable. + +=item C must work in a threaded environment + +Libev uses C to temporarily block signals. This is not +allowed in a threaded program (C has to be used). Typical +pthread implementations will either allow C in the "main +thread" or will block signals process-wide, both behaviours would +be compatible with libev. Interaction between C and +C could complicate things, however. + +The most portable way to handle signals is to block signals in all threads +except the initial one, and run the signal handling loop in the initial +thread as well. + +=item C must be large enough for common memory allocation sizes + +To improve portability and simplify its API, libev uses C internally +instead of C when allocating its data structures. On non-POSIX +systems (Microsoft...) this might be unexpectedly low, but is still at +least 31 bits everywhere, which is enough for hundreds of millions of +watchers. + +=item C must hold a time value in seconds with enough accuracy + +The type C is used to represent timestamps. It is required to +have at least 51 bits of mantissa (and 9 bits of exponent), which is +good enough for at least into the year 4000 with millisecond accuracy +(the design goal for libev). This requirement is overfulfilled by +implementations using IEEE 754, which is basically all existing ones. + +With IEEE 754 doubles, you get microsecond accuracy until at least the +year 2255 (and millisecond accuracy till the year 287396 - by then, libev +is either obsolete or somebody patched it to use C or +something like that, just kidding). + +=back + +If you know of other additional requirements drop me a note. + + +=head1 ALGORITHMIC COMPLEXITIES + +In this section the complexities of (many of) the algorithms used inside +libev will be documented. For complexity discussions about backends see +the documentation for C. + +All of the following are about amortised time: If an array needs to be +extended, libev needs to realloc and move the whole array, but this +happens asymptotically rarer with higher number of elements, so O(1) might +mean that libev does a lengthy realloc operation in rare cases, but on +average it is much faster and asymptotically approaches constant time. + +=over 4 + +=item Starting and stopping timer/periodic watchers: O(log skipped_other_timers) + +This means that, when you have a watcher that triggers in one hour and +there are 100 watchers that would trigger before that, then inserting will +have to skip roughly seven (C) of these watchers. + +=item Changing timer/periodic watchers (by autorepeat or calling again): O(log skipped_other_timers) + +That means that changing a timer costs less than removing/adding them, +as only the relative motion in the event queue has to be paid for. + +=item Starting io/check/prepare/idle/signal/child/fork/async watchers: O(1) + +These just add the watcher into an array or at the head of a list. + +=item Stopping check/prepare/idle/fork/async watchers: O(1) + +=item Stopping an io/signal/child watcher: O(number_of_watchers_for_this_(fd/signal/pid % EV_PID_HASHSIZE)) + +These watchers are stored in lists, so they need to be walked to find the +correct watcher to remove. The lists are usually short (you don't usually +have many watchers waiting for the same fd or signal: one is typical, two +is rare). + +=item Finding the next timer in each loop iteration: O(1) + +By virtue of using a binary or 4-heap, the next timer is always found at a +fixed position in the storage array. + +=item Each change on a file descriptor per loop iteration: O(number_of_watchers_for_this_fd) + +A change means an I/O watcher gets started or stopped, which requires +libev to recalculate its status (and possibly tell the kernel, depending +on backend and whether C was used). + +=item Activating one watcher (putting it into the pending state): O(1) + +=item Priority handling: O(number_of_priorities) + +Priorities are implemented by allocating some space for each +priority. When doing priority-based operations, libev usually has to +linearly search all the priorities, but starting/stopping and activating +watchers becomes O(1) with respect to priority handling. + +=item Sending an ev_async: O(1) + +=item Processing ev_async_send: O(number_of_async_watchers) + +=item Processing signals: O(max_signal_number) + +Sending involves a system call I there were no other C +calls in the current loop iteration and the loop is currently +blocked. Checking for async and signal events involves iterating over all +running async watchers or all signal numbers. + +=back + + +=head1 PORTING FROM LIBEV 3.X TO 4.X + +The major version 4 introduced some incompatible changes to the API. + +At the moment, the C header file provides compatibility definitions +for all changes, so most programs should still compile. The compatibility +layer might be removed in later versions of libev, so better update to the +new API early than late. + +=over 4 + +=item C backwards compatibility mechanism + +The backward compatibility mechanism can be controlled by +C. See L in the L +section. + +=item C and C have been removed + +These calls can be replaced easily by their C counterparts: + + ev_loop_destroy (EV_DEFAULT_UC); + ev_loop_fork (EV_DEFAULT); + +=item function/symbol renames + +A number of functions and symbols have been renamed: + + ev_loop => ev_run + EVLOOP_NONBLOCK => EVRUN_NOWAIT + EVLOOP_ONESHOT => EVRUN_ONCE + + ev_unloop => ev_break + EVUNLOOP_CANCEL => EVBREAK_CANCEL + EVUNLOOP_ONE => EVBREAK_ONE + EVUNLOOP_ALL => EVBREAK_ALL + + EV_TIMEOUT => EV_TIMER + + ev_loop_count => ev_iteration + ev_loop_depth => ev_depth + ev_loop_verify => ev_verify + +Most functions working on C objects don't have an +C prefix, so it was removed; C, C and +associated constants have been renamed to not collide with the C anymore and C now follows the same naming scheme +as all other watcher types. Note that C is still called +C because it would otherwise clash with the C +typedef. + +=item C mechanism replaced by C + +The preprocessor symbol C has been replaced by a different +mechanism, C. Programs using C usually compile +and work, but the library code will of course be larger. + +=back + + +=head1 GLOSSARY + +=over 4 + +=item active + +A watcher is active as long as it has been started and not yet stopped. +See L for details. + +=item application + +In this document, an application is whatever is using libev. + +=item backend + +The part of the code dealing with the operating system interfaces. + +=item callback + +The address of a function that is called when some event has been +detected. Callbacks are being passed the event loop, the watcher that +received the event, and the actual event bitset. + +=item callback/watcher invocation + +The act of calling the callback associated with a watcher. + +=item event + +A change of state of some external event, such as data now being available +for reading on a file descriptor, time having passed or simply not having +any other events happening anymore. + +In libev, events are represented as single bits (such as C or +C). + +=item event library + +A software package implementing an event model and loop. + +=item event loop + +An entity that handles and processes external events and converts them +into callback invocations. + +=item event model + +The model used to describe how an event loop handles and processes +watchers and events. + +=item pending + +A watcher is pending as soon as the corresponding event has been +detected. See L for details. + +=item real time + +The physical time that is observed. It is apparently strictly monotonic :) + +=item wall-clock time + +The time and date as shown on clocks. Unlike real time, it can actually +be wrong and jump forwards and backwards, e.g. when you adjust your +clock. + +=item watcher + +A data structure that describes interest in certain events. Watchers need +to be started (attached to an event loop) before they can receive events. =back =head1 AUTHOR -Marc Lehmann . +Marc Lehmann , with repeated corrections by Mikael +Magnusson and Emanuele Giaquinta, and minor corrections by many others.