--- libev/ev.pod 2008/10/24 08:30:01 1.202 +++ libev/ev.pod 2010/10/22 09:40:22 1.318 @@ -11,6 +11,8 @@ // a single header file is required #include <ev.h> + #include <stdio.h> // for puts + // every watcher type has its own typedef'd struct // with the name ev_TYPE ev_io stdin_watcher; @@ -26,8 +28,8 @@ // with its corresponding stop function. ev_io_stop (EV_A_ w); - // this causes all nested ev_loop's to stop iterating - ev_unloop (EV_A_ EVUNLOOP_ALL); + // this causes all nested ev_run's to stop iterating + ev_break (EV_A_ EVBREAK_ALL); } // another callback, this time for a time-out @@ -35,15 +37,15 @@ timeout_cb (EV_P_ ev_timer *w, int revents) { puts ("timeout"); - // this causes the innermost ev_loop to stop iterating - ev_unloop (EV_A_ EVUNLOOP_ONE); + // this causes the innermost ev_run to stop iterating + ev_break (EV_A_ EVBREAK_ONE); } int main (void) { // use the default event loop unless you have special needs - ev_loop *loop = ev_default_loop (0); + struct ev_loop *loop = ev_default_loop (0); // initialise an io watcher, then start it // this one will watch for stdin to become readable @@ -56,18 +58,30 @@ ev_timer_start (loop, &timeout_watcher); // now wait for events to arrive - ev_loop (loop, 0); + ev_run (loop, 0); // unloop was called, so exit return 0; } -=head1 DESCRIPTION +=head1 ABOUT THIS DOCUMENT + +This document documents the libev software package. The newest version of this document is also available as an html-formatted web page you might find easier to navigate when reading it for the first time: L. +While this document tries to be as complete as possible in documenting +libev, its usage and the rationale behind its design, it is not a tutorial +on event-based programming, nor will it introduce event-based programming +with libev. + +Familiarity with event-based programming techniques in general is assumed +throughout this document. 
+ +=head1 ABOUT LIBEV + Libev is an event loop: you register interest in certain events (such as a file descriptor being readable or a timeout occurring), and it will manage these event sources and provide your program with events. @@ -86,13 +100,14 @@ Libev supports C (files, many character devices...). While stopping, setting and starting an I/O watcher in the same iteration -will result in some caching, there is still a system call per such incident -(because the fd could point to a different file description now), so its -best to avoid that. Also, C'ed file descriptors might not work -very well if you register events for both fds. - -Please note that epoll sometimes generates spurious notifications, so you -need to use non-blocking I/O or other means to avoid blocking when no data -(or space) is available. +will result in some caching, there is still a system call per such +incident (because the same I could point to a different +I now), so its best to avoid that. Also, C'ed +file descriptors might not work very well if you register events for both +file descriptors. Best performance from this backend is achieved by not unregistering all watchers for a file descriptor until it has been closed, if possible, i.e. keep at least one watcher active per fd at all times. Stopping and starting a watcher (without re-setting it) also usually doesn't cause -extra overhead. +extra overhead. A fork can both result in spurious notifications as well +as in libev having to destroy and recreate the epoll object, which can +take considerable time and thus should be avoided. + +All this means that, in practice, C can be as fast or +faster than epoll for maybe up to a hundred file descriptors, depending on +the usage. So sad. While nominally embeddable in other event loops, this feature is broken in all kernel versions tested so far. 
@@ -415,12 +474,15 @@ =item C (value 8, most BSD clones) -Kqueue deserves special mention, as at the time of this writing, it was -broken on all BSDs except NetBSD (usually it doesn't work reliably with -anything but sockets and pipes, except on Darwin, where of course it's -completely useless). For this reason it's not being "auto-detected" unless -you explicitly specify it in the flags (i.e. using C) or -libev was compiled on a known-to-be-good (-enough) system like NetBSD. +Kqueue deserves special mention, as at the time of this writing, it +was broken on all BSDs except NetBSD (usually it doesn't work reliably +with anything but sockets and pipes, except on Darwin, where of course +it's completely useless). Unlike epoll, however, whose brokenness +is by design, these kqueue bugs can (and eventually will) be fixed +without API changes to existing programs. For this reason it's not being +"auto-detected" unless you explicitly specify it in the flags (i.e. using +C) or libev was compiled on a known-to-be-good (-enough) +system like NetBSD. You still can embed kqueue into a normal poll or select backend and use it only for sockets (after having made sure that sockets work with kqueue on @@ -430,8 +492,9 @@ kernel is more efficient (which says nothing about its actual speed, of course). While stopping, setting and starting an I/O watcher does never cause an extra system call as with C, it still adds up to -two event changes per incident. Support for C is very bad and it -drops fds silently in similarly hard-to-detect cases. +two event changes per incident. Support for C is very bad (but +sane, unlike epoll) and it drops fds silently in similarly hard-to-detect +cases This backend usually performs well under most conditions. @@ -439,8 +502,8 @@ everywhere, so you might need to test for this. And since it is broken almost everywhere, you should only use it when you have a lot of sockets (for which it usually works), by embedding it into another event loop -(e.g. 
C or C) and, did I mention it, -using it only for sockets. +(e.g. C or C (but C is of course +also broken on OS X)) and, did I mention it, using it only for sockets. This backend maps C into an C kevent with C, and C into an C kevent with @@ -470,7 +533,7 @@ On the positive side, with the exception of the spurious readiness notifications, this backend actually performed fully to specification in all tests and is fully embeddable, which is a rare feat among the -OS-specific backends. +OS-specific backends (I vastly prefer correctness over speed hacks). This backend maps C and C in the same way as C. @@ -485,9 +548,10 @@ =back -If one or more of these are or'ed into the flags value, then only these -backends will be tried (in the reverse order as listed here). If none are -specified, all backends in C will be tried. +If one or more of the backend flags are or'ed into the flags value, +then only these backends will be tried (in the reverse order as listed +here). If none are specified, all backends in C will be tried. Example: This is the most typical usage. @@ -509,11 +573,9 @@ =item struct ev_loop *ev_loop_new (unsigned int flags) Similar to C, but always creates a new event loop that is -always distinct from the default loop. Unlike the default loop, it cannot -handle signal and child watchers, and attempts to do so will be greeted by -undefined behaviour (or a failed assertion if assertions are enabled). +always distinct from the default loop. -Note that this function I thread-safe, and the recommended way to use +Note that this function I thread-safe, and one common way to use libev with threads is indeed to create one loop per thread, and using the default loop in the "main" or "initial" thread. @@ -525,22 +587,21 @@ =item ev_default_destroy () -Destroys the default loop again (frees all memory and kernel state -etc.). None of the active event watchers will be stopped in the normal -sense, so e.g. C might still return true. 
It is your -responsibility to either stop all watchers cleanly yourself I -calling this function, or cope with the fact afterwards (which is usually -the easiest thing, you can just ignore the watchers and/or C them -for example). - -Note that certain global state, such as signal state, will not be freed by -this function, and related watchers (such as signal and child watchers) -would need to be stopped manually. +Destroys the default loop (frees all memory and kernel state etc.). None +of the active event watchers will be stopped in the normal sense, so +e.g. C might still return true. It is your responsibility to +either stop all watchers cleanly yourself I calling this function, +or cope with the fact afterwards (which is usually the easiest thing, you +can just ignore the watchers and/or C them for example). + +Note that certain global state, such as signal state (and installed signal +handlers), will not be freed by this function, and related watchers (such +as signal and child watchers) would need to be stopped manually. In general it is not advisable to call this function except in the rare occasion where you really need to free e.g. the signal handling pipe fds. If you need dynamically allocated loops it is better to use -C and C). +C and C. =item ev_loop_destroy (loop) @@ -549,16 +610,24 @@ =item ev_default_fork () -This function sets a flag that causes subsequent C iterations +This function sets a flag that causes subsequent C iterations to reinitialise the kernel state for backends that have one. Despite the name, you can call it anytime, but it makes most sense after forking, in the child process (or both child and parent, but that again makes little sense). You I call it in the child before using any of the libev -functions, and it will only take effect at the next C iteration. +functions, and it will only take effect at the next C iteration. + +Again, you I to call it on I loop that you want to re-use after +a fork, I. 
This is +because some kernel interfaces *cough* I *cough* do funny things +during fork. On the other hand, you only need to call this function in the child -process if and only if you want to use the event library in the child. If -you just fork+exec, you don't have to call it at all. +process if and only if you want to use the event loop in the child. If +you just fork+exec or create a new loop in the child, you don't have to +call it at all (in fact, C is so badly broken that it makes a +difference, but libev will usually detect this case on its own and do a +costly reset of the backend). The function itself is quite fast and it's usually not a problem to call it just in case after a fork. To make this easy, the function will fit in @@ -570,23 +639,37 @@ Like C, but acts on an event loop created by C. Yes, you have to call this on every allocated event loop -after fork that you want to re-use in the child, and how you do this is -entirely your own problem. +after fork that you want to re-use in the child, and how you keep track of +them is entirely your own problem. =item int ev_is_default_loop (loop) Returns true when the given loop is, in fact, the default loop, and false otherwise. -=item unsigned int ev_loop_count (loop) +=item unsigned int ev_iteration (loop) -Returns the count of loop iterations for the loop, which is identical to -the number of times libev did poll for new events. It starts at C<0> and -happily wraps around with enough iterations. +Returns the current iteration count for the event loop, which is identical +to the number of times libev did poll for new events. It starts at C<0> +and happily wraps around with enough iterations. This value can sometimes be useful as a generation counter of sorts (it "ticks" the number of loop iterations), as it roughly corresponds with -C and C calls. +C and C calls - and is incremented between the +prepare and check phases. 
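The "generation counter" use of the iteration count can be illustrated without libev itself. The following is a minimal sketch under stated assumptions: `struct cached_value`, `recompute` and `cached_get` are hypothetical names that do not exist in libev, and the `iteration` parameter stands in for whatever `ev_iteration (loop)` would return. It recomputes a value at most once per loop iteration and compares counters only for equality, so an ordinary wrap-around of the counter is harmless (a stale hit would require the counter to wrap all the way back to the same value between two calls).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry; none of these names come from libev. */
struct cached_value {
    unsigned int generation; /* iteration count at which value was computed */
    int valid;               /* 0 until the first refresh */
    long value;
};

/* Hypothetical expensive computation, stubbed out for the sketch:
 * it simply returns 1, 2, 3, ... on successive calls. */
static long recompute(void)
{
    static long n;
    return ++n;
}

/* Return the cached value, recomputing at most once per loop iteration.
 * iteration stands in for what ev_iteration (loop) would return. */
static long cached_get(struct cached_value *c, unsigned int iteration)
{
    assert(c != NULL);
    if (!c->valid || c->generation != iteration) {
        c->value = recompute();
        c->generation = iteration;
        c->valid = 1;
    }
    return c->value;
}
```

In a real program the caller would pass the loop's current iteration count from inside a watcher callback; several callbacks running in the same iteration would then share one computed value.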
+ +=item unsigned int ev_depth (loop) + +Returns the number of times C was entered minus the number of +times C was exited, in other words, the recursion depth. + +Outside C, this number is zero. In a callback, this number is +C<1>, unless C was invoked recursively (or from another thread), +in which case it is higher. + +Leaving C abnormally (setjmp/longjmp, cancelling the thread +etc.), doesn't count as "exit" - consider this as a hint to avoid such +ungentleman-like behaviour unless it's really convenient. =item unsigned int ev_backend (loop) @@ -605,93 +688,132 @@ Establishes the current time by querying the kernel, updating the time returned by C in the process. This is a costly operation and -is usually done automatically within C. +is usually done automatically within C. This function is rarely useful, but when some event callback runs for a very long time without entering the event loop, updating libev's idea of the current time is a good idea. -See also "The special problem of time updates" in the C section. +See also L in the C section. + +=item ev_suspend (loop) -=item ev_loop (loop, int flags) +=item ev_resume (loop) + +These two functions suspend and resume an event loop, for use when the +loop is not used for a while and timeouts should not be processed. + +A typical use case would be an interactive program such as a game: When +the user presses C<^Z> to suspend the game and resumes it an hour later it +would be best to handle timeouts as if no time had actually passed while +the program was suspended. This can be achieved by calling C +in your C handler, sending yourself a C and calling +C directly afterwards to resume timer processing. + +Effectively, all C watchers will be delayed by the time spent +between C and C, and all C watchers +will be rescheduled (that is, they will lose any events that would have +occurred while suspended). 
+ +After calling C you B call I function on the +given loop other than C, and you B call C +without a previous call to C. + +Calling C/C has the side effect of updating the +event loop time (see C). + +=item ev_run (loop, int flags) Finally, this is it, the event handler. This function usually is called -after you initialised all your watchers and you want to start handling -events. +after you have initialised all your watchers and you want to start +handling events. It will ask the operating system for any new events, call +the watcher callbacks, and then repeat the whole process indefinitely: This +is why event loops are called I. -If the flags argument is specified as C<0>, it will not return until -either no event watchers are active anymore or C was called. +If the flags argument is specified as C<0>, it will keep handling events +until either no event watchers are active anymore or C was +called. -Please note that an explicit C is usually better than +Please note that an explicit C is usually better than relying on all watchers to be stopped when deciding when a program has finished (especially in interactive programs), but having a program that automatically loops as long as it has to and no longer by virtue of relying on its watchers stopping correctly, that is truly a thing of beauty. -A flags value of C will look for new events, will handle -those events and any already outstanding ones, but will not block your -process in case there are no events and will return after one iteration of -the loop. +A flags value of C will look for new events, will handle +those events and any already outstanding ones, but will not wait and +block your process in case there are no events and will return after one +iteration of the loop. This is sometimes useful to poll and handle new +events while doing lengthy calculations, to keep the program responsive. 
-A flags value of C will look for new events (waiting if +A flags value of C will look for new events (waiting if necessary) and will handle those and any already outstanding ones. It will block your process until at least one new event arrives (which could -be an event internal to libev itself, so there is no guarentee that a +be an event internal to libev itself, so there is no guarantee that a user-registered callback will be called), and will return after one iteration of the loop. This is useful if you are waiting for some external event in conjunction with something not expressible using other libev watchers (i.e. "roll your -own C"). However, a pair of C/C watchers is +own C"). However, a pair of C/C watchers is usually a better approach for this kind of thing. -Here are the gory details of what C does: +Here are the gory details of what C does: + - Increment loop depth. + - Reset the ev_break status. - Before the first iteration, call any pending watchers. - * If EVFLAG_FORKCHECK was used, check for a fork. + LOOP: + - If EVFLAG_FORKCHECK was used, check for a fork. - If a fork was detected (by any means), queue and call all fork watchers. - Queue and call all prepare watchers. + - If ev_break was called, goto FINISH. - If we have been forked, detach and recreate the kernel state as to not disturb the other process. - Update the kernel state with all outstanding changes. - Update the "event loop time" (ev_now ()). - Calculate for how long to sleep or block, if at all - (active idle watchers, EVLOOP_NONBLOCK or not having + (active idle watchers, EVRUN_NOWAIT or not having any active watchers at all will result in not sleeping). - Sleep if the I/O and timer collect interval say so. + - Increment loop iteration counter. - Block the process, waiting for any events. - Queue all outstanding I/O (fd) events. - Update the "event loop time" (ev_now ()), and do time jump adjustments. - Queue all expired timers. - Queue all expired periodics. 
- - Unless any events are pending now, queue all idle watchers. + - Queue all idle watchers with priority higher than that of pending events. - Queue all check watchers. - Call all queued watchers in reverse order (i.e. check watchers first). Signals and child watchers are implemented as I/O watchers, and will be handled here by queueing them when their watcher gets executed. - - If ev_unloop has been called, or EVLOOP_ONESHOT or EVLOOP_NONBLOCK - were used, or there are no active watchers, return, otherwise - continue with step *. + - If ev_break has been called, or EVRUN_ONCE or EVRUN_NOWAIT + were used, or there are no active watchers, goto FINISH, otherwise + continue with step LOOP. + FINISH: + - Reset the ev_break status iff it was EVBREAK_ONE. + - Decrement the loop depth. + - Return. Example: Queue some jobs and then loop until no events are outstanding anymore. ... queue jobs here, make sure they register event watchers as long ... as they still have work to do (even an idle watcher will do..) - ev_loop (my_loop, 0); + ev_run (my_loop, 0); ... jobs done or somebody called unloop. yeah! -=item ev_unloop (loop, how) +=item ev_break (loop, how) -Can be used to make a call to C return early (but only after it +Can be used to make a call to C return early (but only after it has processed all outstanding events). The C argument must be either -C, which will make the innermost C call return, or -C, which will make all nested C calls return. +C, which will make the innermost C call return, or +C, which will make all nested C calls return. -This "unloop state" will be cleared when entering C again. +This "unloop state" will be cleared when entering C again. -It is safe to call C from otuside any C calls. +It is safe to call C from outside any C calls. 
##TODO## =item ev_ref (loop) @@ -699,21 +821,24 @@ Ref/unref can be used to add or remove a reference count on the event loop: Every watcher keeps one reference, and as long as the reference -count is nonzero, C will not return on its own. +count is nonzero, C will not return on its own. -If you have a watcher you never unregister that should not keep C -from returning, call ev_unref() after starting, and ev_ref() before -stopping it. - -As an example, libev itself uses this for its internal signal pipe: It is -not visible to the libev user and should not keep C from exiting -if no event watchers registered by it are active. It is also an excellent -way to do this for generic recurring timers or from within third-party -libraries. Just remember to I and I -(but only if the watcher wasn't active before, or was active before, -respectively). +This is useful when you have a watcher that you never intend to +unregister, but that nevertheless should not keep C from +returning. In such a case, call C after starting, and C +before stopping it. + +As an example, libev itself uses this for its internal signal pipe: It +is not visible to the libev user and should not keep C from +exiting if no event watchers registered by it are active. It is also an +excellent way to do this for generic recurring timers or from within +third-party libraries. Just remember to I and I (but only if the watcher wasn't active before, or was active +before, respectively. Note also that libev might stop watchers itself +(e.g. non-repeating timers) in which case you have to C +in the callback). -Example: Create a signal watcher, but keep it from keeping C +Example: Create a signal watcher, but keep it from keeping C running when nothing else is active. ev_signal exitsig; @@ -750,7 +875,9 @@ time collecting I/O events, so you can handle more events per iteration, at the cost of increasing latency. Timeouts (both C and C) will be not affected. 
Setting this to a non-null value will -introduce an additional C call into most loop iterations. +introduce an additional C call into most loop iterations. The +sleep time ensures that libev will not poll for I/O events more often than +once per this interval, on average. Likewise, by setting a higher I you allow libev to spend more time collecting timeouts, at the expense of increased @@ -762,7 +889,11 @@ interval to a value near C<0.1> or so, which is often enough for interactive servers (of course not for games), likewise for timeouts. It usually doesn't make much sense to set it to a lower value than C<0.01>, -as this approaches the timing granularity of most systems. +as this approaches the timing granularity of most systems. Note that if +you do transactions with the outside world and you can't increase the +parallelism, then this setting will limit your transaction rate (if you +need to poll once per transaction and the I/O collect interval is 0.01, +then you can't do more than 100 transactions per second). Setting the I can improve the opportunity for saving power, as the program will "bundle" timer callback invocations that @@ -771,7 +902,82 @@ reduce iterations/wake-ups is to use C watchers and make sure they fire on, say, one-second boundaries only. -=item ev_loop_verify (loop) +Example: we only need 0.1s timeout granularity, and we wish not to poll +more often than 100 times per second: + + ev_set_timeout_collect_interval (EV_DEFAULT_UC_ 0.1); + ev_set_io_collect_interval (EV_DEFAULT_UC_ 0.01); + +=item ev_invoke_pending (loop) + +This call will simply invoke all pending watchers while resetting their +pending state. Normally, C does this automatically when required, +but when overriding the invoke callback this call comes in handy. 
This +function can be invoked from a watcher - this can be useful for example +when you want to do some lengthy calculation and want to pass further +event handling to another thread (you still have to make sure only one +thread executes within C or C of course). + +=item int ev_pending_count (loop) + +Returns the number of pending watchers - zero indicates that no watchers +are pending. + +=item ev_set_invoke_pending_cb (loop, void (*invoke_pending_cb)(EV_P)) + +This overrides the invoke pending functionality of the loop: Instead of +invoking all pending watchers when there are any, C will call +this callback instead. This is useful, for example, when you want to +invoke the actual watchers inside another context (another thread etc.). + +If you want to reset the callback, use C as new +callback. + +=item ev_set_loop_release_cb (loop, void (*release)(EV_P), void (*acquire)(EV_P)) + +Sometimes you want to share the same loop between multiple threads. This +can be done relatively simply by putting mutex_lock/unlock calls around +each call to a libev function. + +However, C can run an indefinite time, so it is not feasible +to wait for it to return. One way around this is to wake up the event +loop via C and C, another way is to set these +I and I callbacks on the loop. + +When set, then C will be called just before the thread is +suspended waiting for new events, and C is called just +afterwards. + +Ideally, C will just call your mutex_unlock function, and +C will just call the mutex_lock function again. + +While event loop modifications are allowed between invocations of +C and C (that's their only purpose after all), no +modifications done will affect the event loop, i.e. adding watchers will +have no effect on the set of file descriptors being watched, or the time +waited. Use an C watcher to wake up C when you want it +to take note of any changes you made. + +In theory, threads executing C will be async-cancel safe between +invocations of C and C. 
+ +See also the locking example in the C section later in this +document. + +=item ev_set_userdata (loop, void *data) + +=item ev_userdata (loop) + +Set and retrieve a single C associated with a loop. When +C has never been called, then C returns +C<0.> + +These two functions can be used to associate arbitrary data with a loop, +and are intended solely for the C, C and +C callbacks described above, but of course can be (ab-)used for +any other purpose as well. + +=item ev_verify (loop) This function only does something when C support has been compiled in, which is the default for non-minimal builds. It tries to go @@ -792,14 +998,15 @@ watcher type, e.g. C can mean C for timer watchers and C for I/O watchers. -A watcher is a structure that you create and register to record your -interest in some event. For instance, if you want to wait for STDIN to -become readable, you would create an C watcher for that: +A watcher is an opaque structure that you allocate and register to record +your interest in some event. To make a concrete example, imagine you want +to wait for STDIN to become readable, you would create an C watcher +for that: static void my_cb (struct ev_loop *loop, ev_io *w, int revents) { ev_io_stop (w); - ev_unloop (loop, EVUNLOOP_ALL); + ev_break (loop, EVBREAK_ALL); } struct ev_loop *loop = ev_default_loop (0); @@ -810,7 +1017,7 @@ ev_io_set (&stdin_watcher, STDIN_FILENO, EV_READ); ev_io_start (loop, &stdin_watcher); - ev_loop (loop, 0); + ev_run (loop, 0); As you can see, you are responsible for allocating the memory for your watcher structures (and it is I a bad idea to do this on the @@ -819,11 +1026,11 @@ Each watcher has an associated watcher structure (called C or simply C, as typedefs are provided for all watcher structs). -Each watcher structure must be initialised by a call to C, which expects a callback to be provided. 
This -callback gets invoked each time the event occurs (or, in the case of I/O -watchers, each time the event loop detects that the file descriptor given -is readable and/or writable). +Each watcher structure must be initialised by a call to C, which expects a callback to be provided. This callback is +invoked each time the event occurs (or, in the case of I/O watchers, each +time the event loop detects that the file descriptor given is readable +and/or writable). Each watcher type further has its own C<< ev_TYPE_set (watcher *, ...) >> macro to configure it, with arguments specific to the watcher type. There @@ -856,7 +1063,7 @@ The file descriptor in the C watcher has become readable and/or writable. -=item C +=item C The C watcher has timed out. @@ -884,13 +1091,13 @@ =item C -All C watchers are invoked just I C starts +All C watchers are invoked just I C starts to gather new events, and all C watchers are invoked just after -C has gathered them, but before it invokes any callbacks for any +C has gathered them, but before it invokes any callbacks for any received events. Callbacks of both watcher types can start and stop as many watchers as they want, and all of them will be taken into account (for example, a C watcher might start an idle watcher to keep -C from blocking). +C from blocking). =item C @@ -905,6 +1112,11 @@ The given async watcher has been asynchronously notified (see C). +=item C + +Not ever sent (or otherwise used) by libev itself, but can be freely used +by libev users to signal watchers (e.g. via C). + =item C An unspecified error has occurred, the watcher has been stopped. This might @@ -926,6 +1138,65 @@ =back +=head2 WATCHER STATES + +There are various watcher states mentioned throughout this manual - +active, pending and so on. In this section these states and the rules to +transition between them will be described in more detail - and while these +rules might look complicated, they usually do "the right thing". 
+ +=over 4 + +=item initialised + +Before a watcher can be registered with the event loop it has to be +initialised. This can be done with a call to C, or calls to +C followed by the watcher-specific C function. + +In this state it is simply some block of memory that is suitable for use +in an event loop. It can be moved around, freed, reused etc. at will. + +=item started/running/active + +Once a watcher has been started with a call to C it becomes +property of the event loop, and is actively waiting for events. While in +this state it cannot be accessed (except in a few documented ways), moved, +freed or anything else - the only legal thing is to keep a pointer to it, +and call libev functions on it that are documented to work on active watchers. + +=item pending + +If a watcher is active and libev determines that an event it is interested +in has occurred (such as a timer expiring), it will become pending. It will +stay in this pending state until either it is stopped or its callback is +about to be invoked, so it is not normally pending inside the watcher +callback. + +The watcher might or might not be active while it is pending (for example, +an expired non-repeating timer can be pending but no longer active). If it +is stopped, it can be freely accessed (e.g. by calling C), +but it is still property of the event loop at this time, so cannot be +moved, freed or reused. And if it is active the rules described in the +previous item still apply. + +It is also possible to feed an event on a watcher that is not active (e.g. +via C), in which case it becomes pending without being +active. + +=item stopped + +A watcher can be stopped implicitly by libev (in which case it might still +be pending), or explicitly by calling its C function. The +latter will clear any pending state the watcher might be in, regardless +of whether it was active or not, so stopping a watcher explicitly before +freeing it is often a good idea. 
+ +While stopped (and not pending) the watcher is essentially in the +initialised state, that is it can be reused, moved, modified in any way +you wish. + +=back + =head2 GENERIC WATCHER FUNCTIONS =over 4 @@ -951,7 +1222,7 @@ ev_init (&w, my_cb); ev_io_set (&w, STDIN_FILENO, EV_READ); -=item C (ev_TYPE *, [args]) +=item C (ev_TYPE *watcher, [args]) This macro initialises the type-specific parts of a watcher. You need to call C at least once before you call this macro, but you can @@ -974,7 +1245,7 @@ ev_io_init (&w, my_cb, STDIN_FILENO, EV_READ); -=item C (loop *, ev_TYPE *watcher) +=item C (loop, ev_TYPE *watcher) Starts (activates) the given watcher. Only active watchers will receive events. If the watcher is already active nothing will happen. @@ -984,7 +1255,7 @@ ev_io_start (EV_DEFAULT_UC, &w); -=item C (loop *, ev_TYPE *watcher) +=item C (loop, ev_TYPE *watcher) Stops the given watcher if active, and clears the pending status (whether the watcher was active or not). @@ -1019,7 +1290,7 @@ Change the callback. You can change the callback at virtually any time (modulo threads). -=item ev_set_priority (ev_TYPE *watcher, priority) +=item ev_set_priority (ev_TYPE *watcher, int priority) =item int ev_priority (ev_TYPE *watcher) @@ -1029,24 +1300,22 @@ before watchers with lower priority, but priority will not keep watchers from being executed (except for C watchers). -This means that priorities are I used for ordering callback -invocation after new events have been received. This is useful, for -example, to reduce latency after idling, or more often, to bind two -watchers on the same event and make sure one is called first. - If you need to suppress invocation when higher priority events are pending you need to look at C watchers, which provide this functionality. You I change the priority of a watcher as long as it is active or pending. 
-The default priority used by watchers when no priority has been set is -always C<0>, which is supposed to not be too high and not be too low :). - Setting a priority outside the range of C to C is fine, as long as you do not mind that the priority value you query might or might not have been clamped to the valid range. +The default priority used by watchers when no priority has been set is +always C<0>, which is supposed to not be too high and not be too low :). + +See L, below, for a more thorough treatment of +priorities. + =item ev_invoke (loop, ev_TYPE *watcher, int revents) Invoke the C with the given C and C. Neither @@ -1063,6 +1332,20 @@ Sometimes it can be useful to "poll" a watcher instead of waiting for its callback to be invoked, which can be accomplished with this function. +=item ev_feed_event (loop, ev_TYPE *watcher, int revents) + +Feeds the given event set into the event loop, as if the specified event +had happened for the specified watcher (which must be a pointer to an +initialised but not necessarily started event watcher). Obviously you must +not free the watcher as long as it has pending events. + +Stopping the watcher, letting libev invoke it, or calling +C will clear the pending event, even if the watcher was +not started in the first place. + +See also C and C for related +functions that do not need a watcher. + =back @@ -1120,17 +1403,120 @@ static void t1_cb (EV_P_ ev_timer *w, int revents) { - struct my_biggy big = (struct my_biggy * + struct my_biggy *big = (struct my_biggy *) (((char *)w) - offsetof (struct my_biggy, t1)); } static void t2_cb (EV_P_ ev_timer *w, int revents) { - struct my_biggy big = (struct my_biggy * + struct my_biggy *big = (struct my_biggy *) (((char *)w) - offsetof (struct my_biggy, t2)); } +=head2 WATCHER PRIORITY MODELS + +Many event loops support I, which are usually small +integers that influence the ordering of event callback invocation +between watchers in some way, all else being equal.
+ +In libev, watcher priorities can be set using C. See its +description for the more technical details such as the actual priority +range. + +There are two common ways in which these priorities are interpreted +by event loops: + +In the more common lock-out model, higher priorities "lock out" invocation +of lower priority watchers, which means as long as higher priority +watchers receive events, lower priority watchers are not invoked. + +The less common only-for-ordering model uses priorities solely to order +callback invocation within a single event loop iteration: Higher priority +watchers are invoked before lower priority ones, but they all get invoked +before polling for new events. + +Libev uses the second (only-for-ordering) model for all its watchers +except for idle watchers (which use the lock-out model). + +The rationale behind this is that implementing the lock-out model for +watchers is not well supported by most kernel interfaces, and most event +libraries will just poll for the same events again and again as long as +their callbacks have not been executed, which is very inefficient in the +common case of one high-priority watcher locking out a mass of lower +priority ones. + +Static (ordering) priorities are most useful when you have two or more +watchers handling the same resource: a typical usage example is having an +C watcher to receive data, and an associated C to handle +timeouts. Under load, data might be received while the program handles +other jobs, but since timers normally get invoked first, the timeout +handler will be executed before checking for data. In that case, giving +the timer a lower priority than the I/O watcher ensures that I/O will be +handled first even under adverse conditions (which is usually, but not +always, what you want).
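The only-for-ordering model described above can be sketched without libev at all. The following is a minimal simulation, not libev's implementation: C<struct pending> and C<order_pending> are hypothetical names invented for this illustration. It arranges the watchers that became pending in one iteration so that higher priorities come first, while every entry still gets its turn.

```c
#include <stddef.h>

/* Hypothetical stand-in for a pending watcher; not a libev type. */
struct pending
{
  int priority; /* higher values are invoked first */
  int id;       /* label used only for illustration */
};

/* Only-for-ordering model: every pending watcher is invoked in this
 * iteration, higher priorities first. An insertion sort is used so that
 * entries of equal priority keep their relative order. */
static void
order_pending (struct pending *p, size_t n)
{
  for (size_t i = 1; i < n; i++)
    {
      struct pending key = p[i];
      size_t j = i;

      /* shift strictly lower-priority entries one slot to the right */
      while (j > 0 && p[j - 1].priority < key.priority)
        {
          p[j] = p[j - 1];
          j--;
        }

      p[j] = key;
    }
}
```

With a timer at priority -1 queued first and two I/O watchers at the default priority 0 queued after it, the I/O watchers end up ahead of the timer, which mirrors the advice to give the timeout watcher a lower priority than the I/O watcher.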
+ +Since idle watchers use the "lock-out" model, meaning that idle watchers +will only be executed when no same or higher priority watchers have +received events, they can be used to implement the "lock-out" model when +required. + +For example, to emulate how many other event libraries handle priorities, +you can associate an C watcher to each such watcher, and in +the normal watcher callback, you just start the idle watcher. The real +processing is done in the idle watcher callback. This causes libev to +continuously poll and process kernel event data for the watcher, but when +the lock-out case is known to be rare (which in turn is rare :), this is +workable. + +Usually, however, the lock-out model implemented that way will perform +miserably under the type of load it was designed to handle. In that case, +it might be preferable to stop the real watcher before starting the +idle watcher, so the kernel will not have to process the event in case +the actual processing will be delayed for considerable time. + +Here is an example of an I/O watcher that should run at a strictly lower +priority than the default, and which should only process data when no +other events are pending: + + ev_idle idle; // actual processing watcher + ev_io io; // actual event watcher + + static void + io_cb (EV_P_ ev_io *w, int revents) + { + // stop the I/O watcher, we received the event, but + // are not yet ready to handle it. + ev_io_stop (EV_A_ w); + + // start the idle watcher to handle the actual event. + // it will not be executed as long as other watchers + // with the default priority are receiving events. 
+ ev_idle_start (EV_A_ &idle); + } + + static void + idle_cb (EV_P_ ev_idle *w, int revents) + { + // stop the idle watcher, otherwise it would be invoked + // again as soon as nothing else is pending + ev_idle_stop (EV_A_ w); + + // actual processing + read (STDIN_FILENO, ...); + + // have to start the I/O watcher again, as + // we have handled the event + ev_io_start (EV_A_ &io); + } + + // initialisation + ev_idle_init (&idle, idle_cb); + ev_io_init (&io, io_cb, STDIN_FILENO, EV_READ); + ev_io_start (EV_DEFAULT_ &io); + +In the "real" world, it might also be beneficial to start a timer, so that +low-priority connections cannot be locked out forever under load. This +enables your program to keep a lower latency for important connections +during short periods of high load, while not completely locking out less +important ones. + =head1 WATCHER TYPES @@ -1165,7 +1551,9 @@ If you cannot use non-blocking mode, then force the use of a known-to-be-good backend (at the time of this writing, this includes only -C and C). +C and C). The same applies to file +descriptors for which non-blocking operation makes no sense (such as +files) - libev doesn't guarantee any specific behaviour in that case. Another thing you have to watch out for is that it is quite easy to receive "spurious" readiness notifications, that is your callback might @@ -1240,6 +1628,44 @@ ignore SIGPIPE (and maybe make sure you log the exit status of your daemon somewhere, as that would have given you a big clue). +=head3 The special problem of accept()ing when you can't + +Many implementations of the POSIX C function (for example, +found in post-2004 Linux) have the peculiar behaviour of not removing a +connection from the pending queue in all error cases. + +For example, larger servers often run out of file descriptors (because +of resource limits), causing C to fail with C but not +rejecting the connection, leading to libev signalling readiness on +the next iteration again (the connection still exists after all), and +typically causing the program to loop at 100% CPU usage.
+ +Unfortunately, the set of errors that cause this issue differs between +operating systems, there is usually little the app can do to remedy the +situation, and no thread-safe method of removing the connection to +cope with overload is known (to me). + +One of the easiest ways to handle this situation is to just ignore it +- when the program encounters an overload, it will just loop until the +situation is over. While this is a form of busy waiting, no OS offers an +event-based way to handle this situation, so it's the best one can do. + +A better way to handle the situation is to log any errors other than +C and C, making sure not to flood the log with such +messages, and continue as usual, which at least gives the user an idea of +what could be wrong ("raise the ulimit!"). For extra points one could stop +the C watcher on the listening fd "for a while", which reduces CPU +usage. + +If your program is single-threaded, then you could also keep a dummy file +descriptor for overload situations (e.g. by opening F), and +when you run into C or C, close it, run C, +close that fd, and create a new dummy fd. This will gracefully refuse +clients under typical overload conditions. + +The last way to handle it is to simply log the error and C, as +is often done with C failures, but this results in an easy +opportunity for a DoS attack. =head3 Watcher-Specific Functions @@ -1281,7 +1707,7 @@ ev_io stdin_readable; ev_io_init (&stdin_readable, stdin_readable_cb, STDIN_FILENO, EV_READ); ev_io_start (loop, &stdin_readable); - ev_loop (loop, 0); + ev_run (loop, 0); =head2 C - relative and optionally repeating timeouts @@ -1296,8 +1722,11 @@ monotonic clock option helps a lot here). The callback is guaranteed to be invoked only I its timeout has -passed, but if multiple timers become ready during the same loop iteration -then order of execution is undefined. +passed (not I, so on systems with very low-resolution clocks this +might introduce a small delay).
If multiple timers become ready during the +same loop iteration then the ones with earlier time-out values are invoked +before ones of the same priority with later time-out values (but this is +no longer true when a callback calls C recursively). =head3 Be smart about timeouts @@ -1351,7 +1780,7 @@ At start: - ev_timer_init (timer, callback); + ev_init (timer, callback); timer->repeat = 60.; ev_timer_again (loop, timer); @@ -1393,14 +1822,14 @@ // if last_activity + 60. is older than now, we did time out if (timeout < now) { - // timeout occured, take action + // timeout occurred, take action } else { // callback was invoked, but there was some activity, re-arm // the watcher to fire in last_activity + 60, which is // guaranteed to be in the future, so "again" is positive: - w->again = timeout - now; + w->repeat = timeout - now; ev_timer_again (EV_A_ w); } } @@ -1423,14 +1852,14 @@ to the current time (meaning we just have some activity :), then call the callback, which will "do the right thing" and start the timer: - ev_timer_init (timer, callback); + ev_init (timer, callback); last_activity = ev_now (loop); - callback (loop, timer, EV_TIMEOUT); + callback (loop, timer, EV_TIMER); And when there is some activity, simply store the current time in C, no libev calls at all: - last_actiivty = ev_now (loop); + last_activity = ev_now (loop); This technique is slightly more complex, but in most cases where the time-out is unlikely to be triggered, much more efficient. @@ -1478,7 +1907,7 @@ Establishing the current time is a costly operation (it usually takes at least two system calls): EV therefore updates its idea of the current -time only before and after C collects new events, which causes a +time only before and after C collects new events, which causes a growing difference between C and C when handling lots of events in one iteration. @@ -1494,6 +1923,36 @@ update of the time returned by C by calling C. 
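The arithmetic behind the last_activity technique in the time-out callback shown earlier can be isolated into a plain C helper. This is a sketch under assumptions: C<idle_timeout_check> and its parameters are hypothetical names that are not part of libev, the 60 second limit is passed in as C<limit>, and in a real callback the computed delay would be stored in C<< w->repeat >> before calling C<ev_timer_again>.

```c
/* Hypothetical helper, not part of libev.
 * Returns 1 if the inactivity limit has been exceeded.
 * Otherwise returns 0 and stores in *rearm the delay after which
 * the timer should fire next. */
static int
idle_timeout_check (double last_activity, double now, double limit,
                    double *rearm)
{
  double timeout = last_activity + limit;

  if (timeout < now)
    return 1;             /* no activity for longer than limit */

  *rearm = timeout - now; /* never negative, as timeout >= now here */
  return 0;
}
```

The point of the design is that recording activity costs only a store to last_activity; the timer is touched only when it actually fires.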
+=head3 The special problems of suspended animation + +When you leave the server world it is quite customary to hit machines that +can suspend/hibernate - what happens to the clocks during such a suspend? + +Some quick tests made with a Linux 2.6.28 indicate that a suspend freezes +all processes, while the clocks (C, C) continue +to run until the system is suspended, but they will not advance while the +system is suspended. That means, on resume, it will be as if the program +was frozen for a few seconds, but the suspend time will not be counted +towards C when a monotonic clock source is used. The real time +clock advanced as expected, but if it is used as the sole clock source, then a +long suspend would be detected as a time jump by libev, and timers would +be adjusted accordingly. + +I would not be surprised to see different behaviour between +operating systems, OS versions or even different hardware. + +The other form of suspend (job control, or sending a SIGSTOP) will see a +time jump in the monotonic clocks and the realtime clock. If the program +is suspended for a very long time, and monotonic clock sources are in use, +then you can expect Cs to expire as the full suspension time +will be counted towards the timers. When no monotonic clock source is in +use, then libev will again assume a time jump and adjust accordingly. + +It might be beneficial for this latter case to call C +and C in code that handles C, to at least get +deterministic behaviour in this case (you can do nothing against +C). + =head3 Watcher-Specific Functions and Data Members =over 4 @@ -1526,9 +1985,21 @@ If the timer is repeating, either start it if necessary (with the C value), or reset the running timer to the C value. -This sounds a bit complicated, see "Be smart about timeouts", above, for a +This sounds a bit complicated, see L, above, for a usage example. +=item ev_tstamp ev_timer_remaining (loop, ev_timer *) + +Returns the remaining time until a timer fires.
If the timer is active, +then this time is relative to the current event loop time, otherwise it's +the timeout value currently configured. + +That is, after an C, C returns +C<5>. When the timer is started and one second passes, C +will return C<4>. When the timer expires and is restarted, it will return +roughly C<7> (likely slightly less as callback invocation takes some time, +too), and so on. + =item ev_tstamp repeat [read-write] The current C value. Will be used each time the watcher times out @@ -1563,7 +2034,7 @@ ev_timer mytimer; ev_timer_init (&mytimer, timeout_cb, 0., 10.); /* note, only repeat used */ ev_timer_again (&mytimer); /* start timer */ - ev_loop (loop, 0); + ev_run (loop, 0); // and in some piece of code that gets executed on any "activity": // reset the timeout to start ticking again at 10 seconds @@ -1575,52 +2046,63 @@ Periodic watchers are also timers of a kind, but they are very versatile (and unfortunately a bit complex). -Unlike C's, they are not based on real time (or relative time) -but on wall clock time (absolute time). You can tell a periodic watcher -to trigger after some specific point in time. For example, if you tell a -periodic watcher to trigger in 10 seconds (by specifying e.g. C, that is, an absolute time not a delay) and then reset your system -clock to January of the previous year, then it will take more than year -to trigger the event (unlike an C, which would still trigger -roughly 10 seconds later as it uses a relative timeout). - -Cs can also be used to implement vastly more complex timers, -such as triggering an event on each "midnight, local time", or other -complicated rules. +Unlike C, periodic watchers are not based on real time (or +relative time, the physical time that passes) but on wall clock time +(absolute time, the thing you can read on your calendar or clock). The +difference is that wall clock time can run faster or slower than real +time, and time jumps are not uncommon (e.g.
when you adjust your +wrist-watch). + +You can tell a periodic watcher to trigger after some specific point +in time: for example, if you tell a periodic watcher to trigger "in 10 +seconds" (by specifying e.g. C, that is, an absolute time +not a delay) and then reset your system clock to January of the previous +year, then it will take a year or more to trigger the event (unlike an +C, which would still trigger roughly 10 seconds after starting +it, as it uses a relative timeout). + +C watchers can also be used to implement vastly more complex +timers, such as triggering an event on each "midnight, local time", or +other complicated rules. This cannot be done with C watchers, as +those cannot react to time jumps. As with timers, the callback is guaranteed to be invoked only when the -time (C) has passed, but if multiple periodic timers become ready -during the same loop iteration, then order of execution is undefined. +point in time where it is supposed to trigger has passed. If multiple +timers become ready during the same loop iteration then the ones with +earlier time-out values are invoked before ones with later time-out values +(but this is no longer true when a callback calls C recursively). =head3 Watcher-Specific Functions and Data Members =over 4 -=item ev_periodic_init (ev_periodic *, callback, ev_tstamp at, ev_tstamp interval, reschedule_cb) +=item ev_periodic_init (ev_periodic *, callback, ev_tstamp offset, ev_tstamp interval, reschedule_cb) -=item ev_periodic_set (ev_periodic *, ev_tstamp after, ev_tstamp repeat, reschedule_cb) +=item ev_periodic_set (ev_periodic *, ev_tstamp offset, ev_tstamp interval, reschedule_cb) -Lots of arguments, lets sort it out... There are basically three modes of +Lots of arguments, let's sort it out... 
There are basically three modes of operation, and we will explain them from simplest to most complex: =over 4 -=item * absolute timer (at = time, interval = reschedule_cb = 0) +=item * absolute timer (offset = absolute time, interval = 0, reschedule_cb = 0) In this configuration the watcher triggers an event after the wall clock -time C has passed. It will not repeat and will not adjust when a time -jump occurs, that is, if it is to be run at January 1st 2011 then it will -only run when the system clock reaches or surpasses this time. +time C has passed. It will not repeat and will not adjust when a +time jump occurs, that is, if it is to be run at January 1st 2011 then it +will be stopped and invoked when the system clock reaches or surpasses +this point in time. -=item * repeating interval timer (at = offset, interval > 0, reschedule_cb = 0) +=item * repeating interval timer (offset = offset within interval, interval > 0, reschedule_cb = 0) In this mode the watcher will always be scheduled to time out at the next -C time (for some integer N, which can also be negative) -and then repeat, regardless of any time jumps. +C time (for some integer N, which can also be +negative) and then repeat, regardless of any time jumps. The C +argument is merely an offset into the C periods. This can be used to create timers that do not drift with respect to the -system clock, for example, here is a C that triggers each -hour, on the hour: +system clock, for example, here is an C that triggers each +hour, on the hour (with respect to UTC): ev_periodic_set (&periodic, 0., 3600., 0); @@ -1631,9 +2113,9 @@ Another way to think about it (for the mathematically inclined) is that C will try to run the callback in this mode at the next possible -time where C