--- libev/ev.pod 2010/10/14 05:07:04 1.305 +++ libev/ev.pod 2013/04/28 14:57:12 1.427 @@ -28,8 +28,8 @@ // with its corresponding stop function. ev_io_stop (EV_A_ w); - // this causes all nested ev_loop's to stop iterating - ev_unloop (EV_A_ EVUNLOOP_ALL); + // this causes all nested ev_run's to stop iterating + ev_break (EV_A_ EVBREAK_ALL); } // another callback, this time for a time-out @@ -37,15 +37,15 @@ timeout_cb (EV_P_ ev_timer *w, int revents) { puts ("timeout"); - // this causes the innermost ev_loop to stop iterating - ev_unloop (EV_A_ EVUNLOOP_ONE); + // this causes the innermost ev_run to stop iterating + ev_break (EV_A_ EVBREAK_ONE); } int main (void) { // use the default event loop unless you have special needs - struct ev_loop *loop = ev_default_loop (0); + struct ev_loop *loop = EV_DEFAULT; // initialise an io watcher, then start it // this one will watch for stdin to become readable @@ -58,9 +58,9 @@ ev_timer_start (loop, &timeout_watcher); // now wait for events to arrive - ev_loop (loop, 0); + ev_run (loop, 0); - // unloop was called, so exit + // break was called, so exit return 0; } @@ -80,6 +80,14 @@ Familiarity with event based programming techniques in general is assumed throughout this document. +=head1 WHAT TO READ WHEN IN A HURRY + +This manual tries to be very detailed, but unfortunately, this also makes +it very long. If you just want to know the basics of libev, I suggest +reading L, then the L above and +look up the missing functions in L and the C and +C sections in L. + =head1 ABOUT LIBEV Libev is an event loop: you register interest in certain events (such as a @@ -126,7 +134,7 @@ =head2 TIME REPRESENTATION Libev represents time as a single floating point number, representing -the (fractional) number of seconds since the (POSIX) epoch (in practise +the (fractional) number of seconds since the (POSIX) epoch (in practice somewhere near the beginning of 1970, details are complicated, don't ask). This type is called C, which is what you should use too. It usually aliases to the C type in C. When you need to do @@ -167,13 +175,20 @@ Returns the current time as libev would use it. Please note that the C function is usually faster and also often returns the timestamp -you actually want to know. +you actually want to know. Also interesting is the combination of +C and C. =item ev_sleep (ev_tstamp interval) -Sleep for the given interval: The current thread will be blocked until -either it is interrupted or the given time interval has passed. Basically -this is a sub-second-resolution C. +Sleep for the given interval: The current thread will be blocked +until either it is interrupted or the given time interval has +passed (approximately - it might return a bit earlier even if not +interrupted). Returns immediately if C<< interval <= 0 >>. + +Basically this is a sub-second-resolution C. + +The range of the C is limited - libev only guarantees to work +with sleep times of up to one day (C<< interval <= 86400 >>). =item int ev_version_major () @@ -194,7 +209,8 @@ not a problem. Example: Make sure we haven't accidentally been linked against the wrong -version (note, however, that this will not detect ABI mismatches :). +version (note, however, that this will not detect other ABI mismatches, +such as LFS or reentrancy). assert (("libev version mismatch", ev_version_major () == EV_VERSION_MAJOR @@ -215,24 +231,25 @@ =item unsigned int ev_recommended_backends () -Return the set of all backends compiled into this binary of libev and also -recommended for this platform. This set is often smaller than the one -returned by C, as for example kqueue is broken on -most BSDs and will not be auto-detected unless you explicitly request it -(assuming you know what you are doing). This is the set of backends that -libev will probe for if you specify no backends explicitly. +Return the set of all backends compiled into this binary of libev and +also recommended for this platform, meaning it will work for most file +descriptor types. This set is often smaller than the one returned by +C, as for example kqueue is broken on most BSDs +and will not be auto-detected unless you explicitly request it (assuming +you know what you are doing). This is the set of backends that libev will +probe for if you specify no backends explicitly. =item unsigned int ev_embeddable_backends () Returns the set of backends that are embeddable in other event loops. This -is the theoretical, all-platform, value. To find which backends -might be supported on the current system, you would need to look at -C, likewise for -recommended ones. +value is platform-specific but can include backends not available on the +current system. To find which embeddable backends might be supported on +the current system, you would need to look at C, likewise for recommended ones. See the description of C watchers for more info. -=item ev_set_allocator (void *(*cb)(void *ptr, long size)) [NOT REENTRANT] +=item ev_set_allocator (void *(*cb)(void *ptr, long size) throw ()) Sets the allocation function to use (the prototype is similar - the semantics are identical to the C C89/SuS/POSIX function). It is @@ -268,7 +285,7 @@ ... ev_set_allocator (persistent_realloc); -=item ev_set_syserr_cb (void (*cb)(const char *msg)); [NOT REENTRANT] +=item ev_set_syserr_cb (void (*cb)(const char *msg) throw ()) Set the callback function to call on a retryable system call error (such as failed select, poll, epoll_wait). The message is a printable string @@ -290,40 +307,78 @@ ... ev_set_syserr_cb (fatal_error); +=item ev_feed_signal (int signum) + +This function can be used to "simulate" a signal receive. It is completely +safe to call this function at any time, from any context, including signal +handlers or random threads. + +Its main use is to customise signal handling in your process, especially +in the presence of threads. For example, you could block signals +by default in all threads (and specifying C when +creating any loops), and in one thread, use C or any other +mechanism to wait for signals, then "deliver" them to libev by calling +C. + =back -=head1 FUNCTIONS CONTROLLING THE EVENT LOOP +=head1 FUNCTIONS CONTROLLING EVENT LOOPS -An event loop is described by a C (the C -is I optional in this case, as there is also an C -I). +An event loop is described by a C (the C is +I optional in this case unless libev 3 compatibility is disabled, as +libev 3 had an C function colliding with the struct name). The library knows two types of such loops, the I loop, which -supports signals and child events, and dynamically created loops which do -not. +supports child process events, and dynamically created event loops which +do not. =over 4 =item struct ev_loop *ev_default_loop (unsigned int flags) -This will initialise the default event loop if it hasn't been initialised -yet and return it. If the default loop could not be initialised, returns -false. If it already was initialised it simply returns it (and ignores the -flags. If that is troubling you, check C afterwards). +This returns the "default" event loop object, which is what you should +normally use when you just need "the event loop". Event loop objects and +the C parameter are described in more detail in the entry for +C. + +If the default loop is already initialised then this function simply +returns it (and ignores the flags. If that is troubling you, check +C afterwards). Otherwise it will create it with the given +flags, which should almost always be C<0>, unless the caller is also the +one calling C or otherwise qualifies as "the main program". If you don't know what event loop to use, use the one returned from this -function. +function (or via the C macro). Note that this function is I thread-safe, so if you want to use it -from multiple threads, you have to lock (note also that this is unlikely, -as loops cannot be shared easily between threads anyway). +from multiple threads, you have to employ some kind of mutex (note also +that this case is unlikely, as loops cannot be shared easily between +threads anyway). + +The default loop is the only loop that can handle C watchers, +and to do this, it always registers a handler for C. If this is +a problem for your application you can either create a dynamic loop with +C which doesn't do that, or you can simply overwrite the +C signal handler I calling C. + +Example: This is the most typical usage. + + if (!ev_default_loop (0)) + fatal ("could not initialise libev, bad $LIBEV_FLAGS in environment?"); + +Example: Restrict libev to the select and poll backends, and do not allow +environment settings to be taken into account: + + ev_default_loop (EVBACKEND_POLL | EVBACKEND_SELECT | EVFLAG_NOENV); + +=item struct ev_loop *ev_loop_new (unsigned int flags) -The default loop is the only loop that can handle C and -C watchers, and to do this, it always registers a handler -for C. If this is a problem for your application you can either -create a dynamic loop with C that doesn't do that, or you -can simply overwrite the C signal handler I calling -C. +This will create and initialise a new event loop object. If the loop +could not be initialised, returns false. + +This function is thread-safe, and one common way to use libev with +threads is indeed to create one loop per thread, and using the default +loop in the "main" or "initial" thread. The flags argument can be used to specify special behaviour or specific backends to use, and is usually specified as C<0> (or C). @@ -343,8 +398,10 @@ or setgid) then libev will I look at the environment variable C. Otherwise (the default), this environment variable will override the flags completely if it is found in the environment. This is -useful to try out specific backends to test their performance, or to work -around bugs. +useful to try out specific backends to test their performance, to work +around bugs, or to make libev threadsafe (accessing environment variables +cannot be done in a threadsafe way, but usually it works if no other +thread modifies them). =item C @@ -368,14 +425,14 @@ =item C When this flag is specified, then libev will not attempt to use the -I API for it's C watchers. Apart from debugging and +I API for its C watchers. Apart from debugging and testing, this flag can be useful to conserve inotify file descriptors, as otherwise each loop using C watchers consumes one inotify handle. =item C When this flag is specified, then libev will attempt to use the -I API for it's C (and C) watchers. This API +I API for its C (and C) watchers. This API delivers signals synchronously, which makes it both faster and might make it possible to get the queued signal data. It can also simplify signal handling with threads, as long as you properly block signals in your @@ -385,6 +442,21 @@ there are a lot of shoddy libraries and programs (glib's threadpool for example) that can't properly initialise their signal masks. +=item C + +When this flag is specified, then libev will avoid to modify the signal +mask. Specifically, this means you have to make sure signals are unblocked +when you want to receive them. + +This behaviour is useful when you want to do your own signal handling, or +want to handle signals only in specific threads and want to avoid libev +unblocking the signals. + +It's also required by POSIX in a threaded program, as libev calls +C, whose behaviour is officially unspecified. + +This flag's behaviour will become the default in future versions of libev. + =item C (value 1, portable select backend) This is your standard select(2) backend. Not I standard, as @@ -421,27 +493,38 @@ Use the linux-specific epoll(7) interface (for both pre- and post-2.6.9 kernels). -For few fds, this backend is a bit little slower than poll and select, -but it scales phenomenally better. While poll and select usually scale -like O(total_fds) where n is the total number of fds (or the highest fd), -epoll scales either O(1) or O(active_fds). +For few fds, this backend is a bit little slower than poll and select, but +it scales phenomenally better. While poll and select usually scale like +O(total_fds) where total_fds is the total number of fds (or the highest +fd), epoll scales either O(1) or O(active_fds). The epoll mechanism deserves honorable mention as the most misdesigned of the more advanced event mechanisms: mere annoyances include silently dropping file descriptors, requiring a system call per change per file -descriptor (and unnecessary guessing of parameters), problems with dup and -so on. The biggest issue is fork races, however - if a program forks then -I parent and child process have to recreate the epoll set, which can -take considerable time (one syscall per file descriptor) and is of course -hard to detect. - -Epoll is also notoriously buggy - embedding epoll fds I work, but -of course I, and epoll just loves to report events for totally -I file descriptors (even already closed ones, so one cannot -even remove them from the set) than registered in the set (especially -on SMP systems). Libev tries to counter these spurious notifications by -employing an additional generation counter and comparing that against the -events to filter out spurious ones, recreating the set when required. +descriptor (and unnecessary guessing of parameters), problems with dup, +returning before the timeout value, resulting in additional iterations +(and only giving 5ms accuracy while select on the same platform gives +0.1ms) and so on. The biggest issue is fork races, however - if a program +forks then I parent and child process have to recreate the epoll +set, which can take considerable time (one syscall per file descriptor) +and is of course hard to detect. + +Epoll is also notoriously buggy - embedding epoll fds I work, +but of course I, and epoll just loves to report events for +totally I file descriptors (even already closed ones, so +one cannot even remove them from the set) than registered in the set +(especially on SMP systems). Libev tries to counter these spurious +notifications by employing an additional generation counter and comparing +that against the events to filter out spurious ones, recreating the set +when required. Epoll also erroneously rounds down timeouts, but gives you +no way to know when and by how much, so sometimes you have to busy-wait +because epoll returns immediately despite a nonzero timeout. And last +not least, it also refuses to work with some file descriptors which work +perfectly fine with C (or libev) on file descriptors +representing files, and expect it to become ready when their program +doesn't block on disk accesses (which can take a long time on their own). + +However, this cannot ever work in the "expected" way - you get a readiness +notification as soon as the kernel knows whether and how much data is +there, and in the case of open files, that's always the case, so you +always get a readiness notification instantly, and your read (or possibly +write) will still block on the disk I/O. + +Another way to view it is that in the case of sockets, pipes, character +devices and so on, there is another party (the sender) that delivers data +on its own, but in the case of files, there is no such thing: the disk +will not send data on its own, simply because it doesn't know what you +wish to read - you would first have to request some data. + +Since files are typically not-so-well supported by advanced notification +mechanism, libev tries hard to emulate POSIX behaviour with respect +to files, even though you should not use it. The reason for this is +convenience: sometimes you want to watch STDIN or STDOUT, which is +usually a tty, often a pipe, but also sometimes files or special devices +(for example, C on Linux works with F but not with +F), and even though the file might better be served with +asynchronous I/O instead of with non-blocking I/O, it is still useful when +it "just works" instead of freezing. + +So avoid file descriptors pointing to files when you know it (e.g. use +libeio), but use them when it is convenient, e.g. for STDIN/STDOUT, or +when you rarely read from a file instead of from a socket, and want to +reuse the same code path. + =head3 The special problem of fork Some backends (epoll, kqueue) do not support C at all or exhibit useless behaviour. Libev fully supports fork, but needs to be told about -it in the child. +it in the child if you want to continue to use it in the child. -To support fork in your programs, you either have to call -C or C after a fork in the child, -enable C, or resort to C or -C. +To support fork in your child processes, you have to call C after a fork in the child, enable C, or resort to +C or C. =head3 The special problem of SIGPIPE @@ -1624,7 +1768,7 @@ ev_io stdin_readable; ev_io_init (&stdin_readable, stdin_readable_cb, STDIN_FILENO, EV_READ); ev_io_start (loop, &stdin_readable); - ev_loop (loop, 0); + ev_run (loop, 0); =head2 C - relative and optionally repeating timeouts @@ -1640,10 +1784,11 @@ The callback is guaranteed to be invoked only I its timeout has passed (not I, so on systems with very low-resolution clocks this -might introduce a small delay). If multiple timers become ready during the -same loop iteration then the ones with earlier time-out values are invoked -before ones of the same priority with later time-out values (but this is -no longer true when a callback calls C recursively). +might introduce a small delay, see "the special problem of being too +early", below). If multiple timers become ready during the same loop +iteration then the ones with earlier time-out values are invoked before +ones of the same priority with later time-out values (but this is no +longer true when a callback calls C recursively). =head3 Be smart about timeouts @@ -1728,63 +1873,77 @@ but remember the time of last activity, and check for a real timeout only within the callback: + ev_tstamp timeout = 60.; ev_tstamp last_activity; // time of last activity + ev_timer timer; static void callback (EV_P_ ev_timer *w, int revents) { - ev_tstamp now = ev_now (EV_A); - ev_tstamp timeout = last_activity + 60.; + // calculate when the timeout would happen + ev_tstamp after = last_activity - ev_now (EV_A) + timeout; - // if last_activity + 60. is older than now, we did time out - if (timeout < now) + // if negative, it means we the timeout already occurred + if (after < 0.) { // timeout occurred, take action } else { - // callback was invoked, but there was some activity, re-arm - // the watcher to fire in last_activity + 60, which is - // guaranteed to be in the future, so "again" is positive: - w->repeat = timeout - now; - ev_timer_again (EV_A_ w); + // callback was invoked, but there was some recent + // activity. simply restart the timer to time out + // after "after" seconds, which is the earliest time + // the timeout can occur. + ev_timer_set (w, after, 0.); + ev_timer_start (EV_A_ w); } } -To summarise the callback: first calculate the real timeout (defined -as "60 seconds after the last activity"), then check if that time has -been reached, which means something I, in fact, time out. Otherwise -the callback was invoked too early (C is in the future), so -re-schedule the timer to fire at that future time, to see if maybe we have -a timeout then. - -Note how C is used, taking advantage of the -C optimisation when the timer is already running. +To summarise the callback: first calculate in how many seconds the +timeout will occur (by calculating the absolute time when it would occur, +C, and subtracting the current time, C from that). + +If this value is negative, then we are already past the timeout, i.e. we +timed out, and need to do whatever is needed in this case. + +Otherwise, we now the earliest time at which the timeout would trigger, +and simply start the timer with this timeout value. + +In other words, each time the callback is invoked it will check whether +the timeout occurred. If not, it will simply reschedule itself to check +again at the earliest time it could time out. Rinse. Repeat. This scheme causes more callback invocations (about one every 60 seconds minus half the average time between activity), but virtually no calls to libev to change the timeout. -To start the timer, simply initialise the watcher and set C -to the current time (meaning we just have some activity :), then call the -callback, which will "do the right thing" and start the timer: +To start the machinery, simply initialise the watcher and set +C to the current time (meaning there was some activity just +now), then call the callback, which will "do the right thing" and start +the timer: + + last_activity = ev_now (EV_A); + ev_init (&timer, callback); + callback (EV_A_ &timer, 0); - ev_init (timer, callback); - last_activity = ev_now (loop); - callback (loop, timer, EV_TIMER); - -And when there is some activity, simply store the current time in +When there is some activity, simply store the current time in C, no libev calls at all: - last_activity = ev_now (loop); + if (activity detected) + last_activity = ev_now (EV_A); + +When your timeout value changes, then the timeout can be changed by simply +providing a new value, stopping the timer and calling the callback, which +will again do the right thing (for example, time out immediately :). + + timeout = new_value; + ev_timer_stop (EV_A_ &timer); + callback (EV_A_ &timer, 0); This technique is slightly more complex, but in most cases where the time-out is unlikely to be triggered, much more efficient. -Changing the timeout is trivial as well (if it isn't hard-coded in the -callback :) - just change the timeout and invoke the callback, which will -fix things for you. - =item 4. Wee, just use a double-linked list for your timeouts. If there is not one request, but many thousands (millions...), all @@ -1820,11 +1979,48 @@ off after the first million or so of active timers, i.e. it's usually overkill :) +=head3 The special problem of being too early + +If you ask a timer to call your callback after three seconds, then +you expect it to be invoked after three seconds - but of course, this +cannot be guaranteed to infinite precision. Less obviously, it cannot be +guaranteed to any precision by libev - imagine somebody suspending the +process with a STOP signal for a few hours for example. + +So, libev tries to invoke your callback as soon as possible I the +delay has occurred, but cannot guarantee this. + +A less obvious failure mode is calling your callback too early: many event +loops compare timestamps with a "elapsed delay >= requested delay", but +this can cause your callback to be invoked much earlier than you would +expect. + +To see why, imagine a system with a clock that only offers full second +resolution (think windows if you can't come up with a broken enough OS +yourself). If you schedule a one-second timer at the time 500.9, then the +event loop will schedule your timeout to elapse at a system time of 500 +(500.9 truncated to the resolution) + 1, or 501. + +If an event library looks at the timeout 0.1s later, it will see "501 >= +501" and invoke the callback 0.1s after it was started, even though a +one-second delay was requested - this is being "too early", despite best +intentions. + +This is the reason why libev will never invoke the callback if the elapsed +delay equals the requested delay, but only when the elapsed delay is +larger than the requested delay. In the example above, libev would only invoke +the callback at system time 502, or 1.1s after the timer was started. + +So, while libev cannot guarantee that your callback will be invoked +exactly when requested, it I and I guarantee that the requested +delay has actually elapsed, or in other words, it always errs on the "too +late" side of things. + =head3 The special problem of time updates -Establishing the current time is a costly operation (it usually takes at -least two system calls): EV therefore updates its idea of the current -time only before and after C collects new events, which causes a +Establishing the current time is a costly operation (it usually takes +at least one system call): EV therefore updates its idea of the current +time only before and after C collects new events, which causes a growing difference between C and C when handling lots of events in one iteration. @@ -1840,6 +2036,39 @@ update of the time returned by C by calling C. +=head3 The special problem of unsynchronised clocks + +Modern systems have a variety of clocks - libev itself uses the normal +"wall clock" clock and, if available, the monotonic clock (to avoid time +jumps). + +Neither of these clocks is synchronised with each other or any other clock +on the system, so C might return a considerably different time +than C or C