[ViewVC] Diff of: cvs/libev/ev

Comparing libev/ev_iouring.c (file contents):
Revision 1.14 by root, Sat Dec 28 05:20:17 2019 UTC vs.
Revision 1.19 by root, Sat Dec 28 07:58:51 2019 UTC

…		…
44	* b) best is not necessarily very good.	44	* b) best is not necessarily very good.
45	* c) it's better than the aio mess, doesn't suffer from the fork problems	45	* c) it's better than the aio mess, doesn't suffer from the fork problems
46	* of linux aio or epoll and so on and so on. and you could do event stuff	46	* of linux aio or epoll and so on and so on. and you could do event stuff
47	* without any syscalls. what's not to like?	47	* without any syscalls. what's not to like?
48	* d) ok, it's vastly more complex, but that's ok, really.	48	* d) ok, it's vastly more complex, but that's ok, really.
49	* e) why 3 mmaps instead of one? one would be more space-efficient,	49	* e) why two mmaps instead of one? one would be more space-efficient,
50	* and I can't see what benefit three would have (other than being	50	* and I can't see what benefit two would have (other than being
51	* somehow resizable/relocatable, but that's apparently not possible).	51	* somehow resizable/relocatable, but that's apparently not possible).
52	* (FIXME: newer kernels can use 2 mmaps only, need to look into this).
53	* f) hmm, it's practiclaly undebuggable (gdb can't access the memory, and	52	* f) hmm, it's practically undebuggable (gdb can't access the memory, and
54	* the bizarre way structure offsets are communicated makes it hard to	53	* the bizarre way structure offsets are communicated makes it hard to
55	* just print the ring buffer heads, even iff the memory were visible	54	* just print the ring buffer heads, even iff the memory were visible
56	* in gdb. but then, that's also ok, really.	55	* in gdb. but then, that's also ok, really.
57	* g) well, you cannot specify a timeout when waiting for events. no,	56	* g) well, you cannot specify a timeout when waiting for events. no,
58	* seriously, the interface doesn't support a timeout. never seen _that_	57	* seriously, the interface doesn't support a timeout. never seen _that_
59	* before. sure, you can use a timerfd, but that's another syscall	58	* before. sure, you can use a timerfd, but that's another syscall
60	* you could have avoided. overall, this bizarre omission smells	59	* you could have avoided. overall, this bizarre omission smells
61	* like a µ-optimisation by the io_uring author for his personal	60	* like a µ-optimisation by the io_uring author for his personal
62	* applications, to the detriment of everybody else who just wants	61	* applications, to the detriment of everybody else who just wants
63	* an event loop. but, umm, ok, if that's all, it could be worse.	62	* an event loop. but, umm, ok, if that's all, it could be worse.
64	* (FIXME: jens mentioned timeout commands, need to investigate)	63	* (from what I gather from the author Jens Axboe, it simply didn't
		64	* occur to him, and he made good on it by adding an unlimited nuber
		65	* of timeouts later :).
65	* h) there is a hardcoded limit of 4096 outstanding events. okay,	66	* h) initially there was a hardcoded limit of 4096 outstanding events.
66	* at least there is no arbitrary low system-wide limit...	67	* later versions not only bump this to 32k, but also can handle
67	* (FIXME: apparently, this was increased to 32768 in later kernels(	68	* an unlimited amount of events, so this only affects the batch size.
68	* i) unlike linux aio, you can register more then the limit	69	* i) unlike linux aio, you can register more then the limit
69	* of fd events, and the kernel will "gracefully" signal an	70	* of fd events. while early verisons of io_uring signalled an overflow
70	* overflow, after which you could destroy and recreate the kernel	71	* and you ended up getting wet. 5.5+ does not do this anymore.
71	* state, a bit bigger, or fall back to e.g. poll. thats not
72	* totally insane, but kind of questions the point a high
73	* performance I/O framework when it doesn't really work
74	* under stress.
75	* (FIXME: iouring should no longer drop events, need to investigate)
76	* j) but, oh my! is has exactly the same bugs as the linux aio backend,	72	* j) but, oh my! it had exactly the same bugs as the linux aio backend,
77	* where some undocumented poll combinations just fail.	73	* where some undocumented poll combinations just fail. fortunately,
78	* so we need epoll AGAIN as a fallback. AGAIN! epoll!! and of course,	74	* after finally reaching the author, he was more than willing to fix
79	* this is completely undocumented, have I mantioned this already?	75	* this probably in 5.6+.
80	* k) overall, the API itself is, I dare to say, not a total trainwreck.	76	* k) overall, the API itself is, I dare to say, not a total trainwreck.
81	* the big isuess with it are the bugs requiring epoll, which might	77	* once the bugs ae fixed (probably in 5.6+), it will be without
82	* or might not get fixed (do I hold my breath?).	78	* competition.
83	*/	79	*/
84		80
85	/* TODO: use internal TIMEOUT */	81	/* TODO: use internal TIMEOUT */
86	/* TODO: take advantage of single mmap, NODROP etc. */	82	/* TODO: take advantage of single mmap, NODROP etc. */
87	/* TODO: resize cq/sq size independently */	83	/* TODO: resize cq/sq size independently */
…		…
228		224
229	/* the submit/completion queue entries */	225	/* the submit/completion queue entries */
230	#define EV_SQES ((struct io_uring_sqe *) iouring_sqes)	226	#define EV_SQES ((struct io_uring_sqe *) iouring_sqes)
231	#define EV_CQES ((struct io_uring_cqe *)((char *)iouring_cq_ring + iouring_cq_cqes))	227	#define EV_CQES ((struct io_uring_cqe *)((char *)iouring_cq_ring + iouring_cq_cqes))
232		228
233	/* TODO: this is not enough, we might have to reap events */	229	inline_speed
234	/* TODO: but we can't, as that will re-arm events, causing */	230	int
235	/* TODO: an endless loop in fd_reify */
236	static int
237	iouring_enter (EV_P_ ev_tstamp timeout)	231	iouring_enter (EV_P_ ev_tstamp timeout)
238	{	232	{
239	int res;	233	int res;
240		234
241	EV_RELEASE_CB;	235	EV_RELEASE_CB;
…		…
249		243
250	EV_ACQUIRE_CB;	244	EV_ACQUIRE_CB;
251		245
252	return res;	246	return res;
253	}	247	}
		248
		249	/* TODO: can we move things around so we don't need this forward-reference? */
		250	static void
		251	iouring_poll (EV_P_ ev_tstamp timeout);
254		252
255	static	253	static
256	struct io_uring_sqe *	254	struct io_uring_sqe *
257	iouring_sqe_get (EV_P)	255	iouring_sqe_get (EV_P)
258	{	256	{
		257	unsigned tail;
		258
		259	for (;;)
		260	{
259	unsigned tail = EV_SQ_VAR (tail);	261	tail = EV_SQ_VAR (tail);
260		262
261	while (ecb_expect_false (tail + 1 - EV_SQ_VAR (head) > EV_SQ_VAR (ring_entries)))	263	if (ecb_expect_true (tail + 1 - EV_SQ_VAR (head) <= EV_SQ_VAR (ring_entries)))
262	{	264	break; /* whats the problem, we have free sqes */
263	/* queue full, need to flush */
264		265
		266	/* queue full, need to flush and possibly handle some events */
		267
		268	#if EV_FEATURE_CODE
		269	/* first we ask the kernel nicely, most often this frees up some sqes */
265	int res = iouring_enter (EV_A_ EV_TS_CONST (0.));	270	int res = iouring_enter (EV_A_ EV_TS_CONST (0.));
266		271
267	/* io_uring_enter might fail with EBUSY and won't submit anything */	272	ECB_MEMORY_FENCE_ACQUIRE; /* better safe than sorry */
268	/* unfortunately, we can't handle this at the moment */
269		273
270	if (res < 0 && errno == EBUSY)	274	if (res >= 0)
271	//TODO	275	continue; /* yes, it worked, try again */
272	ev_syserr ("(libev) io_uring_enter could not clear sq");	276	#endif
273	else	277
274	break;	278	/* some problem, possibly EBUSY - do the full poll and let it handle any issues */
275		279
		280	iouring_poll (EV_A_ EV_TS_CONST (0.));
276	/* iouring_poll should have done ECB_MEMORY_FENCE_ACQUIRE */	281	/* iouring_poll should have done ECB_MEMORY_FENCE_ACQUIRE for us */
277	}	282	}
278		283
279	/*assert (("libev: io_uring queue full after flush", tail + 1 - EV_SQ_VAR (head) <= EV_SQ_VAR (ring_entries)));*/	284	/*assert (("libev: io_uring queue full after flush", tail + 1 - EV_SQ_VAR (head) <= EV_SQ_VAR (ring_entries)));*/
280		285
281	return EV_SQES + (tail & EV_SQ_VAR (ring_mask));	286	return EV_SQES + (tail & EV_SQ_VAR (ring_mask));
…		…
350		355
351	if (errno != EINVAL)	356	if (errno != EINVAL)
352	return -1; /* we failed */	357	return -1; /* we failed */
353		358
354	#if TODO	359	#if TODO
355	if ((~params.features) & (IORING_FEAT_NODROP | IORING_FEATURE_SINGLE_MMAP))	360	if ((~params.features) & (IORING_FEAT_NODROP | IORING_FEATURE_SINGLE_MMAP | IORING_FEAT_SUBMIT_STABLE))
356	return -1; /* we require the above features */	361	return -1; /* we require the above features */
357	#endif	362	#endif
358		363
359	/* EINVAL: lots of possible reasons, but maybe	364	/* EINVAL: lots of possible reasons, but maybe
360	* it is because we hit the unqueryable hardcoded size limit	365	* it is because we hit the unqueryable hardcoded size limit
…		…
438	/* Jens Axboe notified me that user_data is not what is documented, but is	443	/* Jens Axboe notified me that user_data is not what is documented, but is
439	* some kind of unique ID that has to match, otherwise the request cannot	444	* some kind of unique ID that has to match, otherwise the request cannot
440	* be removed. Since we don't really have that, we pass in the old	445	* be removed. Since we don't really have that, we pass in the old
441	* generation counter - if that fails, too bad, it will hopefully be removed	446	* generation counter - if that fails, too bad, it will hopefully be removed
442	* at close time and then be ignored. */	447	* at close time and then be ignored. */
443	sqe->user_data = (uint32_t)fd | ((__u64)(uint32_t)anfds [fd].egen << 32);	448	sqe->addr = (uint32_t)fd | ((__u64)(uint32_t)anfds [fd].egen << 32);
		449	sqe->user_data = (uint64_t)-1;
444	iouring_sqe_submit (EV_A_ sqe);	450	iouring_sqe_submit (EV_A_ sqe);
445		451
446	/* increment generation counter to avoid handling old events */	452	/* increment generation counter to avoid handling old events */
447	++anfds [fd].egen;	453	++anfds [fd].egen;
448	}	454	}
…		…
488	iouring_process_cqe (EV_P_ struct io_uring_cqe *cqe)	494	iouring_process_cqe (EV_P_ struct io_uring_cqe *cqe)
489	{	495	{
490	int fd = cqe->user_data & 0xffffffffU;	496	int fd = cqe->user_data & 0xffffffffU;
491	uint32_t gen = cqe->user_data >> 32;	497	uint32_t gen = cqe->user_data >> 32;
492	int res = cqe->res;	498	int res = cqe->res;
		499
		500	/* user_data -1 is a remove that we are not atm. interested in */
		501	if (cqe->user_data == (uint64_t)-1)
		502	return;
493		503
494	assert (("libev: io_uring fd must be in-bounds", fd >= 0 && fd < anfdmax));	504	assert (("libev: io_uring fd must be in-bounds", fd >= 0 && fd < anfdmax));
495		505
496	/* documentation lies, of course. the result value is NOT like	506	/* documentation lies, of course. the result value is NOT like
497	* normal syscalls, but like linux raw syscalls, i.e. negative	507	* normal syscalls, but like linux raw syscalls, i.e. negative
…		…
622		632
623	static void	633	static void
624	iouring_poll (EV_P_ ev_tstamp timeout)	634	iouring_poll (EV_P_ ev_tstamp timeout)
625	{	635	{
626	/* if we have events, no need for extra syscalls, but we might have to queue events */	636	/* if we have events, no need for extra syscalls, but we might have to queue events */
		637	/* we also clar the timeout if there are outstanding fdchanges */
		638	/* the latter should only happen if both the sq and cq are full, most likely */
		639	/* because we have a lot of event sources that immediately complete */
		640	/* TODO: fdchacngecnt is always 0 because fd_reify does not have two buffers yet */
627	if (iouring_handle_cq (EV_A))	641	if (iouring_handle_cq (EV_A) || fdchangecnt)
628	timeout = EV_TS_CONST (0.);	642	timeout = EV_TS_CONST (0.);
629	else	643	else
630	/* no events, so maybe wait for some */	644	/* no events, so maybe wait for some */
631	iouring_tfd_update (EV_A_ timeout);	645	iouring_tfd_update (EV_A_ timeout);
632		646

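Note on the user_data encoding: both revisions pack the fd into the low 32 bits and the generation counter into the high 32 bits of a 64-bit value, and revision 1.19 additionally tags remove requests with `(uint64_t)-1` so that iouring_process_cqe can skip their completions. The sketch below illustrates that encoding in isolation; all function and macro names in it are hypothetical and it does not touch any io_uring structures.

```c
#include <stdint.h>

/* Hypothetical name for the all-ones tag used on remove requests. */
#define REMOVE_TAG ((uint64_t)-1)

/* Pack a non-negative fd (low 32 bits) and a generation counter
   (high 32 bits) into one 64-bit value. */
static inline uint64_t
pack_fd_gen (int fd, uint32_t gen)
{
  return (uint64_t)(uint32_t)fd | ((uint64_t)gen << 32);
}

/* Recover the fd from the low 32 bits. */
static inline int
unpack_fd (uint64_t v)
{
  return (int)(v & 0xffffffffU);
}

/* Recover the generation counter from the high 32 bits. */
static inline uint32_t
unpack_gen (uint64_t v)
{
  return (uint32_t)(v >> 32);
}

/* Nonzero if the value is the remove tag rather than a packed fd/gen pair. */
static inline int
is_remove_tag (uint64_t v)
{
  return v == REMOVE_TAG;
}
```

Because a valid fd is non-negative, its low 32 bits never equal 0xffffffff, so a packed fd/generation pair cannot collide with the all-ones tag.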