Efficient IO with io_uring
io_uring [1] is a new design for Linux IO. It may overcome the problems found in sync IO and in async IO such as aio, and it embraces the polled IO found in modern IO patterns. The article titled “Efficient IO with io_uring” [1] is easy to follow. On Jan 24, 2020, LWN published an excellent summary of the current status of io_uring, “The rapid growth of io_uring” [2], which is also a must-read. I expect io_uring to be used in database and storage systems.
The following are the highlights of “Efficient IO with io_uring”.
Introduction
- sync io: read, write, preadv, pwritev, and even preadv2, pwritev2 [3].
- async io: POSIX aio_read, aio_write; Linux aio.
- problems with async aio in Linux:
  - O_DIRECT only
  - may block on metadata operations or when the request slots run out
  - memcpy overhead, and at least two syscalls (submit and wait-for-complete)
The result is that not many apps use aio, and earlier attempts to improve it failed.
Improve on aio APIs
The author tried to improve IO on top of the aio APIs, but the result was complicated and unclear, and he finally gave up.
New interface design goals
Performance first, easy to use, extensible, etc.
Enter io_uring
Performance first: avoid memcpy (zero copy, like SPDK?). A ring buffer is shared between user land and kernel land, with a submission queue and a completion queue.
- Struct of cqe is straightforward.
- Struct of sqe is complex.
- ioprio_set(2)
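To make the difference between the two entries concrete, here is a rough sketch of their layouts as I remember them from the paper. The `sketch_` names, the `SKETCH_OP_READV` constant and the `sketch_prep_readv` helper are my own and hypothetical; the real definitions live in `<linux/io_uring.h>`, use the kernel's `__u8`/`__u64` types, and have gained fields since the paper was written, so treat this only as an illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Completion queue entry: small and simple (16 bytes). */
struct sketch_cqe {
    uint64_t user_data; /* copied from the sqe that produced this completion */
    int32_t  res;       /* result of the request, like a syscall return value */
    uint32_t flags;     /* metadata about the completion */
};

/* Submission queue entry: has to describe every kind of request (64 bytes). */
struct sketch_sqe {
    uint8_t  opcode;    /* which operation to run */
    uint8_t  flags;     /* modifier flags common to all opcodes */
    uint16_t ioprio;    /* request priority, see ioprio_set(2) */
    int32_t  fd;        /* file descriptor to operate on */
    uint64_t off;       /* file offset */
    uint64_t addr;      /* buffer address, or iovec array for vectored IO */
    uint32_t len;       /* byte count, or number of iovecs */
    union {             /* opcode-specific flags */
        int32_t  rw_flags;
        uint32_t fsync_flags;
        uint16_t poll_events;
        uint32_t sync_range_flags;
        uint32_t msg_flags;
    };
    uint64_t user_data; /* opaque to the kernel, handed back in the cqe */
    union {
        uint16_t buf_index;
        uint64_t pad[3]; /* pads the entry to 64 bytes */
    };
};

static_assert(sizeof(struct sketch_cqe) == 16, "cqe should be 16 bytes");
static_assert(sizeof(struct sketch_sqe) == 64, "sqe should be 64 bytes");

/* Hypothetical opcode value, used only by this sketch. */
enum { SKETCH_OP_READV = 1 };

/* Hypothetical helper: fill an sqe for a vectored read. */
void sketch_prep_readv(struct sketch_sqe *sqe, int fd, const void *iov,
                       uint32_t nr_iov, uint64_t off, uint64_t user_data)
{
    memset(sqe, 0, sizeof(*sqe));
    sqe->opcode = SKETCH_OP_READV;
    sqe->fd = fd;
    sqe->addr = (uint64_t)(uintptr_t)iov;
    sqe->len = nr_iov;
    sqe->off = off;
    sqe->user_data = user_data;
}
```

The `user_data` field is what ties the two structs together: whatever the application stores there at submission time comes back unchanged in the matching cqe.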
Communication channel
- ring buffer: it is a trick used by the kernel [4]. The size must be a power of two (2^N) so that an index that wraps around lands on the 0-th element [5].
- why indirect index in sqe?
- read/write barriers
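The ring buffer trick and the barriers can be sketched in a few lines of C. This is not the io_uring code: it is a minimal single-producer, single-consumer ring of my own, with hypothetical names (`sketch_ring`, `ring_push`, `ring_pop`), and it uses C11 acquire/release atomics as a stand-in for the read and write barriers the paper talks about. Head and tail are free-running 32-bit counters that are only masked when indexing; because the ring size is a power of two, the slot sequence stays continuous even when a counter overflows.

```c
#include <assert.h>      /* static_assert (C11) */
#include <stdatomic.h>
#include <stdint.h>

#define RING_ENTRIES 8u                  /* must be a power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

static_assert((RING_ENTRIES & RING_MASK) == 0, "ring size must be 2^N");

struct sketch_ring {
    _Atomic uint32_t head;               /* advanced only by the consumer */
    _Atomic uint32_t tail;               /* advanced only by the producer */
    uint64_t entries[RING_ENTRIES];
};

/* Producer side. Returns 1 on success, 0 if the ring is full. */
int ring_push(struct sketch_ring *r, uint64_t value)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire: see the consumer's latest head before reusing a slot. */
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_ENTRIES)     /* unsigned math survives overflow */
        return 0;

    r->entries[tail & RING_MASK] = value;
    /* Release: the entry must be visible before the new tail is. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return 1;
}

/* Consumer side. Returns 1 on success, 0 if the ring is empty. */
int ring_pop(struct sketch_ring *r, uint64_t *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire: pairs with the producer's release store of tail. */
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return 0;

    *out = r->entries[head & RING_MASK];
    /* Release: the slot is read before it is handed back to the producer. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return 1;
}
```

Since the counters are never reset, `tail - head` is always the number of entries in flight, which makes the full and empty checks trivial and avoids wasting a slot to tell the two states apart.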
Interface etc.
- liburing: helper library
- polled io vs. spdk?
- performance: twice that of aio; with polled io, 3x.
References
[1] http://kernel.dk/io_uring.pdf
[2] https://lwn.net/Articles/810414/
[3] https://lwn.net/Articles/670231/
[4] http://lkml.iu.edu/hypermail/linux/kernel/0409.1/2709.html
[5] http://www.snellman.net/blog/archive/2016-12-13-ring-buffers/