qemu-block

Re: Re: [PATCH 0/1] introduce nvmf block driver


From: Stefan Hajnoczi
Subject: Re: Re: [PATCH 0/1] introduce nvmf block driver
Date: Tue, 8 Jun 2021 09:07:52 +0100

On Tue, Jun 08, 2021 at 10:52:05AM +0800, zhenwei pi wrote:
> On 6/7/21 11:08 PM, Stefan Hajnoczi wrote:
> > On Mon, Jun 07, 2021 at 09:32:52PM +0800, zhenwei pi wrote:
> > > In 2020, I started to develop a userspace NVMF initiator library:
> > > https://github.com/bytedance/libnvmf
> > > and released v0.1 recently.
> > > 
> > > I also developed a block driver for the QEMU side:
> > > https://github.com/pizhenwei/qemu/tree/block-nvmf
> > > 
> > > Tested with the Linux kernel NVMF target (TCP), QEMU gets about 220K
> > > IOPS, which seems good.
> > 
> > How does the performance compare to the Linux kernel NVMe-oF initiator?
> > 
> > In case you're interested, some Red Hat developers have started
> > working on a new library called libblkio. For now it supports io_uring
> > but PCI NVMe and virtio-blk are on the roadmap. The library supports
> > blocking, event-driven, and polling modes. There isn't a direct overlap
> > with libnvmf but maybe they can learn from each other.
> > https://gitlab.com/libblkio/libblkio/-/blob/main/docs/blkio.rst
> > 
> > Stefan
> > 
> 
> I'm sorry that I didn't provide enough information about the QEMU block
> nvmf driver and libnvmf.
> 
> Kernel initiator vs. userspace initiator
> Compared with the io_uring/libaio + kernel initiator solution (500K+ IOPS
> read & 200K+ IOPS write), I prefer QEMU block nvmf + libnvmf (200K+ IOPS
> read/write) for these reasons:
> 1. I don't have to upgrade the host kernel; it also runs on older kernels.
> 2. If the target side panics, the initiator side does not get stuck in 'D'
> state (uninterruptible sleep in the kernel) during reconnection, so QEMU
> can always be killed.
> 3. A userspace application is easier to troubleshoot.

I see, thanks for sharing.

> Default NVMe-oF IO queues
> The mechanism of QEMU + libnvmf:
> 1. The QEMU IOThread creates a request and dispatches it to an NVMe-oF IO
> queue thread through a lockless list.
> 2. The QEMU IOThread then kicks the NVMe-oF IO queue thread.
> 3. The NVMe-oF IO queue thread processes the request and returns the
> response to the QEMU IOThread.
> 
> When the QEMU IOThread becomes the bottleneck, 4 NVMe-oF IO queues give
> better performance.
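
To make sure I understand the dispatch path, here is a rough sketch in plain
C of the mechanism as I read your description. All of the names
(nvmf_request, iothread_dispatch, io_queue_thread) and the use of an eventfd
as the kick are made up for illustration, not the actual libnvmf or QEMU
block/nvmf code, and the completion path back to the IOThread is left out:

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <unistd.h>

struct nvmf_request {
    struct nvmf_request *next;
    uint64_t lba;                          /* illustrative payload */
};

/* Lockless list shared between the IOThread and the IO queue thread. */
static _Atomic(struct nvmf_request *) queue_head;
static int queue_kick_fd;                  /* eventfd used as the "kick" */

/* Steps 1 + 2: IOThread pushes the request and kicks the IO queue thread. */
static void iothread_dispatch(struct nvmf_request *req)
{
    struct nvmf_request *old = atomic_load(&queue_head);
    do {
        req->next = old;
    } while (!atomic_compare_exchange_weak(&queue_head, &old, req));

    uint64_t one = 1;
    if (write(queue_kick_fd, &one, sizeof(one)) != sizeof(one)) {
        perror("eventfd write");
    }
}

/* Step 3: the IO queue thread drains the list and processes the requests
 * (actual NVMe-oF TCP submission/completion is stubbed out here). */
static void *io_queue_thread(void *opaque)
{
    (void)opaque;
    for (;;) {
        uint64_t cnt;
        if (read(queue_kick_fd, &cnt, sizeof(cnt)) != sizeof(cnt)) {
            continue;                      /* blocks until kicked */
        }

        /* Detach all pending requests with a single atomic exchange. */
        struct nvmf_request *req =
            atomic_exchange(&queue_head, (struct nvmf_request *)NULL);
        while (req) {
            struct nvmf_request *next = req->next;
            printf("processed request lba=%llu\n",
                   (unsigned long long)req->lba);
            free(req);                     /* completion path omitted */
            req = next;
        }
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;

    queue_kick_fd = eventfd(0, 0);
    pthread_create(&tid, NULL, io_queue_thread, NULL);

    for (uint64_t i = 0; i < 4; i++) {     /* the "IOThread" side */
        struct nvmf_request *req = malloc(sizeof(*req));
        req->lba = i;
        iothread_dispatch(req);
    }

    sleep(1);                              /* let the consumer drain */
    return 0;
}

The eventfd kick and the detach-the-whole-list pattern are just my
assumptions about what "lockless list" and "kick" mean here; please correct
me if libnvmf does this differently.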

Can you explain this bottleneck? Even with 4 NVMe-oF IO queues there is
still just 1 IOThread submitting requests, so why are 4 IO queues faster
than 1?

Stefan


