Re: [PATCH 0/1] introduce nvmf block driver
From: Stefan Hajnoczi
Subject: Re: [PATCH 0/1] introduce nvmf block driver
Date: Tue, 8 Jun 2021 09:07:52 +0100
On Tue, Jun 08, 2021 at 10:52:05AM +0800, zhenwei pi wrote:
> On 6/7/21 11:08 PM, Stefan Hajnoczi wrote:
> > On Mon, Jun 07, 2021 at 09:32:52PM +0800, zhenwei pi wrote:
> > > Since 2020, I started to develop a userspace NVMF initiator library:
> > > https://github.com/bytedance/libnvmf
> > > and released v0.1 recently.
> > >
> > > Also developed block driver for QEMU side:
> > > https://github.com/pizhenwei/qemu/tree/block-nvmf
> > >
> > > Test with linux kernel NVMF target (TCP), QEMU gets about 220K IOPS,
> > > it seems good.
> >
> > How does the performance compare to the Linux kernel NVMeoF initiator?
> >
> > In case you're interested, some Red Hat developers have started to
> > working on a new library called libblkio. For now it supports io_uring
> > but PCI NVMe and virtio-blk are on the roadmap. The library supports
> > blocking, event-driven, and polling modes. There isn't a direct overlap
> > with libnvmf but maybe they can learn from each other.
> > https://gitlab.com/libblkio/libblkio/-/blob/main/docs/blkio.rst
> >
> > Stefan
> >
>
> I'm sorry I didn't provide enough information about the QEMU block nvmf
> driver and libnvmf.
>
> Kernel initiator & userspace initiator
> Rather than the io_uring/libaio + kernel initiator solution (read 500K+ IOPS
> & write 200K+ IOPS), I prefer QEMU block nvmf + libnvmf (RW 200K+ IOPS):
> 1, I don't have to upgrade the host kernel, and I can also run it on an
> older kernel.
> 2, During re-connection, if the target side hits a panic, the initiator side
> does not get stuck in 'D' state (uninterruptible sleep in the kernel); QEMU
> can always be killed.
> 3, It's easier to troubleshoot a userspace application.
I see, thanks for sharing.
> Default NVMe-oF IO queues
> The mechanism of QEMU + libnvmf:
> 1, A QEMU iothread creates a request and dispatches it to an NVMe-oF IO
> queue thread via a lockless list.
> 2, The QEMU iothread kicks the NVMe-oF IO queue thread.
> 3, The NVMe-oF IO queue thread processes the request and returns the
> response to the QEMU iothread.
>
> When a single QEMU iothread becomes the bottleneck, 4 NVMe-oF IO queues give
> better performance.
Can you explain this bottleneck? Even with 4 NVMe-oF IO queues there is
still just 1 IOThread submitting requests, so why are 4 IO queues faster
than 1?
Stefan