qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Re: [PATCH 0/3] Introduce vhost-vdpa block device


From: Stefan Hajnoczi
Subject: Re: Re: [PATCH 0/3] Introduce vhost-vdpa block device
Date: Tue, 27 Apr 2021 15:28:23 +0100

On Tue, Apr 27, 2021 at 06:24:55PM +0800, Yongji Xie wrote:
> On Mon, Apr 26, 2021 at 11:34 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Thu, Apr 08, 2021 at 06:12:49PM +0800, Xie Yongji wrote:
> > > Since we already have some ways to emulate vDPA block device
> > > in kernel[1] or userspace[2]. This series tries to introduce a
> > > new vhost-vdpa block device for that. To use it, we can add
> > > something like:
> > >
> > > qemu-system-x86_64 \
> > >     -device vhost-vdpa-blk-pci,vdpa-dev=/dev/vhost-vdpa-0
> >
> > This device is similar to vhost-user-blk. QEMU does not see it as a
> > block device so storage migration, I/O throttling, image formats, etc
> > are not supported. Stefano Garzarella and I discussed how vdpa-blk
> > devices could integrate more deeply with QEMU's block layer. The QEMU
> > block layer could be enabled only when necessary and otherwise bypassed
> > for maximum performance.
> >
> 
> Do you mean we can make use of the shadow virtqueue for the slow path
> (I/O will go through the block layer) and add a fast path (like what
> we do now) to bypass the block layer?

Yes.

> > This alternative approach is similar to how vhost-net is implemented in
> > QEMU. A BlockDriver would handle the vdpa-blk device and the regular
> > virtio-blk-pci device would still be present. The virtqueues could be
> > delegated to the vdpa-blk device in order to bypass the QEMU block
> > layer.
> >
> > I wanted to mention this since it's likely that this kind of vdpa-blk
> > device implementation will be posted in the future and you might be
> > interested. It makes live migration with non-shared storage possible,
> > for example.
> >
> 
> That would be nice, I'm looking forward to it!
> 
> So do you think whether it's necessary to continue this approach?
> Looks like we don't need a vhost-vdpa-blk device any more in the new
> approach.

There is no code for the vdpa-blk BlockDriver yet and the implementation
will be more complex than this patch series (more risk the feature could
be delayed).

If you need vdpa-blk as soon as possible then this patch series may be a
pragmatic solution. As long as it doesn't limit the vdpa-blk interface
in a way that prevents the BlockDriver implementation then I think QEMU
could support both.

In the long term the vdpa-blk BlockDriver could replace -device
vdpa-blk-pci since the performance should be identical and it supports
more use cases.

It's up to you. You're welcome to wait for the BlockDriver, work
together with Stefano on the BlockDriver, or continue with your patch
series.

> > An issue with vhost-user-blk is that the ownership of qdev properties
> > and VIRTIO Configuration Space fields was not clearly defined. Some
> > things are handled by QEMU's vhost-user-blk code, some things are
> > handled by the vhost-user device backend, and some things are negotiated
> > between both entities. This patch series follows the existing
> > vhost-user-blk approach, which I think will show serious issues as the
> > device is more widely used and whenever virtio-blk or the implementation
> > is extended with new features. It is very hard to provide backwards
> > compatibility with the current approach where the ownership of qdev
> > properties and VIRTIO Configuration Space fields is ad-hoc and largely
> > undefined.
> >
> > Since vDPA has VIRTIO Configuration Space APIs, I suggest that the
> > vhost-vDPA device controls the entire configuration space. QEMU should
> > simply forward accesses between the guest and vhost-vdpa.
> >
> 
> Does this already achieved by vhost_ops->vhost_get_config? And I want
> to know how to handle the endianness issue if qemu just simply does
> forwarding and doesn't care about the content of config space.

QEMU just copies bytes betwen the driver and the device via
vdpa_config_ops->get/set_config(). I don't think it needs to worry about
endianness in VIRTIO Configuration Space access code or did I miss
something?

My understanding is that vDPA currently supports same-endian Legacy and
little-endian Modern devices. Cross-endian Legacy devices are currently
not supported because there is no API to set endianness.

If such an API is added in the future, then QEMU can use it to tell the
vDPA device about guest endianness. QEMU still won't need to modify
Configuration Space data itself.

> > Regarding qdev properties, it's a little trickier because QEMU needs to
> > do the emulated VIRTIO device setup (allocating virtqueues, setting
> > their sizes, etc). Can QEMU query this information from the vDPA device?
> > If not, which qdev properties are read-only and must match the
> > configuration of the vDPA device and which are read-write and can
> > control the vDPA device?
> >
> 
> Yes, that's an issue. We have to make sure the number of virtqueues
> and their size set by qemu is not greater than hardware limitation.
> Now I think we can query the max queue size, but looks like we don't
> have an interface to query the max number of virtqueues.

Okay, this is something that Jason and Stefano can discuss more. Ideally
the QEMU command-line does not need to duplicate information about the
vDPA device. The vdpa management tool
(https://github.com/shemminger/iproute2/tree/main/vdpa) and VDUSE
virtio-blk device will probably be where the main device configuration
happens.

As a starting point, can you describe how your VDUSE virtio-blk device
is configured? Does it have a command-line/config file/RPC API to set
the number of virtqueues, block size, etc?

Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]