Re: [Qemu-block] [PATCH 0/8] nvme: v1.3, sgls, metadata and new 'ocssd' device

From: Klaus Birkelund
Subject: Re: [Qemu-block] [PATCH 0/8] nvme: v1.3, sgls, metadata and new 'ocssd' device
Date: Mon, 20 May 2019 15:32:28 +0200
User-agent: Mutt/1.11.4 (2019-03-13)

On Mon, May 20, 2019 at 03:01:24PM +0200, Kevin Wolf wrote:
> Am 17.05.2019 um 10:42 hat Klaus Birkelund Jensen geschrieben:
> > Hi,
> > 
> > This series of patches contains a number of refactorings to the emulated
> > nvme device, adds additional features, such as support for metadata and
> > scatter gather lists, and bumps the supported NVMe version to 1.3.
> > Lastly, it contains a new 'ocssd' device.
> > 
> > The motivation for the first seven patches is to set everything up for
> > the final patch that adds a new 'ocssd' device and associated block
> > driver that implements the OpenChannel 2.0 specification[1]. Many of us
> > in the OpenChannel community have used a qemu fork[2] for emulation of
> > OpenChannel devices. The fork is itself based on Keith's qemu-nvme
> > tree[3] and we recently merged mainline qemu into it, but the result is
> > still a "hybrid" nvme device that supports both conventional nvme and
> > the OCSSD 2.0 spec through a 'dialect' mechanism. Merging instead of
> > rebasing also created a pretty messy commit history and my efforts to
> > try and rebase our work onto mainline were getting hairy to say the
> > least. And I was never really happy with the dialect approach anyway.
> > 
> > I have instead prepared this series of fresh patches that incrementally
> > adds additional features to the nvme device to bring it into shape for
> > finally introducing a new (and separate) 'ocssd' device that emulates an
> > OpenChannel 2.0 device by reusing core functionality from the nvme
> > device. Providing a separate ocssd device ensures that no ocssd specific
> > stuff creeps into the nvme device.
> > 
> > The ocssd device is backed by a new 'ocssd' block driver that holds
> > internal metadata and keeps state persistent across power cycles. In the
> > future I think we could use the same approach for the nvme device to
> > keep internal metadata such as utilization and deallocated blocks.
> A backend driver that is specific for a guest device model (i.e. the
> device model requires this driver, and the backend is useless without
> the device) sounds like a very questionable design.
> Metadata like OcssdFormatHeader being treated as part of the image
> data, which means that the _actual_ image content without the metadata
> isn't directly accessible any more, feels like a bad idea, too. Simple things
> like what a resize operation means (change only the actual disk size as
> usual, or is the new size disk + metadata?) become confusing. Attaching
> an image to a different device becomes impossible.
> The block format driver doesn't seem to actually add much functionality
> to a specially crafted raw image: It provides a convenient way to create
> such special images and it dumps some values in 'qemu-img info', but the
> actual interpretation of the data is left to the device model.
> Looking at the options it does provide, my impression is that these
> should really be qdev properties, and the place to store them
> persistently is something like the libvirt XML. The device doesn't
> change any of the values, so there is nothing that QEMU actually needs
> to store. What you invented is a one-off way to pass a config file to a
> device, but only for one specific device type.
> I think this needs to use a much more standard approach to be mergeable.
> Markus (CCed) as the maintainer for the configuration mechanisms may
> have an opinion on this, too.

Hi Kevin,

Thank you for going through my motivations. I see what you mean. And
yes, the main reason I did it like that was for the convenience of being
able to simply `qemu-img create` the image. I'll reconsider how to do this.
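For context, the layout under discussion puts a format header and internal
metadata in front of the guest-visible data. A minimal, self-contained sketch
of that shape (the struct and field names are illustrative stand-ins, not the
actual OcssdFormatHeader definition):

```c
#include <stdint.h>

/* Hypothetical on-disk header; names are illustrative only and do not
 * reflect the real OcssdFormatHeader layout. */
typedef struct ImgHeader {
    uint32_t magic;      /* identifies the image format */
    uint32_t version;    /* format revision */
    uint64_t hdr_size;   /* bytes occupied by header + internal metadata */
    uint64_t data_size;  /* bytes of actual guest-visible content */
} ImgHeader;

/* The guest-visible payload starts only after the metadata region, so a
 * plain 'raw' reader sees header bytes instead of sector 0. */
static uint64_t payload_offset(const ImgHeader *h)
{
    return h->hdr_size;
}

/* A resize request is ambiguous: does the new size mean payload only, or
 * payload plus metadata? This sketch picks payload-only, but nothing in
 * the image itself enforces that interpretation. */
static uint64_t file_size_after_resize(const ImgHeader *h,
                                       uint64_t new_data_size)
{
    return h->hdr_size + new_data_size;
}
```

With such a layout the raw payload is only reachable through the header, which
is exactly the accessibility and resize ambiguity Kevin points out above.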

> > For now, the nvme device does not support the Deallocated and
> > Unwritten Logical Block Error (DULBE) feature or the Data Set
> > Management command as this would require such support.
> Doesn't bdrv_co_block_status() provide all the information you need for
> that?

That does look useful. I'll look into it.
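For anyone following along: the suggestion is that the device could consult
the block layer's allocation information instead of tracking deallocated
blocks itself. A minimal, self-contained sketch of that logic (the bitmap and
helper names are hypothetical stand-ins, not QEMU's bdrv_co_block_status()
API):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the block layer's allocation tracking: one
 * bit per logical block, set when the block has been written. */
typedef struct AllocMap {
    uint64_t allocated;
} AllocMap;

static bool block_is_allocated(const AllocMap *m, unsigned blk)
{
    return (m->allocated >> blk) & 1;
}

/* A write allocates the block. */
static void write_block(AllocMap *m, unsigned blk)
{
    m->allocated |= 1ULL << blk;
}

/* Dataset Management (Deallocate) returns the block to the
 * deallocated state. */
static void deallocate_block(AllocMap *m, unsigned blk)
{
    m->allocated &= ~(1ULL << blk);
}

/* With DULBE enabled, reading a deallocated or unwritten block should
 * complete with an error status instead of returning zeroes; here 0
 * means success and -1 stands in for that error. */
static int read_block_status(const AllocMap *m, unsigned blk)
{
    return block_is_allocated(m, blk) ? 0 : -1;
}
```

The point of the suggestion is that the allocation state queried here is
information the block layer already has, so the device would not need its own
persistent utilization metadata to implement DULBE or Dataset Management.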

