From: Stefan Hajnoczi
Subject: Re: [qemu-web PATCH] Add a blog post about zoned storage emulation
Date: Thu, 17 Nov 2022 14:12:07 -0500

Hi Sam,
Please send a git repo URL so Thomas can fetch the commit without
email/file size limitations.

> diff --git a/_posts/2022-11-17-zoned-emulation.md 
> b/_posts/2022-11-17-zoned-emulation.md
> new file mode 100644
> index 0000000..69ce4d7
> --- /dev/null
> +++ b/_posts/2022-11-17-zoned-emulation.md
> @@ -0,0 +1,45 @@
> +---
> +layout: post
> +title:  "Introduction to Zoned Storage Emulation"
> +date:   2022-11-17
> +author: Sam Li
> +categories: [storage, gsoc, outreachy, internships]
> +---
> +
> +## Zoned block devices
> +
> +Aimed for at-scale data infrastructures,

I don't know what at-scale data infrastructure is. Is it something
readers can relate to? Otherwise there's a risk that readers will
decide this doesn't apply to them and stop reading.

> zoned block devices (ZBDs) divide the LBA space into block regions called 
> zones that are larger than the LBA size.

LBA is not defined and also not used again after this sentence.
Readers will be familiar with disks but may not know what an LBA is.
Since the concept isn't used again I suggest dropping it:

  zoned block devices (ZBDs) are divided into regions called zones
that can only be written sequentially.

> By only allowing sequential writes, it can reduce write amplification in SSDs,

This sounds more natural:

  By only allowing sequential writes, SSD write amplification can be reduced

It might also be nice to provide a little bit of extra context:

  ... reduced by eliminating the need for a <a
href="https://en.wikipedia.org/wiki/Flash_translation_layer";>Flash
Translation Layer</a>

> and potentially lead to higher throughput and increased capacity. Providing 
> new storage software stack,

s/Providing new/Providing a new/

> zoned storage concept is standardized as ZBC(SCSI standard), ZAC(ATA 
> standard), ZNS(NVMe).

Small tweaks:

  zoned storage concepts are standardized in ZBC (SCSI standard), ZAC
(ATA standard), ZNS (NVMe).

There should be a space before opening parentheses: hello (world) instead
of hello(world). Please check the rest of the article for more instances
of this.

It would be nice to include links but I didn't find good pages for
ZBC/ZAC/ZNS aside from the full standards that they are part of.

This intro section would be a good place to link to https://zonedstorage.io/!

> Meanwhile, the virtio protocol for block devices(virtio-blk) should also be 
> aware of ZBDs instead of taking them as regular block devices. It should be 
> able to pass such devices through to the guest. An overview of necessary work 
> is as follows:
> +
> +1. Virtio protocol: [extend virtio-blk protocol with main zoned storage 
> concept](https://lwn.net/Articles/914377/), Dmitry Fomichev
> +2. Linux: [implement the virtio specification 
> extensions](https://www.spinics.net/lists/linux-block/msg91944.html), Dmitry 
> Fomichev
> +3. QEMU: add zoned emulation support to virtio-blk, Sam Li, [Outreachy 2022 
> project](https://wiki.qemu.org/Internships/ProjectIdeas/VirtIOBlkZonedBlockDevices)

You could split the QEMU work into 2 points if you like:
3. QEMU: add zoned storage APIs to the block layer, Sam Li
4. QEMU: implement zoned storage support in virtio-blk emulation, Sam Li

> +
> +<img src="/screenshots/zbd.png" alt="zbd" style="zoom:50%;" />
> +
> +## Zoned emulation
> +
> +Currently, QEMU can support zoned devices by virtio-scsi or PCI device 
> passthrough. It needs to specify the device type it is talking to. While 
> storage controller emulation uses block layer APIs instead of directly 
> accessing disk images. Extending virtio-blk emulation avoids code duplication 
> and simplify the support by hiding the device types under a unified zoned 
> storage interface, simplifying VM deployment for different type of zoned 
> devices.

Other advantages that come to mind:
1. virtio-blk can be implemented in hardware. If those devices wish to
follow the zoned storage model then the virtio-blk specification needs
to natively support zoned storage.
2. Individual NVMe namespaces or anything that is a zoned Linux block
device can be exposed to the guest without passing through a full
device.
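
On point 2, if you want to give readers a quick way to check on the
host whether a block device is zoned before exposing it, the zone
geometry can be queried through <linux/blkzoned.h>. A rough example
(the device path is just a placeholder and error handling is trimmed):

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <linux/blkzoned.h>

  int main(void)
  {
      /* Placeholder path -- any zoned block device works here. */
      int fd = open("/dev/nvme0n2", O_RDONLY);
      if (fd < 0) {
          perror("open");
          return 1;
      }

      unsigned int zone_sectors = 0, nr_zones = 0;
      /* Zone size is reported in 512-byte sectors (0 if not zoned). */
      ioctl(fd, BLKGETZONESZ, &zone_sectors);
      ioctl(fd, BLKGETNRZONES, &nr_zones);

      if (zone_sectors == 0) {
          printf("not a zoned block device\n");
      } else {
          printf("%u zones of %u sectors each\n", nr_zones, zone_sectors);
      }
      return 0;
  }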

> +
> +For zoned storage emulation, zoned storage APIs support three zoned 
> models(conventional, host-managed, host-aware) , four zone management 
> commands(Report Zone, Open Zone, Close Zone, Finish Zone), and Append Zone.  
> QEMU block storage

Maybe:
s/QEMU block storage/The QEMU block layer/

> has a BlockDriverState graph that propagates device information inside block 
> layer. A root pointer at BlockBackend points to the graph. There are three 
> type of block driver nodes: filter node, format node, protocol node. 
> File-posix driver is the lowest level within the graph where zoned storage 
> APIs reside.

Is it possible to remove "A root pointer at BlockBackend points to the
graph. There are three type of block driver nodes: filter node, format
node, protocol node." so there are fewer new concepts? I didn't see
further use of BlockBackend or filter/format nodes in the text.
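
Separately, if you want to make those APIs concrete without
introducing more block layer internals, a tiny illustrative sketch
might help. Everything below is made up for illustration (adapt it to
whatever your series actually adds):

  #include <stdint.h>
  #include <stddef.h>

  /* Illustrative declarations only -- not the real QEMU prototypes. */
  typedef struct BlockDriverState BlockDriverState;

  typedef enum {
      ZONE_OP_OPEN,
      ZONE_OP_CLOSE,
      ZONE_OP_FINISH,
  } ZoneOp;

  typedef struct {
      uint64_t start;  /* zone start offset, in bytes */
      uint64_t length; /* zone size, in bytes */
      uint64_t wp;     /* current write pointer position */
      uint8_t  type;   /* conventional or sequential-write-required */
      uint8_t  state;  /* empty, open, closed, full, ... */
  } ZoneDescriptor;

  /* Report Zone: fill up to *nr_zones descriptors starting at @offset. */
  int blk_zone_report(BlockDriverState *bs, int64_t offset,
                      unsigned int *nr_zones, ZoneDescriptor *zones);

  /* Open/Close/Finish Zone: apply @op to zones in [offset, offset+len). */
  int blk_zone_mgmt(BlockDriverState *bs, ZoneOp op,
                    int64_t offset, int64_t len);

  /* Append Zone: write @bytes to the zone containing @offset; the write
   * position chosen by the device comes back in *written_offset. */
  int blk_zone_append(BlockDriverState *bs, int64_t *written_offset,
                      int64_t offset, const void *buf, size_t bytes);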

> +
> +<img src="/screenshots/storage_overview.png" alt="storage_overview" 
> style="zoom: 50%;" />
> +
> +After receiving the block driver states, Virtio-blk emulation recognizes 
> zoned devices and sends the zoned feature bit to guest. Then the guest can 
> see the zoned device in the host. When the guest executes zoned operations, 
> virtio-blk driver issues corresponding requests that will be captured by 
> virito-blk

s/virito/virtio/

> device inside QEMU. Afterwards, virtio-blk device sends the requests to 
> file-posix driver which will perform zoned operations.
> +
> +Unlike zone management operations, Linux doesn't have a user API

The Linux userspace API (<linux/blkzoned.h>) hasn't been mentioned
before. Maybe the previous paragraph should explain that file-posix
performs zoned operations using <linux/blkzoned.h> ioctls? Then this
sentence will be easier to understand.
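
If you do mention the ioctls, a tiny example might make them concrete.
The rough shape of a report-zones call is below (error handling
trimmed, the device path is only a placeholder); the zone management
commands use the same header with BLKOPENZONE, BLKCLOSEZONE and
BLKFINISHZONE plus a struct blk_zone_range:

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <linux/blkzoned.h>

  int main(void)
  {
      int fd = open("/dev/nvme0n2", O_RDONLY); /* placeholder device */
      if (fd < 0) {
          perror("open");
          return 1;
      }

      /* Ask the kernel for the first 4 zone descriptors. */
      unsigned int nr_zones = 4;
      struct blk_zone_report *report =
          calloc(1, sizeof(*report) + nr_zones * sizeof(struct blk_zone));
      report->sector = 0;          /* start reporting at the device start */
      report->nr_zones = nr_zones;

      if (ioctl(fd, BLKREPORTZONE, report) < 0) {
          perror("BLKREPORTZONE");
          return 1;
      }

      /* The kernel updates nr_zones to the number actually reported. */
      for (unsigned int i = 0; i < report->nr_zones; i++) {
          struct blk_zone *z = &report->zones[i];
          printf("zone %u: start=%llu len=%llu wp=%llu cond=%u\n", i,
                 (unsigned long long)z->start,
                 (unsigned long long)z->len,
                 (unsigned long long)z->wp, z->cond);
      }
      return 0;
  }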

> to issue zone append requests to zoned devices from user space. With the help 
> of write pointer emulation tracking locations of write pointer of each zone, 
> QEMU block layer performs append writes by modifying regular writes. Write 
> pointer locks guarantee the execution of requests. Upon failure it must not 
> update the write pointer location which is only got updated when the request 
> is successfully finished.
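
Since this is the most interesting part of the post, you could even
include a short pseudo-C sketch of the idea. This is just something I
made up to illustrate the paragraph above, with error handling and the
actual QEMU plumbing left out:

  #include <errno.h>
  #include <pthread.h>
  #include <stdint.h>
  #include <unistd.h>

  /* Illustrative sketch of zone append emulation -- not the real code. */
  typedef struct {
      uint64_t start;       /* zone start offset, in bytes */
      uint64_t wp;          /* emulated write pointer for this zone */
      pthread_mutex_t lock; /* serializes appends within the zone */
  } EmulatedZone;

  /* Returns the offset the data landed at, or a negative errno.
   * The write pointer only advances when the write succeeds. */
  int64_t zone_append(int fd, EmulatedZone *zone, const void *buf,
                      size_t len)
  {
      pthread_mutex_lock(&zone->lock);

      /* An append always lands at the current write pointer ... */
      uint64_t offset = zone->wp;
      /* ... and is issued as a regular write at that offset. */
      ssize_t ret = pwrite(fd, buf, len, offset);

      if (ret == (ssize_t)len) {
          zone->wp += len; /* advance only on success */
      }

      pthread_mutex_unlock(&zone->lock);
      return ret == (ssize_t)len ? (int64_t)offset : -EIO;
  }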
> +
> +Problems can always be sovled

s/sovled/solved/

Thanks,
Stefan


