qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Outreachy project task: Adding QEMU block layer APIs resembling Linu


From: Damien Le Moal
Subject: Re: Outreachy project task: Adding QEMU block layer APIs resembling Linux ZBD ioctls.
Date: Wed, 1 Jun 2022 14:47:22 +0900
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1

On 6/1/22 11:57, Sam Li wrote:
> Hi Stefan,
> 
> Stefan Hajnoczi <stefanha@gmail.com> 于2022年5月30日周一 19:19写道:
> 
> 
>>
>> On Mon, 30 May 2022 at 06:09, Sam Li <faithilikerun@gmail.com> wrote:
>>>
>>> Hi everyone,
>>> I'm Sam Li, working on the Outreachy project which is to add zoned
>>> device support to QEMU's virtio-blk emulation.
>>>
>>> For the first goal, adding QEMU block layer APIs resembling Linux ZBD
>>> ioctls, I think the naive approach would be to introduce a new stable
>>> struct zbd_zone descriptor for the library function interface. More
>>> specifically, what I'd like to add to the BlockDriver struct are:
>>> 1. zbd_info as zone block device information: includes numbers of
>>> zones, size of logical blocks, and physical blocks.
>>> 2. zbd_zone_type and zbd_zone_state
>>> 3. zbd_dev_model: host-managed zbd, host-aware zbd
>>> With those basic structs, we can start to implement new functions as
>>> bdrv*() APIs for BLOCK*ZONE ioctls.
>>>
>>> I'll start to finish this task based on the above description. If
>>> there is any problem or something I may miss in the design, please let
>>> me know.
>>
>> Hi Sam,
>> Can you propose function prototypes for the new BlockDriver callbacks
>> needed for zoned devices?
> 
> I have made some modifications based on Damien's device in design part
> 1 and added the function prototypes in design part 2. If there is any
> problem or part I missed, please let me know.
> 
> Design of Block Layer APIs in BlockDriver:
> 1. introduce a new stable struct zbd_zone descriptor for the library
> function interface.
>   a. zbd_info as zone block device information: includes numbers of
> zones, size of blocks, write granularity in byte(minimal write size
> and alignment
>     - write granularity: 512e SMRs: writes in units of physical block
> size, 4096 bytes; NVMe ZNS write granularity is equal to the block
> size.
>     - zone descriptor: start, length, capacity, write pointer, zone type
>   b. zbd_zone_type
>     - zone type: conventional, sequential write required, sequential
> write preferred
>   c. zbd_dev_model: host-managed zbd, host-aware zbd

This explanation is a little hard to understand. It seems to be mixing up
device level information and per-zone information. I think it would be a
lot simpler to write a struct definition to directly illustrate what you
are planning.

It is something like this ?

struct zbd_zone {
        enum zone_type  type;
        enum zone_cond  cond;
        uint64_t        start;
        uint32_t        length;
        uint32_t        cap;
        uint64_t        wp;
};

strcut zbd_dev {
        enum zone_model model;
        uint32_t        block_size;
        uint32_t        write_granularity;
        uint32_t        nr_zones
        struct zbd_zone *zones; /* array of zones */
};

If yes, then my comments are as follows.

For the device struct: It may be good to have also the maximum number of
open zones and the maximum number of active zones.

For the zone struct: You may need to add a read-write lock per zone to be
able to write lock zones to ensure a sequential write pattern (virtio
devices can be multi-queue and so writes may be coming in from different
contexts) and to correctly emulate zone append operations with an atomic
update of the wp field.

These need to be integrated into the generic block driver interface in
include/block/block_int-common.h or include/block/block-common.h.

> 
>  2. implement new functions as bdrv*() APIs for BLK*ZONE ioctls
>    a. support basic operations: get the APIs working when executing
> the zone operations from a guest
>     - zone information access: report
>     - zone manipulation: reset,open,close,finish

These operations are generally referred to as "zone management" operations.

>   b. support zone append operation: zone capacity, write pointer
> positions of all zones(excluded for now)
>     - can track the zone state we need: zone is full or not.
> 
> More specifically, the function prototypes for 2a are as follows:
> 
> int zbd_report_zones(int fd, off_t offset, off_t len, enum
> zbd_report_opetion ro, struct zbd_zone *zones, unsigned int
> *nr_zones);

The current virtio zone specification draft does not have a reporting
option field for the report zones command. However, it has a "partial"
field that will need to be passed to this function so that the correct
report zones reply can be built by the driver.

> int zbd_reset_zones(int fd, off_t offset, off_t len);
> int zbd_open_zones(int fd, off_t offset, off_t len);
> int zbd_close_zones(int fd, off_t offset, off_t len);
> int zbd_finish_zones(int fd, off_t offset, off_t len);

These 4 functions have the exact same signature, modulo the function name.
So we should simplify here to reduce the number of functions. Something like:

int zbd_zone_mgmt(int fd, enum zone_op op, off_t offset, off_t len);

where op can be BDRV_ZONE_RESET, BDRV_ZONE_OPEN, etc can aggregate all 4
functions into one.

In any case, if you look at how block drivers are defined (an example is
the one for raw files in qemu/block/file-posix.c) using the BlockDriver
data type (defined as a struct in qemu/include/block/block_int-common.h),
you will see that the driver callback functions do not use a file
descriptor (fd) but a pointer to some data structure (most of the time a
BlockDriverState pointer).

Another thing: you will need one more callback to get the device
information related to zones: maximum number of open and active zones at
least (the number of zones can be discovered with a report zones call).

Cheers.

-- 
Damien Le Moal
Western Digital Research



reply via email to

[Prev in Thread] Current Thread [Next in Thread]