qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v11 00/13] hw/block/nvme: Support Namespace Types and Zoned N


From: Klaus Jensen
Subject: Re: [PATCH v11 00/13] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set
Date: Wed, 9 Dec 2020 10:57:53 +0100

Hi Dmitry,

By and large, this looks OK to me. There are still some issues here and
there, and some comments of mine that you did not address, but I will
follow up with patches to fix that. Let's get this merged.

It looks like the nvme-next you rebased on is slightly old and missing
two commits:

  "hw/block/nvme: remove superfluous NvmeCtrl parameter" and
  "hw/block/nvme: pull aio error handling"

It caused a couple of conflicts, but nothing that I couldn't fix up.

Since I didn't manage to convince anyone about the zsze and zcap
parameters being in terms of LBAs, I'll revert that to be
'zoned.zone_size' and 'zoned.zone_capacity'.

Finally, would you accept that we skip "hw/block/nvme: Add injection of
Offline/Read-Only zones" for now? I'd like to discuss it a bit since I
think the random injects feels a bit ad-hoc. Back when I did OCSSD
emulation with Hans, we did something like this for setting up state
through a descriptor text file - I think we should explore something
like that before we lock down the two parameters. I'll amend the final
documentation commit to not include those parameters.

Sounds good?

Otherwise, I think this is mergeable to nvme-next. So, for the series
(excluding "hw/block/nvme: Add injection of Offline/Read-Only zones"):

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

On Dec  9 05:03, Dmitry Fomichev wrote:
> v10 -> v11:
> 
>  - Address review comments by Klaus.
> 
>  - Add a patch to separate the handling of controller reset
>    and subsystem shutdown. Place the patch at the beginning
>    of the series so it can be picked up separately.
> 
>  - Rebase on the current nvme-next branch.
> 
> v9 -> v10:
> 
>  - Correctly check for MDTS in Zone Management Receive handler.
> 
>  - Change Klaus' "Reviewed-by" email in UUID patch.
> 
> v8 -> v9:
> 
>  - Move the modifications to "include/block/nvme.h" made to
>    introduce ZNS-related definitions to a separate patch.
> 
>  - Add a new struct, NvmeZonedResult, along the same lines as the
>    existing NvmeAerResult, to carry Zone Append LBA returned to
>    the host. Now, there is no need to modify NvmeCqe struct except
>    renaming DW1 field from "rsvd" to "dw1".
> 
>  - Add check for MDTS in Zone Management Receive handler.
> 
>  - Remove checks for ns->attached since the value of this flag
>    is always true for now.
> 
>  - Rebase to the current quemu-nvme/nvme-next branch.
> 
> v7 -> v8:
> 
>  - Move refactoring commits to the front of the series.
> 
>  - Remove "attached" and "fill_pattern" device properties.
> 
>  - Only close open zones upon subsystem shutdown, not when CC.EN flag
>    is set to 0. Avoid looping through all zones by iterating through
>    lists of open and closed zones.
> 
>  - Improve bulk processing of zones aka zoned operations with "all"
>    flag set. Avoid looping through the entire zone array for all zone
>    operations except Offline Zone.
> 
>  - Prefix ZNS-related property names with "zoned.". The "zoned" Boolean
>    property is retained to turn on zoned command set as it is much more
>    intuitive and user-friendly compared to setting a magic number value
>    to csi property.
> 
>  - Address review comments.
> 
>  - Remove unused trace events.
> 
> v6 -> v7:
> 
>  - Introduce ns->iocs initialization function earlier in the series,
>    in CSE Log patch.
> 
>  - Set NVM iocs for zoned namespaces when CC.CSS is set to
>    NVME_CC_CSS_NVM.
> 
>  - Clean up code in CSE log handler.
>  
> v5 -> v6:
> 
>  - Remove zoned state persistence code. Replace position-independent
>    zone lists with QTAILQs.
> 
>  - Close all open zones upon clearing of the controller. This is
>    a similar procedure to the one previously performed upon powering
>    up with zone persistence. 
> 
>  - Squash NS Types and ZNS triplets of commits to keep definitions
>    and trace event definitions together with the implementation code.
> 
>  - Move namespace UUID generation to a separate patch. Add the new
>    "uuid" property as suggested by Klaus.
> 
>  - Rework Commands and Effects patch to make sure that the log is
>    always in sync with the actual set of commands supported.
> 
>  - Add two refactoring commits at the end of the series to
>    optimize read and write i/o path.
> 
> - Incorporate feedback from Keith, Klaus and Niklas:
> 
>   * fix rebase errors in nvme_identify_ns_descr_list()
>   * remove unnecessary code from nvme_write_bar()
>   * move csi to NvmeNamespace and use it from the beginning in NSTypes
>     patch
>   * change zone read processing to cover all corner cases with RAZB=1
>   * sync w_ptr and d.wp in case of a i/o error at the preceding zone
>   * reword the commit message in active/inactive patch with the new
>     text from Niklas
>   * correct dlfeat reporting depending on the fill pattern set
>   * add more checks for "attached" n/s parameter to prevent i/o and
>     get/set features on inactive namespaces
>   * Use DEFINE_PROP_SIZE and DEFINE_PROP_SIZE32 for zone size/capacity
>     and ZASL respectively
>   * Improve zone size and capacity validation
>   * Correctly report NSZE
> 
> v4 -> v5:
> 
>  - Rebase to the current qemu-nvme.
> 
>  - Use HostMemoryBackendFile as the backing storage for persistent
>    zone metadata.
> 
>  - Fix the issue with filling the valid data in the next zone if RAZB
>    is enabled.
> 
> v3 -> v4:
> 
>  - Fix bugs introduced in v2/v3 for QD > 1 operation. Now, all writes
>    to a zone happen at the new write pointer variable, zone->w_ptr,
>    that is advanced right after submitting the backend i/o. The existing
>    zone->d.wp variable is updated upon the successful write completion
>    and it is used for zone reporting. Some code has been split from
>    nvme_finalize_zoned_write() function to a new function,
>    nvme_advance_zone_wp().
> 
>  - Make the code compile under mingw. Switch to using QEMU API for
>    mmap/msync, i.e. memory_region...(). Since mmap is not available in
>    mingw (even though there is mman-win32 library available on Github),
>    conditional compilation is added around these calls to avoid
>    undefined symbols under mingw. A better fix would be to add stub
>    functions to softmmu/memory.c for the case when CONFIG_POSIX is not
>    defined, but such change is beyond the scope of this patchset and it
>    can be made in a separate patch.
> 
>  - Correct permission mask used to open zone metadata file.
> 
>  - Fold "Define 64 bit cqe.result" patch into ZNS commit.
> 
>  - Use clz64/clz32 instead of defining nvme_ilog2() function.
> 
>  - Simplify rpt_empty_id_struct() code, move nvme_fill_data() back
>    to ZNS patch.
> 
>  - Fix a power-on processing bug.
> 
>  - Rename NVME_CMD_ZONE_APND to NVME_CMD_ZONE_APPEND.
> 
>  - Make the list of review comments addressed in v2 of the series
>    (see below).
> 
> v2 -> v3:
> 
>  - Moved nvme_fill_data() function to the NSTypes patch as it is
>    now used there to output empty namespace identify structs.
>  - Fixed typo in Maxim's email address.
> 
> v1 -> v2:
> 
>  - Rebased on top of qemu-nvme/next branch.
>  - Incorporated feedback from Klaus and Alistair.
>     * Allow a subset of CSE log to be read, not the entire log
>     * Assign admin command entries in CSE log to ACS fields
>     * Set LPA bit 1 to indicate support of CSE log page
>     * Rename CC.CSS value CSS_ALL_NSTYPES (110b) to CSS_CSI
>     * Move the code to assign lbaf.ds to a separate patch
>     * Remove the change in firmware revision
>     * Change "driver" to "device" in comments and annotations
>     * Rename ZAMDS to ZASL
>     * Correct a few format expressions and some wording in
>       trace event definitions
>     * Remove validation code to return NVME_CAP_EXCEEDED error
>     * Make ZASL to be equal to MDTS if "zone_append_size_limit"
>       module parameter is not set
>     * Clean up nvme_zoned_init_ctrl() to make size calculations
>       less confusing
>     * Avoid changing module parameters, use separate n/s variables
>       if additional calculations are necessary to convert parameters
>       to running values
>     * Use NVME_DEFAULT_ZONE_SIZE to assign the default zone size value
>     * Use default 0 for zone capacity meaning that zone capacity will
>       be equal to zone size by default
>     * Issue warnings if user MAR/MOR values are too large and have
>       to be adjusted
>     * Use unsigned values for MAR/MOR
>  - Dropped "Simulate Zone Active excursions" patch.
>    Excursion behavior may depend on the internal controller
>    architecture and therefore be vendor-specific.
>  - Dropped support for Zone Attributes and zoned AENs for now.
>    These features can be added in a future series.
>  - NS Types support is extended to handle active/inactive namespaces.
>  - Update the write pointer after backing storage I/O completion, not
>    before. This makes the emulation to run correctly in case of
>    backing device failures.
>  - Avoid division in the I/O path if the device zone size is
>    a power of two (the most common case). Zone index then can be
>    calculated by using bit shift.
>  - A few reported bugs have been fixed.
>  - Indentation in function definitions has been changed to make it
>    the same as the rest of the code.
> 
> 
> Zoned Namespace (ZNS) Command Set is a newly introduced command set
> published by the NVM Express, Inc. organization as TP 4053. The main
> design goals of ZNS are to provide hardware designers the means to
> reduce NVMe controller complexity and to allow achieving a better I/O
> latency and throughput. SSDs that implement this interface are
> commonly known as ZNS SSDs.
> 
> This command set is implementing a zoned storage model, similarly to
> ZAC/ZBC. As such, there is already support in Linux, allowing one to
> perform the majority of tasks needed for managing ZNS SSDs.
> 
> The Zoned Namespace Command Set relies on another TP, known as
> Namespace Types (NVMe TP 4056), which introduces support for having
> multiple command sets per namespace.
> 
> Both ZNS and Namespace Types specifications can be downloaded by
> visiting the following link -
> 
> https://nvmexpress.org/wp-content/uploads/NVM-Express-1.4-Ratified-TPs.zip
> 
> This patch series adds Namespace Types support and zoned namespace
> emulation capability to the existing NVMe PCI device.
> 
> 
> Based-on: <20201104102248.32168-1-its@irrelevant.dk>
> Dmitry Fomichev (11):
>   hw/block/nvme: Process controller reset and shutdown differently
>   hw/block/nvme: Generate namespace UUIDs
>   hw/block/nvme: Separate read and write handlers
>   hw/block/nvme: Combine nvme_write_zeroes() and nvme_write()
>   hw/block/nvme: Add Commands Supported and Effects log
>   block/nvme: Make ZNS-related definitions
>   hw/block/nvme: Support Zoned Namespace Command Set
>   hw/block/nvme: Introduce max active and open zone limits
>   hw/block/nvme: Support Zone Descriptor Extensions
>   hw/block/nvme: Add injection of Offline/Read-Only zones
>   hw/block/nvme: Document zoned parameters in usage text
> 
> Niklas Cassel (2):
>   hw/block/nvme: Add support for Namespace Types
>   hw/block/nvme: Support allocated CNS command variants
> 
>  hw/block/nvme-ns.h    |  108 +++-
>  hw/block/nvme.h       |    6 +
>  include/block/nvme.h  |  201 +++++-
>  hw/block/nvme-ns.c    |  271 +++++++-
>  hw/block/nvme.c       | 1432 ++++++++++++++++++++++++++++++++++++++---
>  hw/block/trace-events |   32 +-
>  6 files changed, 1927 insertions(+), 123 deletions(-)
> 
> -- 
> 2.28.0
> 
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]