qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH v8 07/11] hw/block/nvme: Support Zoned Namespace Command Set


From: Dmitry Fomichev
Subject: RE: [PATCH v8 07/11] hw/block/nvme: Support Zoned Namespace Command Set
Date: Tue, 3 Nov 2020 19:48:53 +0000

> -----Original Message-----
> From: Philippe Mathieu-Daudé <philmd@redhat.com>
> Sent: Friday, October 30, 2020 3:16 AM
> To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>; Keith Busch
> <kbusch@kernel.org>; Klaus Jensen <k.jensen@samsung.com>; Kevin Wolf
> <kwolf@redhat.com>; Maxim Levitsky <mlevitsk@redhat.com>; Fam Zheng
> <fam@euphon.net>
> Cc: Alistair Francis <Alistair.Francis@wdc.com>; Matias Bjorling
> <Matias.Bjorling@wdc.com>; Niklas Cassel <Niklas.Cassel@wdc.com>;
> Damien Le Moal <Damien.LeMoal@wdc.com>; qemu-block@nongnu.org;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v8 07/11] hw/block/nvme: Support Zoned Namespace
> Command Set
> 
> Hi Dmitry,
> 
> On 10/30/20 3:32 AM, Dmitry Fomichev wrote:
> > The emulation code has been changed to advertise NVM Command Set
> when
> > "zoned" device property is not set (default) and Zoned Namespace
> > Command Set otherwise.
> >
> > Define values and structures that are needed to support Zoned
> > Namespace Command Set (NVMe TP 4053) in PCI NVMe controller
> emulator.
> > Define trace events where needed in newly introduced code.
> >
> > In order to improve scalability, all open, closed and full zones
> > are organized in separate linked lists. Consequently, almost all
> > zone operations don't require scanning of the entire zone array
> > (which potentially can be quite large) - it is only necessary to
> > enumerate one or more zone lists.
> >
> > Handlers for three new NVMe commands introduced in Zoned Namespace
> > Command Set specification are added, namely for Zone Management
> > Receive, Zone Management Send and Zone Append.
> >
> > Device initialization code has been extended to create a proper
> > configuration for zoned operation using device properties.
> >
> > Read/Write command handler is modified to only allow writes at the
> > write pointer if the namespace is zoned. For Zone Append command,
> > writes implicitly happen at the write pointer and the starting write
> > pointer value is returned as the result of the command. Write Zeroes
> > handler is modified to add zoned checks that are identical to those
> > done as a part of Write flow.
> >
> > Subsequent commits in this series add ZDE support and checks for
> > active and open zone limits.
> >
> > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> > Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
> > Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com>
> > Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com>
> > Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
> > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> > Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com>
> > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
> > ---
> >  block/nvme.c          |   2 +-
> >  hw/block/nvme-ns.c    | 173 ++++++++
> >  hw/block/nvme-ns.h    |  54 +++
> >  hw/block/nvme.c       | 977
> +++++++++++++++++++++++++++++++++++++++++-
> >  hw/block/nvme.h       |   8 +
> >  hw/block/trace-events |  18 +-
> >  include/block/nvme.h  | 113 ++++-
> 
> When you start modifying include/ files, it is recommended
> to start using scripts/git.orderfile as this makes review
> easier (no need to scroll back / up constantly).
>

Hi Philippe,

Thanks for the suggestion, indeed it makes browsing through
diffs much easier!

> As "block/nvme.h" is shared by 2 subsystems, keeping its
> changes in a separate patch is preferred.
>

No problem, I'll move all changes to include/block/nvme.h to
a separate patch.
 
> >  7 files changed, 1322 insertions(+), 23 deletions(-)
> >
> > diff --git a/block/nvme.c b/block/nvme.c
> > index 05485fdd11..7a513c9a17 100644
> > --- a/block/nvme.c
> > +++ b/block/nvme.c
> > @@ -333,7 +333,7 @@ static inline int nvme_translate_error(const
> NvmeCqe *c)
> >  {
> >      uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF;
> >      if (status) {
> > -        trace_nvme_error(le32_to_cpu(c->result),
> > +        trace_nvme_error(le32_to_cpu(c->result32),
> >                           le16_to_cpu(c->sq_head),
> >                           le16_to_cpu(c->sq_id),
> >                           le16_to_cpu(c->cid),
> ...
> 
> > diff --git a/include/block/nvme.h b/include/block/nvme.h
> > index 3653b4aefc..ba8a45edf5 100644
> > --- a/include/block/nvme.h
> > +++ b/include/block/nvme.h
> > @@ -489,6 +489,9 @@ enum NvmeIoCommands {
> >      NVME_CMD_COMPARE            = 0x05,
> >      NVME_CMD_WRITE_ZEROES       = 0x08,
> >      NVME_CMD_DSM                = 0x09,
> > +    NVME_CMD_ZONE_MGMT_SEND     = 0x79,
> > +    NVME_CMD_ZONE_MGMT_RECV     = 0x7a,
> > +    NVME_CMD_ZONE_APPEND        = 0x7d,
> >  };
> >
> >  typedef struct QEMU_PACKED NvmeDeleteQ {
> > @@ -649,8 +652,10 @@ typedef struct QEMU_PACKED NvmeAerResult {
> >  } NvmeAerResult;
> >
> >  typedef struct QEMU_PACKED NvmeCqe {
> > -    uint32_t    result;
> > -    uint32_t    rsvd;
> > +    union {
> > +        uint64_t     result64;
> > +        uint32_t     result32;
> > +    };
> 
> When using packed structure you want to define all fields to
> avoid alignment confusion (and I'm surprised the compiler doesn't
> complain...). So this would be:
> 
>        union {
>            uint64_t     result64;
>            struct {
>                uint32_t    result32;
>                uint32_t    rsvd32;
>            };
>        };
> 

IMO, the compiler doesn't complain because it's a union. Smaller
variants in unions are "padded" to the size of the largest variant
regardless of whether the struct is packed or not.

> But since the ZNS is still a technical proposal and not in the spec,
> this doesn't look correct (the spec list this field as 32-bit).
> 
> What do you think about adding NvmeCqeZNS?
> 
> Maybe:
> 
>   typedef struct QEMU_PACKED NvmeCqeZNS {
>       uint64_t    result;
>       uint16_t    sq_head;
>       uint16_t    sq_id;
>       uint16_t    cid;
>       uint16_t    status;
>   } NvmeCqeZNS;
> 
> Or clever:
> 
>   typedef union QEMU_PACKED NvmeCqeZNS {
>       union {
>           struct {
>               uint64_t result;
>               uint32_t dw2;
>               uint32_t dw3;
>           };
>           NvmeCqe      cqe;
>       };
>   } NvmeCqeZNS;
> 

The 1.4 base spec changes Reserved DW1 in CQE to become the
Command Specific DW1, so it would rather make sense to define
a command-specific CQE for Zone Append -

In include/block/nvme.h:

typedef struct QEMU_PACKED NvmeCqe {
     uint32_t    result;
-    uint32_t    rsvd;
+    uint32_t    dw1;
     uint16_t    sq_head;
     uint16_t    sq_id;
     uint16_t    cid;
     uint16_t    status;
} NvmeCqe;

+/* Zone Append - specific CQE */
+typedef struct QEMU_PACKED NvmeCqeZA {
+    uint64_t    za_slba;
+    uint16_t    sq_head;
+    uint16_t    sq_id;
+    uint16_t    cid;
+    uint16_t    status;
+} NvmeCqeZA;

...

+    QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != sizeof(NvmeCqeZA));

This will go away with all CQE unions and it will also allow the returned SLBA
value to be properly named. What do you think?

> I wonder what part could go in hw/block/nvme-ns.h or "block/nvme-zns.h".

NvmeCqeZA could simply be defined in include/block/nvme.h next to NvmeCqe.
The problem with adding include/block/nvme-zns.h is that it would be hard if
not impossible to separate all ZNS-specific content from block/nvme.h and it
would become necessary for developers to deal with two files that present
different parts of ZNS definitions instead of just one.

Best regards,
Dmitry

> 
> >      uint16_t    sq_head;
> >      uint16_t    sq_id;
> >      uint16_t    cid;
> > @@ -678,6 +683,7 @@ enum NvmeStatusCodes {
> >      NVME_SGL_DESCR_TYPE_INVALID = 0x0011,
> >      NVME_INVALID_USE_OF_CMB     = 0x0012,
> >      NVME_CMD_SET_CMB_REJECTED   = 0x002b,
> > +    NVME_INVALID_CMD_SET        = 0x002c,
> >      NVME_LBA_RANGE              = 0x0080,
> >      NVME_CAP_EXCEEDED           = 0x0081,
> >      NVME_NS_NOT_READY           = 0x0082,
> > @@ -702,6 +708,14 @@ enum NvmeStatusCodes {
> >      NVME_CONFLICTING_ATTRS      = 0x0180,
> >      NVME_INVALID_PROT_INFO      = 0x0181,
> >      NVME_WRITE_TO_RO            = 0x0182,
> > +    NVME_ZONE_BOUNDARY_ERROR    = 0x01b8,
> > +    NVME_ZONE_FULL              = 0x01b9,
> > +    NVME_ZONE_READ_ONLY         = 0x01ba,
> > +    NVME_ZONE_OFFLINE           = 0x01bb,
> > +    NVME_ZONE_INVALID_WRITE     = 0x01bc,
> > +    NVME_ZONE_TOO_MANY_ACTIVE   = 0x01bd,
> > +    NVME_ZONE_TOO_MANY_OPEN     = 0x01be,
> > +    NVME_ZONE_INVAL_TRANSITION  = 0x01bf,
> >      NVME_WRITE_FAULT            = 0x0280,
> >      NVME_UNRECOVERED_READ       = 0x0281,
> >      NVME_E2E_GUARD_ERROR        = 0x0282,
> > @@ -886,6 +900,11 @@ typedef struct QEMU_PACKED NvmeIdCtrl {
> >      uint8_t     vs[1024];
> >  } NvmeIdCtrl;
> >
> > +typedef struct NvmeIdCtrlZoned {
> > +    uint8_t     zasl;
> > +    uint8_t     rsvd1[4095];
> > +} NvmeIdCtrlZoned;
> ...


reply via email to

[Prev in Thread] Current Thread [Next in Thread]