qemu-devel

Re: [RFC PATCH 3/3] hw/block/nvme: add nvme_inject_state HMP command


From: Minwoo Im
Subject: Re: [RFC PATCH 3/3] hw/block/nvme: add nvme_inject_state HMP command
Date: Thu, 11 Feb 2021 15:06:32 +0900
User-agent: Mutt/1.11.4 (2019-03-13)

On 21-02-11 13:24:22, Keith Busch wrote:
> On Thu, Feb 11, 2021 at 12:38:48PM +0900, Minwoo Im wrote:
> > On 21-02-11 12:00:11, Keith Busch wrote:
> > > But I would prefer to see advanced retry tied to real errors that can be
> > > retried, like if we got an EBUSY or EAGAIN errno or something like that.
> > 
> > I have seen a thread [1] about ACRE.  Forgive me if I misunderstood that
> > thread or missed something that came after it.  It looks like the CRD
> > field in the CQE can be set for any NVMe error status, which means it
> > *may* depend on the device status.
> 
> Right! Setting CRD values is at the controller's discretion for any
> error status as long as the host enables ACRE.
> 
> > And this patch just introduced an internal, temporary error state of
> > the controller by returning the Command Interrupted status.
> 
> It's just purely synthetic, though. I was hoping something more natural
> could trigger the status. That might not provide the deterministic
> scenario you're looking for, though.

That makes sense.  If some status can be triggered more naturally, that
would be much better.

> I'm not completely against using QEMU as a development/test vehicle for
> corner cases like this, but we are introducing a whole lot of knobs
> recently, and you practically need to be a QEMU developer to even find
> them. We probably should step up the documentation in the wiki along
> with these types of features.

Oh, that's really good advice, I really appreciate it.

> > I think, at this stage, we can go with errors that occur in the middle
> > of the AIO (nvme_aio_err()) for advanced retry.  Shouldn't AIO errors be
> > retryable, and shouldn't they be retried?
> 
> Sure, we can assume that receiving an error in the AIO callback means
> the lower layers exhausted available recovery mechanisms.

Okay, please let me find a way to trigger these kinds of errors more
naturally.  I think this HMP command should be the last resort, to be
tried only if there's really nothing else we can do.
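
To make the idea above a bit more concrete, here is a minimal sketch of how
an error seen in the AIO callback might be turned into a retryable status
with a CRD value.  This is not QEMU code: sketch_aio_status() and the macro
names are hypothetical, the errno is taken as a positive value for
simplicity, and the policy (EBUSY/EAGAIN are transient, everything else
gets DNR) is only an assumption for illustration.  The bit positions follow
my reading of the status field layout in the NVMe spec (SC in bits 7:0, CRD
in bits 12:11, DNR in bit 14, phase tag excluded).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Generic command status codes used by this sketch. */
#define SC_INTERNAL_DEV_ERROR  0x0006u
#define SC_CMD_INTERRUPTED     0x0021u

/* Status field bits (15-bit field, phase tag excluded). */
#define STATUS_CRD_SHIFT       11          /* Command Retry Delay index */
#define STATUS_DNR             (1u << 14)  /* Do Not Retry */

/*
 * Hypothetical helper: map a positive errno from the AIO path to a status.
 * 'acre' says whether the host enabled Advanced Command Retry, and 'crd'
 * (1..3) selects one of the controller's retry delay times; 0 means
 * "no delay hint".
 */
static uint16_t sketch_aio_status(int err, bool acre, unsigned crd)
{
    assert(crd <= 3);

    /* Assumption: only these two errnos are treated as transient. */
    bool transient = (err == EBUSY || err == EAGAIN);

    if (!transient) {
        /* Not worth retrying: generic error with DNR set. */
        return (uint16_t)(SC_INTERNAL_DEV_ERROR | STATUS_DNR);
    }

    if (!acre) {
        /* Host did not opt in: stay retryable, but give no CRD hint. */
        return (uint16_t)SC_INTERNAL_DEV_ERROR;
    }

    /* Host opted in: Command Interrupted plus a retry delay index. */
    return (uint16_t)(SC_CMD_INTERRUPTED | (crd << STATUS_CRD_SHIFT));
}
```

With something along these lines, an EAGAIN from the lower layers would
reach the host as Command Interrupted with a CRD hint only when ACRE is
enabled, which is closer to the "real errors that can be retried" case than
injecting the state by hand.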


