Re: [Qemu-devel] [Nbd] Is NBD_CMD_FLAG_FUA valid during NBD_CMD_FLUSH?

From: Alex Bligh
Subject: Re: [Qemu-devel] [Nbd] Is NBD_CMD_FLAG_FUA valid during NBD_CMD_FLUSH?
Date: Thu, 31 Mar 2016 21:17:12 +0100

On 31 Mar 2016, at 20:54, Eric Blake <address@hidden> wrote:
> Oh, and I also just found that qemu's nbd-server tries to honor FUA on
> read, even though the protocol doesn't document that as valid either.

Potentially useful, but I believe not required (I don't believe the
kernel does that, and I *believe* qemu's block layer does the
same as the kernel).

>> This turned out to be an easier way of describing the operations
>> than describing them semantically (in particular FLUSH, where I
>> couldn't get an entirely consistent answer of what it required
>> of inflight requests, specifically whether it required all
>> requests inflight at the time of making the request to be written
>> to disk prior to answering, or all requests inflight prior to the
>> time of replying to be written to disk prior to answering, though
>> I believe the former).
>> FUA just requires that particular request to be persisted to
>> disk, and does not require other requests to be persisted to disk.
> As written, NBD says that FUA requires the current write operation to
> land on disk (but says nothing about any other writes, whether those
> writes had an early reply).

That is my understanding.

>  And for flush, NBD only requires that all
> writes that have _sent_ their reply to the client must land on disk, but
> this can certainly be a smaller set of write requests than _all_ writes
> issued prior to that point in time.  So maybe flush+FUA is a valid thing
> to support, and means that ALL in-flight writes must land, whether or
> not a reply has been sent to the client, for an even stronger barrier?

OK, so I went and looked up what my answer was the last time I
was asked ( :-) ).

Here is the conclusion I reached then, after trawling through lkml
on the subject:

From https://sourceforge.net/p/nbd/mailman/message/27569820/

> You may process commands out of order, and reply out of order,
> save that
> a) all write commands *completed* before you process a REQ_FLUSH
>  must be written to non-volatile storage prior to completing
>  that REQ_FLUSH (though apparently you should, if possible, make
>  this true for all write commands *received*, which is a stronger
>  condition) [Ignore this if you don't set SEND_REQ_FLUSH]
> b) a REQ_FUA flagged write must not complete until its payload
>  is written to non-volatile storage [ignore this if you don't
>  set SEND_REQ_FUA]

Perhaps it would be good for that to actually go in the docs!
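
As an illustration only, rules (a) and (b) above can be modelled with a
toy in-memory server. This is a sketch, not code from nbd-server or qemu:
the class ModelServer, its methods, and the include_received switch are
all hypothetical names chosen for the example, and "disk" is just a dict
standing in for non-volatile storage.

```python
class ModelServer:
    """Toy model of the FLUSH/FUA rules quoted above (hypothetical names)."""

    def __init__(self):
        self.received = {}  # handle -> (offset, data, fua); no reply sent yet
        self.dirty = {}     # offset -> data; replied to, maybe only in cache
        self.disk = {}      # offset -> data; stand-in for non-volatile storage

    def receive_write(self, handle, offset, data, fua=False):
        # The request has arrived but has not been completed (no reply yet).
        self.received[handle] = (offset, data, fua)

    def complete_write(self, handle):
        # Completing a write means the reply is sent to the client.
        offset, data, fua = self.received.pop(handle)
        if fua:
            # Rule (b): a FUA write must be on stable storage before it completes.
            self.disk[offset] = data
            self.dirty.pop(offset, None)
        else:
            # A plain write may complete while its data is still volatile.
            self.dirty[offset] = data
        return handle

    def flush(self, include_received=False):
        if include_received:
            # The stronger, optional condition: also persist writes that
            # were received but not yet completed.
            for offset, data, _fua in self.received.values():
                self.dirty[offset] = data
        # Rule (a): every write completed before the flush must be persisted.
        self.disk.update(self.dirty)
        self.dirty.clear()
```

In this model a plain flush() only guarantees the completed writes, which
is the minimum the quoted text requires; include_received=True shows the
stronger behaviour that a server "should, if possible" provide.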

I don't think we need a 'stronger barrier' as the client can
implement that itself merely by waiting for all commands to
complete prior to sending FLUSH.
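
A minimal sketch of that client-side approach follows, assuming a
transport with send() and recv_reply() methods. FakeTransport and Client
are hypothetical names invented for this example; the fake transport
replies in reverse order purely to mimic out-of-order replies.

```python
NBD_CMD_WRITE = 1
NBD_CMD_FLUSH = 3


class FakeTransport:
    """Hypothetical in-memory stand-in for the socket to an NBD server."""

    def __init__(self):
        self.pending = []  # handles sent but not yet replied to
        self.log = []      # ordered record of sends and replies

    def send(self, handle, cmd):
        self.pending.append(handle)
        self.log.append(("send", handle, cmd))

    def recv_reply(self):
        # Reply to the most recent request first (out of order on purpose).
        handle = self.pending.pop()
        self.log.append(("reply", handle))
        return handle


class Client:
    def __init__(self, transport):
        self.transport = transport
        self.outstanding = set()  # handles still awaiting a reply
        self.next_handle = 1

    def submit(self, cmd):
        handle = self.next_handle
        self.next_handle += 1
        self.outstanding.add(handle)
        self.transport.send(handle, cmd)
        return handle

    def drain(self):
        # Block until every outstanding command has been replied to.
        while self.outstanding:
            self.outstanding.discard(self.transport.recv_reply())

    def strong_flush(self):
        # Wait for all earlier commands to complete, then send FLUSH.
        # Every earlier write is now "completed" from the server's point
        # of view, so the ordinary FLUSH guarantee covers all of them.
        self.drain()
        handle = self.submit(NBD_CMD_FLUSH)
        self.drain()
        return handle
```

The cost is that the client stalls its pipeline while draining, which is
the trade-off for not needing a new flag in the protocol.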

Incidentally, last time I looked, the Linux kernel always sent
a FLUSH immediately after any bio marked FUA. Does qemu use
more interesting behavioural modes?

>> So in answer to your question, my understanding is that FLUSH requires
>> (some subset) of otherwise potentially non-persisted requests to
>> be persisted to disk. In that sense it implies FUA. It is permitted
>> to set FUA (as it is permitted, I believe, in the linux block layer)
>> but it will make no difference.
>> I once thought FUA on read should bypass any local read cache, though
>> that is not part of the spec currently.
> In qemu, read+FUA just triggers blk_co_flush() prior to reading; but
> that's the same function it calls for write+FUA.

That's harmless, but unnecessary in the sense that current documented
behaviour doesn't require it. Perhaps it should?

I suppose TRIM etc. should support FUA too?

>  And for flush (whether
> or not FUA was specified), qemu still calls blk_co_flush().  So from
> qemu's perspective, FUA is synonymous with "finish ALL pending
> transactions", which is stronger than what the NBD protocol requires.
> (Nothing wrong with an implementation doing more work than required,
> although it may be less efficient).  Alas, that means I can't use qemu's
> behavior as a good reference for how to improve the NBD spec.
> Meanwhile, it sounds like FUA is valid on read, write, AND flush
> (because the kernel supports all three),

Do you have a pointer to what FUA means on kernel reads? Does it
mean "force unit access for the read" or does it mean "flush any
write for that block first"? The first is subtly different if the
file is remote and being accessed by multiple clients (e.g. NFS, Ceph, etc.).

> even if we aren't quite sure
> what to document of those flags.  And that means qemu is correct, and
> the NBD protocol has a bug.  Since you contributed the FUA flag, is that
> something you can try to improve?

Yeah. My mess so I should clean it up. I think FUA should be valid
on essentially everything.
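
For concreteness, here is a sketch of how a client might set the FUA bit
on a non-write command such as FLUSH. It assumes the request layout with
separate 16-bit command-flags and type fields from the NBD protocol
document; pack_request is a hypothetical helper, and whether a given
server accepts FUA on anything other than writes is exactly the open
question in this thread.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513

NBD_CMD_READ = 0
NBD_CMD_WRITE = 1
NBD_CMD_DISC = 2
NBD_CMD_FLUSH = 3
NBD_CMD_TRIM = 4

NBD_CMD_FLAG_FUA = 1 << 0

# Big-endian: magic (32), command flags (16), type (16),
# handle (64), offset (64), length (32).
REQUEST_FORMAT = ">IHHQQI"


def pack_request(cmd, handle, offset=0, length=0, flags=0):
    """Hypothetical helper: build the fixed-size NBD request header."""
    return struct.pack(
        REQUEST_FORMAT, NBD_REQUEST_MAGIC, flags, cmd, handle, offset, length
    )


# A FLUSH request with the FUA flag set (offset and length are zero).
flush_fua = pack_request(NBD_CMD_FLUSH, handle=7, flags=NBD_CMD_FLAG_FUA)

# A TRIM request with FUA, covering 4096 bytes at offset 8192.
trim_fua = pack_request(
    NBD_CMD_TRIM, handle=8, offset=8192, length=4096, flags=NBD_CMD_FLAG_FUA
)
```

Nothing in the encoding prevents the flag from being combined with any
command type; the restriction, if any, lives in the documented semantics.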

I think I might wait until structured replies are in though!

Alex Bligh

