Re: [PATCH v3 3/6] block/qcow2: introduce inflight writes counters: fix
From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH v3 3/6] block/qcow2: introduce inflight writes counters: fix discard
Date: Fri, 12 Mar 2021 18:39:14 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1
12.03.2021 17:58, Max Reitz wrote:
On 12.03.21 13:32, Vladimir Sementsov-Ogievskiy wrote:
12.03.2021 14:17, Max Reitz wrote:
On 12.03.21 10:09, Vladimir Sementsov-Ogievskiy wrote:
11.03.2021 22:58, Max Reitz wrote:
On 05.03.21 18:35, Vladimir Sementsov-Ogievskiy wrote:
There is a bug in qcow2: host cluster can be discarded (refcount
becomes 0) and reused during data write. In this case data write may
[..]
@@ -885,6 +1019,13 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
         if (refcount == 0) {
             void *table;
+            Qcow2InFlightRefcount *infl = find_infl_wr(s, cluster_index);
+
+            if (infl) {
+                infl->refcount_zero = true;
+                infl->type = type;
+                continue;
+            }
I don’t understand what this is supposed to do exactly. It seems like it wants
to keep metadata structures in the cache that are still in use (because
dropping them from the caches is what happens next), but users of metadata
structures won’t set in-flight counters for those metadata structures, will
they?
Don't follow.
We want the code in "if (refcount == 0)" to be triggered only when the full reference count
of the host cluster becomes 0, including inflight-write-cnt. So, if inflight-write-cnt is
not 0 at this point, we postpone freeing the host cluster; it will be done later from the
"slow path" in update_inflight_write_cnt().
But the code under “if (refcount == 0)” doesn’t free anything, does it? All I
can see is code to remove metadata structures from the metadata caches (if the
discarded cluster was an L2 table or a refblock), and finally the discard on
the underlying file. I don’t see how that protocol-level discard has anything
to do with our problem, though.
Hmm. Still, if we do this discard and then our in-flight write, we'll have
data instead of a hole. Not a big deal, but it seems better to postpone the discard.
On the other hand, clearing the caches is OK, as it's related only to
qcow2-refcount, not to inflight-write-cnt.
As far as I understand, the freeing happens immediately above the “if (refcount ==
0)” block by s->set_refcount() setting the refcount to 0. (including updating
s->free_cluster_index if the refcount is 0).
Hmm.. And that (setting s->free_cluster_index) is what I should actually prevent
until the total reference count becomes zero.
And about s->set_refcount(): it only updates the refcount itself and doesn't free
anything.
That is what freeing is, though. I consider something to be free when
allocation functions will allocate it. The allocation functions look at the
refcount, so once a cluster’s refcount is 0, it is free.
And with this patch I try to update the allocation functions to also look at
the inflight-write counters. If I missed something, it's a bug in the patch.
If that isn’t what freeing is, nothing in update_refcount() frees anything
(when looking at how data clusters are handled). Passing the discard through
to the protocol layer isn’t “freeing”, because it’s independent of qcow2.
Now, your patch adds an additional check to the allocation functions (whether
there are ongoing writes on the cluster), so it’s indeed possible that a
cluster can have a refcount of 0 but still won’t be used by allocation
functions.
But that means you’ve just changed the definition of what a free cluster is.
In fact, that means that nothing in update_refcount() can free a cluster that
has active writes to it, because now a cluster is only free if there are no
such writes. It follows that you needn’t change update_refcount() to prevent
clusters with such writes from being freed, because with this new definition of
what a free cluster is, it’s impossible for update_refcount() to free them.
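To make the changed definition concrete, here is a minimal sketch in C. It is a toy model, not the qcow2 code: ToyImage, toy_cluster_is_free() and toy_alloc_cluster() are hypothetical names, and the per-cluster arrays only stand in for the refcount table and the in-flight write counters discussed above. The point it illustrates is that once the allocator checks both values, a cluster with refcount 0 and a pending write is never handed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_CLUSTERS 8

/* Toy model (hypothetical, not the QEMU structures): one refcount and
 * one in-flight write counter per host cluster. */
typedef struct ToyImage {
    uint16_t refcount[N_CLUSTERS];
    unsigned inflight_writes[N_CLUSTERS];
} ToyImage;

/* Changed definition: a cluster is free only if nothing references it
 * AND no write to it is still in flight. */
static bool toy_cluster_is_free(const ToyImage *img, int idx)
{
    assert(idx >= 0 && idx < N_CLUSTERS);
    return img->refcount[idx] == 0 && img->inflight_writes[idx] == 0;
}

/* First-fit allocation: returns the allocated cluster index, or -1 if
 * no cluster is free under the definition above. */
static int toy_alloc_cluster(ToyImage *img)
{
    for (int i = 0; i < N_CLUSTERS; i++) {
        if (toy_cluster_is_free(img, i)) {
            img->refcount[i] = 1;
            return i;
        }
    }
    return -1;
}
```

In this model, setting a refcount to 0 cannot by itself make a cluster reusable while a write is pending, which is the observation made above about update_refcount().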
But as I noted somewhere else, update_refcount() should not discard the host
cluster in parallel with an in-flight write. It's not completely wrong, but it's
inefficient.
(Yes, you’re right that it would be nice to postpone the protocol-level discard
still, but not doing so wouldn’t be a catastrophe – which shows that it has
little to do with actually freeing something, as far as qcow2 is concerned.
If it’s just about postponing the discard, we can do exactly that: Let
update_refcount() skip discarding for clusters that are still in use, and then
let update_inflight_write_cnt() only do that discard instead of invoking all of
qcow2_update_cluster_refcount().)
Agree, yes.
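The "postpone only the discard" idea could look roughly like the following C sketch. Everything here is hypothetical and simplified: PdState, pd_drop_ref(), pd_write_begin(), pd_write_end() and pd_protocol_discard() are invented stand-ins for update_refcount(), the in-flight counter handling and the protocol-level discard, and a counter replaces the real discard request.

```c
#include <assert.h>
#include <stdbool.h>

#define N_CLUSTERS 8

/* Toy model (hypothetical names, not the QEMU structures). */
typedef struct PdState {
    unsigned refcount[N_CLUSTERS];
    unsigned inflight_writes[N_CLUSTERS];
    bool discard_pending[N_CLUSTERS];     /* a discard is owed to this cluster */
    unsigned discards_issued[N_CLUSTERS]; /* stand-in for real discard requests */
} PdState;

/* Stand-in for passing the discard down to the protocol layer. */
static void pd_protocol_discard(PdState *s, int idx)
{
    s->discards_issued[idx]++;
}

/* Drop one reference. The refcount is always updated immediately; only
 * the discard is deferred if a write to the cluster is in flight. */
static void pd_drop_ref(PdState *s, int idx)
{
    assert(s->refcount[idx] > 0);
    if (--s->refcount[idx] > 0) {
        return;
    }
    if (s->inflight_writes[idx] > 0) {
        s->discard_pending[idx] = true;
    } else {
        pd_protocol_discard(s, idx);
    }
}

static void pd_write_begin(PdState *s, int idx)
{
    s->inflight_writes[idx]++;
}

/* When the last in-flight write ends, issue the owed discard and
 * nothing else: no refcount update is needed here. */
static void pd_write_end(PdState *s, int idx)
{
    assert(s->inflight_writes[idx] > 0);
    if (--s->inflight_writes[idx] == 0 && s->discard_pending[idx]) {
        s->discard_pending[idx] = false;
        pd_protocol_discard(s, idx);
    }
}
```

The sketch assumes the allocation functions already refuse clusters with in-flight writes, so the refcount cannot rise again between pd_drop_ref() and pd_write_end().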
Alternatively, we could also not change the definition of what a free cluster
is, which means we wouldn’t need to change the allocation functions, but
instead postpone the refcount update that update_refcount() does. That would
mean we’d actually need to really drop the refcount in
update_inflight_write_cnt() instead of doing a -0.
Hmm, that should work too. Do you think it is better? With the first approach,
the meaning of a zero refcount changes (it no longer means a free cluster; we must keep
inflight-write-cnt in mind too). So we should update the functions interested in a zero
refcount. With the second approach, refcount=1 changes its meaning (the cluster may
actually be referenced by an inflight-write-cnt object, not by some qcow2 metadata
object).. Shouldn't we update some functions that are interested in refcount=1?
Intuitively it seems safe enough. There is nothing dangerous in refcount=1 for a
cluster which is actually unused.
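For comparison, the second approach (postponing the refcount update itself) might be sketched as below. Again this is only a toy model under stated assumptions: PrState, pr_drop_ref(), pr_write_begin(), pr_write_end() and pr_cluster_is_free() are hypothetical names, and the sketch ignores metadata caches and discards entirely. It shows the refcount staying at 1 while a write is in flight, and the real decrement happening when the last write finishes.

```c
#include <assert.h>
#include <stdbool.h>

#define N_CLUSTERS 8

/* Toy model (hypothetical names, not the QEMU structures). */
typedef struct PrState {
    unsigned refcount[N_CLUSTERS];
    unsigned inflight_writes[N_CLUSTERS];
    bool drop_postponed[N_CLUSTERS]; /* the final decrement is still owed */
} PrState;

/* The definition of "free" is unchanged: refcount == 0. */
static bool pr_cluster_is_free(const PrState *s, int idx)
{
    return s->refcount[idx] == 0;
}

/* Drop one reference. If this would take the refcount to zero while a
 * write is in flight, keep it at 1 and remember the owed decrement. */
static void pr_drop_ref(PrState *s, int idx)
{
    assert(s->refcount[idx] > 0);
    if (s->refcount[idx] == 1 && s->inflight_writes[idx] > 0) {
        assert(!s->drop_postponed[idx]); /* a second drop would underflow */
        s->drop_postponed[idx] = true;
        return;
    }
    s->refcount[idx]--;
}

static void pr_write_begin(PrState *s, int idx)
{
    s->inflight_writes[idx]++;
}

/* When the last write ends, perform the real decrement (not a "-0"
 * update), which is what finally frees the cluster. */
static void pr_write_end(PrState *s, int idx)
{
    assert(s->inflight_writes[idx] > 0);
    if (--s->inflight_writes[idx] == 0 && s->drop_postponed[idx]) {
        s->drop_postponed[idx] = false;
        s->refcount[idx]--;
    }
}
```

In this variant a cluster with refcount 1 may be held only by a pending write, which is the changed meaning of refcount=1 mentioned above.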
--
Best regards,
Vladimir