qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock)
Date: Tue, 30 Mar 2021 16:25:32 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.9.0

30.03.2021 15:51, Max Reitz wrote:
On 30.03.21 12:51, Vladimir Sementsov-Ogievskiy wrote:
30.03.2021 12:49, Max Reitz wrote:
On 25.03.21 20:12, Vladimir Sementsov-Ogievskiy wrote:
ping. Do we want it for 6.0?

I’d rather wait.  I think the conclusion was that guests shouldn’t hit this 
because they serialize discards?

I think, that we never had bugs, so we of course can wait.


There’s also something Kevin wrote on IRC a couple of weeks ago, for which I 
had hoped he’d sent an email but I don’t think he did, so I’ll try to remember 
and paraphrase as well as I can...

He basically asked whether it wouldn’t be conceptually simpler to take a 
reference to some cluster in get_cluster_offset() and later release it with a 
to-be-added put_cluster_offset().

He also noted that reading is problematic, too, because if you read a discarded 
and reused cluster, this might result in an information leak (some guest 
application might be able to read data it isn’t allowed to read); that’s why 
making get_cluster_offset() the point of locking clusters against discarding 
would be better.

Yes, I thought about read too, (RFCed in cover letter of [PATCH v5 0/6] qcow2: 
fix parallel rewrite and discard (lockless))


This would probably work with both of your solutions.  For the in-memory 
solutions, you’d take a refcount to an actual cluster; in the CoRwLock 
solution, you’d take that lock.

What do you think?


Hmm. What do you mean? Just rename my qcow2_inflight_writes_inc() and 
qcow2_inflight_writes_dec() to get_cluster_offset()/put_cluster_offset(), to 
make it more native to use for read operations as well?

Hm.  Our discussion wasn’t so detailed.

I interpreted it to mean all qcow2 functions that find an offset to a qcow2 
cluster, namely qcow2_get_host_offset(), qcow2_alloc_host_offset(), and 
qcow2_alloc_compressed_cluster_offset().

What about qcow2_alloc_clusters() ?


When those functions return an offset (in)to some cluster, that cluster (or the 
image as a whole) should be locked against discards.  Every offset received 
this way would require an accompanying qcow2_put_host_offset().

Or to update any kind of "getting cluster offset" in the whole qcow2 driver to take a 
kind of "dynamic reference count" by get_cluster_offset() and then call corresponding 
put() somewhere? In this case I'm afraid it's a lot more work..

Hm, really?  I would have assumed we need to do some locking in all functions 
that get a cluster offset this way, so it should be less work to take the lock 
in the functions they invoke to get the offset.

It would be also the problem that a lot of paths in qcow2 are not in coroutine and 
don't even take s->lock when they actually should.

I’m not sure what you mean here, because all functions that invoke any of the 
three functions I listed above are coroutine_fns (or, well, I didn’t look it 
up, but they all have *_co_* in their name).

qcow2_alloc_clusters() has a lot more callers..


This will also mean that we do same job as normal qcow2 refcounts already do: no sense in 
keeping additional "dynamic refcount" for L2 table cluster while reading it, as 
we already have non-zero qcow2 normal refcount for it..

I’m afraid I don’t understand how normal refcounts relate to this.  For 
example, qcow2_get_host_offset() doesn’t touch refcounts at all.


I mean the following: remember our discussion about what is free-cluster. If we add 
"dynamic-refcount", or "infligth-write-counter" thing only to count inflight data-writing 
(or, as discussed, we should count data-reads as well) operations, than "full reference count" of 
the cluster is inflight-write-count + qcow2-metadata-refcount.

But if we add a kind of "dynamic refcount" for any use of host cluster, for example reading of L2 table, than 
we duplicate the reference in qcow2-metadata to this L2 table (represented as refcount) by our "dynamic 
refcount", and we don't have a concept of "full reference count" as the sum above.. We still should 
treat a cluster as free when both "dynamic refcount" and qcow2-metadata-refcount are zero, but their sum 
doesn't have a good sense. Not a problem maybe.. But looks like a complication with no benefit.


==

OK, I think now that you didn't mean qcow2_alloc_clusters(). So, we are saying about only 
functions returning an offset to cluster with "guest data", not to any kind of 
host cluster. Than what you propose looks like this to me:

 - take my v5
 - rename qcow2_inflight_writes_dec() to put_cluster_offset()
 - call qcow2_inflight_writes_inc() from the three functions you mention

That make sense to me. Still, put_cluster_offset() name doesn't make it obvious that it's 
only for clusters with "guest data", and we shouldn't call it when work with 
metadata clusters.

--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]