|
From: | Max Reitz |
Subject: | Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock) |
Date: | Tue, 30 Mar 2021 14:51:06 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.0 |
On 30.03.21 12:51, Vladimir Sementsov-Ogievskiy wrote:
30.03.2021 12:49, Max Reitz wrote:On 25.03.21 20:12, Vladimir Sementsov-Ogievskiy wrote:ping. Do we want it for 6.0?I’d rather wait. I think the conclusion was that guests shouldn’t hit this because they serialize discards?I think, that we never had bugs, so we of course can wait.There’s also something Kevin wrote on IRC a couple of weeks ago, for which I had hoped he’d sent an email but I don’t think he did, so I’ll try to remember and paraphrase as well as I can...He basically asked whether it wouldn’t be conceptually simpler to take a reference to some cluster in get_cluster_offset() and later release it with a to-be-added put_cluster_offset().He also noted that reading is problematic, too, because if you read a discarded and reused cluster, this might result in an information leak (some guest application might be able to read data it isn’t allowed to read); that’s why making get_cluster_offset() the point of locking clusters against discarding would be better.Yes, I thought about read too, (RFCed in cover letter of [PATCH v5 0/6] qcow2: fix parallel rewrite and discard (lockless))This would probably work with both of your solutions. For the in-memory solutions, you’d take a refcount to an actual cluster; in the CoRwLock solution, you’d take that lock.What do you think?Hmm. What do you mean? Just rename my qcow2_inflight_writes_inc() and qcow2_inflight_writes_dec() to get_cluster_offset()/put_cluster_offset(), to make it more native to use for read operations as well?
Hm. Our discussion wasn’t so detailed.I interpreted it to mean all qcow2 functions that find an offset to a qcow2 cluster, namely qcow2_get_host_offset(), qcow2_alloc_host_offset(), and qcow2_alloc_compressed_cluster_offset().
When those functions return an offset (in)to some cluster, that cluster (or the image as a whole) should be locked against discards. Every offset received this way would require an accompanying qcow2_put_host_offset().
Or to update any kind of "getting cluster offset" in the whole qcow2 driver to take a kind of "dynamic reference count" by get_cluster_offset() and then call corresponding put() somewhere? In this case I'm afraid it's a lot more work..
Hm, really? I would have assumed we need to do some locking in all functions that get a cluster offset this way, so it should be less work to take the lock in the functions they invoke to get the offset.
It would be also the problem that a lot of paths in qcow2 are not in coroutine and don't even take s->lock when they actually should.
I’m not sure what you mean here, because all functions that invoke any of the three functions I listed above are coroutine_fns (or, well, I didn’t look it up, but they all have *_co_* in their name).
This will also mean that we do same job as normal qcow2 refcounts already do: no sense in keeping additional "dynamic refcount" for L2 table cluster while reading it, as we already have non-zero qcow2 normal refcount for it..
I’m afraid I don’t understand how normal refcounts relate to this. For example, qcow2_get_host_offset() doesn’t touch refcounts at all.
Max
[Prev in Thread] | Current Thread | [Next in Thread] |