Re: [PATCH v3 6/6] block/qcow2: use seqcache for compressed writes
From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH v3 6/6] block/qcow2: use seqcache for compressed writes
Date: Mon, 15 Mar 2021 17:40:49 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1
15.03.2021 12:58, Max Reitz wrote:
On 12.03.21 19:43, Vladimir Sementsov-Ogievskiy wrote:
12.03.2021 21:15, Max Reitz wrote:
On 05.03.21 18:35, Vladimir Sementsov-Ogievskiy wrote:
Compressed writes are not aligned to 512 bytes, which is very slow in
O_DIRECT mode. Let's use the cache.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
block/coroutines.h | 3 +
block/qcow2.h | 4 ++
block/qcow2-refcount.c | 10 +++
block/qcow2.c | 158 ++++++++++++++++++++++++++++++++++++++---
4 files changed, 164 insertions(+), 11 deletions(-)
[...]
@@ -2699,6 +2796,12 @@ static void qcow2_close(BlockDriverState *bs)
qcow2_inactivate(bs);
}
+ /*
+ * Cache should be flushed in qcow2_inactivate() and should be empty in
+ * inactive mode. So we are safe to free it.
+ */
+ seqcache_free(s->compressed_cache);
+
cache_clean_timer_del(bs);
qcow2_cache_destroy(s->l2_table_cache);
qcow2_cache_destroy(s->refcount_block_cache);
@@ -4558,18 +4661,42 @@ qcow2_co_pwritev_compressed_task(BlockDriverState *bs,
goto fail;
}
- qcow2_inflight_writes_inc(bs, cluster_offset, out_len);
+ if (s->compressed_cache) {
Why is this conditional?
We don't have a compressed_cache in the non-O_DIRECT case.
Oh right.
+ /*
+ * It's important to do seqcache_write() in the same critical section
+ * (by s->lock) as qcow2_alloc_compressed_cluster_offset(), so that the
+ * cache is filled sequentially.
+ */
Yes.
+ seqcache_write(s->compressed_cache, cluster_offset, out_len, out_buf);
- qemu_co_mutex_unlock(&s->lock);
+ qemu_co_mutex_unlock(&s->lock);
- BLKDBG_EVENT(s->data_file, BLKDBG_WRITE_COMPRESSED);
- ret = bdrv_co_pwrite(s->data_file, cluster_offset, out_len, out_buf, 0);
+ ret = qcow2_co_compressed_flush_one(bs, false);
The qcow2 doc says a compressed cluster can span multiple host clusters. I
don’t know whether that can happen with this driver, but if it does, wouldn’t
that mean we’d need to flush two clusters here? Oh, no, never mind. Only the
first one would be finished and thus flushed, not the second one.
I could have now removed the above paragraph, but it made me think, so I kept
it:
Hm. Actually, if we unconditionally flush here, doesn’t that mean that we’ll
never have a finished cluster in the cache for longer than the span between the
seqcache_write() and this qcow2_co_compressed_flush_one()? I.e., the
qcow2_co_flush_compressed_cache() is supposed to never flush any finished
cluster, but only the currently active unfinished cluster (if there is one),
right?
Hmm. If we have parallel write and flush requests, it's a kind of race
condition: maybe the flush will flush both the finished and the unfinished
cluster, or maybe the write will flush the finished cluster and the flush will
flush only the unfinished one. Moreover, we may have several parallel write
requests, so they produce several finished clusters, and a sudden flush will
then flush them all.
OK. I was mostly asking because I was wondering how much you expect the cache
to be filled, i.e., how much you expect the read cache to help.
That should depend on how many parallel write requests are allowed (and that is
the actual limit on the cache size). We could add a "bottom limit" for the cache,
i.e. don't flush while there are fewer than X clusters in it (see the sketch below).
Still, I don't think there are real scenarios where both compressed reads and writes
are done. And anyway, to accelerate compressed reads we'd better do some simple
read-ahead.
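To make the "bottom limit" idea a bit more concrete, here is a minimal,
self-contained sketch. It is not the real seqcache code; ToyCluster,
MIN_KEEP_CLUSTERS and the toy_* helpers are made-up names. Finished clusters
are flushed oldest-first, but a fixed number of them is kept in the cache so
that a parallel compressed read still has a chance to hit it. In the real
driver the write-out would of course be a bdrv_co_pwrite() under the proper
locking, not a printf().

#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical floor: keep at least this many finished clusters cached. */
#define MIN_KEEP_CLUSTERS 4

typedef struct ToyCluster {
    uint64_t offset;   /* host offset of the cached cluster */
    bool finished;     /* writes have already moved past this cluster */
    bool flushed;      /* its data has been written to the data file */
} ToyCluster;

/* Stand-in for the real write-out of one cluster's buffered bytes. */
static int toy_flush_cluster(ToyCluster *c)
{
    printf("flush cluster at offset %" PRIu64 "\n", c->offset);
    c->flushed = true;
    return 0;
}

/*
 * Flush finished clusters, oldest first, but stop while MIN_KEEP_CLUSTERS
 * finished-and-unflushed clusters remain in the cache.
 */
static int toy_flush_with_floor(ToyCluster *cache, size_t n)
{
    size_t pending = 0;

    for (size_t i = 0; i < n; i++) {
        if (cache[i].finished && !cache[i].flushed) {
            pending++;
        }
    }
    for (size_t i = 0; i < n && pending > MIN_KEEP_CLUSTERS; i++) {
        if (cache[i].finished && !cache[i].flushed) {
            int ret = toy_flush_cluster(&cache[i]);
            if (ret < 0) {
                return ret;
            }
            pending--;
        }
    }
    return 0;
}

int main(void)
{
    ToyCluster cache[6];

    for (size_t i = 0; i < 6; i++) {
        cache[i].offset = i * 65536;
        cache[i].finished = (i < 5); /* cache[5] is the active, unfinished one */
        cache[i].flushed = false;
    }

    /* Flushes only cache[0]; four finished clusters stay cached. */
    return toy_flush_with_floor(cache, 6);
}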
@@ -4681,10 +4808,19 @@ qcow2_co_preadv_compressed(BlockDriverState *bs,
out_buf = qemu_blockalign(bs, s->cluster_size);
- BLKDBG_EVENT(bs->file, BLKDBG_READ_COMPRESSED);
- ret = bdrv_co_pread(bs->file, coffset, csize, buf, 0);
- if (ret < 0) {
- goto fail;
+ /*
+ * seqcache_read may return fewer bytes than csize, as csize may exceed
+ * actual compressed data size. So we are OK if seqcache_read returns
+ * something > 0.
I was about to ask what happens when a compressed cluster spans two host
clusters (I could have imagined that in theory the second one could have been
discarded, but not the first one, so reading from the cache would really be
short -- we would have needed to check that we only fell short in the range of
512 bytes, not more).
But then I realized that in this version of the series, all finished clusters
are immediately discarded and only the current unfinished one is kept. Does it
even make sense to try seqcache_read() here, then?
Hmm. Not immediately, but after a flush. And the flush is not done under the mutex.
So in theory, at some moment, we may have several finished clusters "in flight". So
your question makes sense. The cache supports reading from consecutive clusters. But
to be safe we should also support reading one part of the data from disk and another
part from the cache here (sketched below).
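For what it's worth, here is a rough, self-contained sketch of that mixed case
(made-up toy_* names, not the actual seqcache/qcow2 code): the prefix of the
range that was already flushed is read from the "disk", while the tail that
still sits only in the cache must come from the cache, and the overall result
may legitimately be shorter than csize, since csize is only an upper bound on
the real compressed size. In the real driver the disk part would be a
bdrv_co_pread() and the cache part a seqcache_read().

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy model: "disk" is a flat byte array and the write cache holds one
 * contiguous, not-yet-flushed tail region of it,
 * [cache_start, cache_start + cache_len).
 */
static uint8_t disk[1024];
static uint8_t cache_buf[256];
static const uint64_t cache_start = 768;
static const uint64_t cache_len = 256;

/* Serve as many bytes as possible from the cache, starting at offset.
 * May return fewer bytes than requested (or 0). */
static uint64_t toy_cache_read(uint64_t offset, uint64_t bytes, uint8_t *buf)
{
    if (offset < cache_start || offset >= cache_start + cache_len) {
        return 0;
    }
    uint64_t avail = cache_start + cache_len - offset;
    uint64_t n = bytes < avail ? bytes : avail;
    memcpy(buf, cache_buf + (offset - cache_start), n);
    return n;
}

/*
 * Read csize bytes at coffset. The already-flushed prefix comes from the
 * disk; the tail that is still only in the cache must come from the cache,
 * since it may not have reached the disk yet.
 */
static int toy_compressed_read(uint64_t coffset, uint64_t csize, uint8_t *buf)
{
    uint64_t done = 0;

    if (coffset < cache_start) {
        uint64_t n = cache_start - coffset;
        if (n > csize) {
            n = csize;
        }
        memcpy(buf, disk + coffset, n);
        done = n;
    }
    if (done < csize) {
        done += toy_cache_read(coffset + done, csize - done, buf + done);
    }
    /*
     * done may still be < csize: csize is only an upper bound on the real
     * compressed size, so a short result is fine as long as we got something.
     */
    return done > 0 ? 0 : -1;
}

int main(void)
{
    memset(disk, 'D', sizeof(disk));
    memset(cache_buf, 'C', sizeof(cache_buf));

    uint8_t out[64];
    int ret = toy_compressed_read(736, sizeof(out), out); /* straddles 768 */
    printf("ret=%d first=%c last=%c\n", ret, out[0], out[sizeof(out) - 1]);
    return 0;
}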
The question is whether it really makes sense to even have a seqcache_read()
path when in reality it’s probably never accessed. I mean, besides the fact
that it seems based purely on chance whether a read might fetch something from
the cache even while we’re writing, in practice I don’t know any case where
we’d write to and read from a compressed qcow2 image at the same time. (I
don’t know what you’re doing with the 'compress' filter, though.)
Note that for the user this is not a parallel write and read to the same cluster:
1. the user writes cluster A; the request succeeds, the data is in the cache
2. the user writes some other clusters; the cache fills up, a flush starts
3. in parallel to [2] the user reads cluster A. From the user's point of view,
cluster A is already written and should be read back successfully
And seqcache_read() gives a simple, non-blocking way to support that read.
I have two additional thoughts in mind:
- After introducing the compress filter we actually do support parallel compressed
reads and writes. Even if nobody uses it, it works, and it should continue working
correctly.
- The only thing that prevents using a compressed device (with the help of the
compress filter) is that we can't rewrite a compressed cluster (rewrite it with
another compressed cluster, I mean). But that's not hard to implement, so at some
point the limitation will go away. Hmm, still I can't imagine a use case :)
I can imagine that both reads and writes may be used on a temporary image in a backup
scheme with reverse deltas, or for an external backup.
But rewriting compressed clusters makes sense only when we run a real guest on a
compressed image. Can that be helpful? Maybe for scenarios with a low disk usage
ratio.
--
Best regards,
Vladimir