

From: Denis V. Lunev
Subject: Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update
Date: Wed, 14 Jan 2015 15:30:17 +0300
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 14/01/15 15:01, Roman Kagan wrote:
On Tue, Jan 13, 2015 at 11:16:04PM +0300, Denis V. Lunev wrote:
On 13/01/15 18:17, Denis V. Lunev wrote:
On 13/01/15 17:50, Roman Kagan wrote:
On Tue, Dec 30, 2014 at 01:07:09PM +0300, Denis V. Lunev wrote:
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -194,7 +194,7 @@ static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
     tmp = cpu_to_le32(s->bat[idx]);
 
-    ret = bdrv_pwrite_sync(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
+    ret = bdrv_pwrite(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
     if (ret < 0) {
         return ret;
     }
If the preceding bdrv_truncate(+cluster_size) gets reordered with this
bat entry update, there's a risk that on poweroff (snapshot, etc.) the
bat entry gets written to the storage while the file size is not
updated.  As a result, the bat entry in the file would point at a
cluster which hadn't been allocated yet.  When a new block is eventually
added to the file, you'd have two bat entries pointing at the same
cluster.

The _sync version used to leave this window quite narrow due to the
flush following the write.  The patch makes the reordering more likely.

I'm afraid the only reliable way to handle it is to put a barrier
between truncate and bat update, and mitigate the costs by batching the
file expansions, as you seem to do in the followup patches.
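
[For illustration, a minimal sketch of the ordering described above, assuming
the 2015-era block-layer API (bdrv_getlength/bdrv_truncate/bdrv_flush/
bdrv_pwrite). bat_offset() is the helper from the patch; the function name,
the use of s->tracks as the cluster size, and the placement arithmetic are
simplified assumptions, not code from the series.]

/* Sketch only: grow the file, flush as a barrier, then publish the BAT entry. */
static int64_t allocate_sector_ordered(BlockDriverState *bs, int64_t sector_num)
{
    BDRVParallelsState *s = bs->opaque;
    uint32_t idx = sector_num / s->tracks;   /* cluster index, simplified */
    int64_t pos;
    uint32_t tmp;
    int ret;

    /* 1. Grow the image first; batching several clusters amortizes the cost. */
    pos = bdrv_getlength(bs->file);
    if (pos < 0) {
        return pos;
    }
    ret = bdrv_truncate(bs->file, pos + s->tracks * BDRV_SECTOR_SIZE);
    if (ret < 0) {
        return ret;
    }

    /* 2. Barrier: the new size must reach the storage before any BAT entry
     *    pointing into the new space does. */
    ret = bdrv_flush(bs->file);
    if (ret < 0) {
        return ret;
    }

    /* 3. Only now publish the allocation in the BAT (placement simplified). */
    s->bat[idx] = pos / BDRV_SECTOR_SIZE;
    tmp = cpu_to_le32(s->bat[idx]);
    ret = bdrv_pwrite(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
    if (ret < 0) {
        return ret;
    }
    return pos / BDRV_SECTOR_SIZE;
}

[Batching the truncate over several clusters, as the follow-up patches in the
series do, keeps the extra flush per allocation cheap.]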
Thinking a bit more on this: IMHO there is no problem
with the normal workflow; the problem could come
from a crash, after which the file state and the BAT state
would be inconsistent and the BAT entry could point
outside the file.
Yes that's exactly the scenario I've been talking about.

But it is not possible to fix it this way at all: a single sync
does not provide much of a guarantee.
How's that?

We should write a proper magic value into ParallelsHeader->inuse
on open and call sync immediately. On a subsequent
open of a broken image we should perform a
consistency check. The magic should be set to 0
on clean close.
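
[A rough sketch of the proposed marker, for illustration only. The header
type name follows the thread; the magic value, the helper name, and the
assumption that ParallelsHeader mirrors the packed on-disk layout are not
taken from the series.]

#define PARALLELS_INUSE_MAGIC 0x746f6e59   /* assumed value; any well-known non-zero tag works */

/* Set or clear the in-use marker in the on-disk header. */
static int parallels_update_inuse(BlockDriverState *bs, bool inuse)
{
    uint32_t tmp = cpu_to_le32(inuse ? PARALLELS_INUSE_MAGIC : 0);

    /* The sync variant is used so the marker reaches the storage before any
     * data writes on open, and after all of them on clean close.
     * offsetof() assumes the struct matches the on-disk header exactly. */
    return bdrv_pwrite_sync(bs->file, offsetof(ParallelsHeader, inuse),
                            &tmp, sizeof(tmp));
}

[On open the driver would set the marker and sync; if the field is already
non-zero, the image was not closed cleanly. On clean close it would clear the
marker the same way.]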
This seems to address the problem, but only if this protocol is adhered
to by all the tools that may access the image (Parallels Desktop,
Parallels Server, ploop, etc.).  Furthermore, if it is, it *has* to be
adhered to by this code, too, while adding write support, otherwise
we'll break the assumptions of those other tools.

I think that this stuff could be implemented separately
in the next patchset. If you think that this is mandatory,
OK, I can do that, but it will increase the patchset a lot.
A consistency check is not an easy thing to implement.
I think as a 1st approximation, instead of a full-fledged consistency
check and repair, just refusing write access to the image that has
->inuse set would suffice.

Roman.
ok, seems reasonable
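
[For reference, the first-approximation check agreed on above could look
roughly like this; the function name, the error code, and the exact header
type are illustrative assumptions, not code from the series.]

/* Sketch: refuse read-write access if the in-use marker is still set. */
static int parallels_check_not_inuse(BlockDriverState *bs, int flags,
                                     const ParallelsHeader *ph, Error **errp)
{
    if (le32_to_cpu(ph->inuse) != 0 && (flags & BDRV_O_RDWR)) {
        error_setg(errp, "parallels: image was not closed cleanly, "
                         "refusing read-write open");
        return -EACCES;
    }
    return 0;
}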


