qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-8.0] block/vhdx: fix dynamic VHDX BAT corruption


From: Kevin Wolf
Subject: Re: [PATCH for-8.0] block/vhdx: fix dynamic VHDX BAT corruption
Date: Tue, 11 Apr 2023 13:08:14 +0200

Am 08.04.2023 um 00:11 hat Lukas Tschoke geschrieben:
> The corruption occurs when a BAT entry aligned to 4096 bytes is changed.
> 
> Specifically, the corruption occurs during the creation of the LOG Data
> Descriptor. The incorrect behavior involves copying 4088 bytes from the
> original 4096 bytes aligned offset to `tmp[8..4096]` and then copying
> the new value for the first BAT entry to the beginning `tmp[0..8]`.
> This results in all existing BAT entries inside the 4K region being
> incorrectly moved by 8 bytes and the last entry being lost.
> 
> This bug did not cause noticeable corruption when only sequentially
> writing once to an empty dynamic VHDX (e.g.
> using `qemu-img convert -O vhdx -o subformat=dynamic ...`), but it
> still resulted in invalid values for the (unused) Sector Bitmap BAT
> entries.
> 
> Importantly, this corruption would only become noticeable after the
> corrupted BAT is re-read from the file.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/727
> Signed-off-by: Lukas Tschoke <lukts330@gmail.com>

Thanks, applied to the block branch.

Reviewing this function was a bit confusing, but after understanding
what each variable means, your change is clearly fixing a local bug and
therefore an improvement.

But now I'm wondering if it's really right that we don't have to handle
the case where we write only a few bytes and therefore can have a
leading and a trailing part in the same log sector.

In fact, having everything in the same sector actually seems to be the
only case that really happens because vhdx_log_write_and_flush() is only
called with length = 8 and offset = bat_entry_offset, which is a
multiple of 8.

Most of the cases should be in the middle of the BAT. vhdx_log_write()
uses the leading_length != 0 code path for them. This reads the part
before the written entry from the image, replaces the entry itself, but
seems to leave the buffer uninitialised for everything after the entry.
So does that part get corrupted?

I suspect that while your patch fixes some cases, the function still
isn't entirely right.

Do you happen to have a simple reproducer using only qemu-io and
qemu-img for the case you fixed? If so, maybe it could be relatively
easily adjusted to cover these other cases, too.

Kevin




reply via email to

[Prev in Thread] Current Thread [Next in Thread]