From: Peter Maydell
Subject: Re: [Qemu-block] [PULL v2 6/8] vmdk: Add read-only support for seSparse snapshots
Date: Tue, 2 Jul 2019 20:13:10 +0100

On Mon, 24 Jun 2019 at 15:48, Max Reitz <address@hidden> wrote:
>
> From: Sam Eiderman <address@hidden>
>
> Until ESXi 6.5 VMware used the vmfsSparse format for snapshots (VMDK3 in
> QEMU).
>
> This format was lacking in the following:
>
>     * Grain directory (L1) and grain table (L2) entries were 32-bit,
>       allowing access to only 2TB (slightly less) of data.
>     * The grain size (default) was 512 bytes - leading to data
>       fragmentation and many grain tables.
>     * For space reclamation purposes, it was necessary to find all the
>       grains which are not pointed to by any grain table - so a reverse
>       mapping of "offset of grain in vmdk" to "grain table" must be
>       constructed - which takes large amounts of CPU/RAM.
>

Hi; this change has confused Coverity a bit (CID 1402786):

> @@ -481,7 +529,7 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
>      int i;
>
>      /* read the L1 table */
> -    l1_size = extent->l1_size * sizeof(uint32_t);
> +    l1_size = extent->l1_size * extent->entry_size;
>      extent->l1_table = g_try_malloc(l1_size);
>      if (l1_size && extent->l1_table == NULL) {
>          return -ENOMEM;

>      }
>
>      if (extent->l1_backup_table_offset) {
> +        assert(!extent->sesparse);
>          extent->l1_backup_table = g_try_malloc(l1_size);
>          if (l1_size && extent->l1_backup_table == NULL) {
>              ret = -ENOMEM;

Here we allocate extent->l1_backup_table via g_try_malloc(),
and our "ENOMEM" guard checks l1_size. However, later on we do:

        for (i = 0; i < extent->l1_size; i++) {
            le32_to_cpus(&extent->l1_backup_table[i]);
        }

which dereferences l1_backup_table under a loop guard of
extent->l1_size. Previously Coverity could (probably) see
this was OK because we set
   l1_size = extent->l1_size * sizeof(uint32_t)
so l1_size is zero if and only if extent->l1_size is zero.
But now the initialization has changed to
   l1_size = extent->l1_size * extent->entry_size;
so if extent->entry_size is zero then l1_size can be 0
(getting us past the ENOMEM check) while extent->l1_size is
non-zero (causing us to dereference a NULL l1_backup_table
pointer).

This can't in fact happen, because if extent->l1_size is
non-zero we assert that extent->entry_size is either 8 or 4;
but the logic has become convoluted enough that it is no
longer clear to Coverity that this is an impossible case.
I could mark this as a false positive, but maybe there's a
way of rephrasing this code that makes the logic a little
clearer to both humans and robots?
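
For instance (an untested sketch, not a tested patch):
asserting up front that extent->entry_size is sane, before we
compute l1_size, would make the "l1_size == 0 iff
extent->l1_size == 0" invariant explicit:

    /* Untested sketch: validate entry_size before using it, so
     * that l1_size == 0 exactly when extent->l1_size == 0.
     */
    assert(extent->entry_size == sizeof(uint32_t) ||
           extent->entry_size == sizeof(uint64_t));
    l1_size = extent->l1_size * extent->entry_size;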

Two supplementary points:

(1) this code:
    for (i = 0; i < extent->l1_size; i++) {
        if (extent->entry_size == sizeof(uint64_t)) {
            le64_to_cpus((uint64_t *)extent->l1_table + i);
        } else {
            assert(extent->entry_size == sizeof(uint32_t));
            le32_to_cpus((uint32_t *)extent->l1_table + i);
        }
    }
only asserts that extent->entry_size is sane if extent->l1_size
is non-zero, and it does so inside the loop, so we assert once
per entry. Hiding the assert like this might be part of what's
confusing Coverity.
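
Hoisting the assert out of the loop (again an untested sketch)
would check it exactly once, whether or not the loop runs:

    /* Untested sketch: check the entry size once, before the
     * loop, rather than once per entry inside it.
     */
    assert(extent->entry_size == sizeof(uint64_t) ||
           extent->entry_size == sizeof(uint32_t));
    for (i = 0; i < extent->l1_size; i++) {
        if (extent->entry_size == sizeof(uint64_t)) {
            le64_to_cpus((uint64_t *)extent->l1_table + i);
        } else {
            le32_to_cpus((uint32_t *)extent->l1_table + i);
        }
    }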

(2) we allocate l1_backup_table with a size of l1_size,
which differs depending on extent->entry_size; but then
we do the endianness conversion on it using
        for (i = 0; i < extent->l1_size; i++) {
            le32_to_cpus(&extent->l1_backup_table[i]);
        }
which assumes the entries are always 32-bit. Is that correct?
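
If the backup table really is always 32-bit (the
assert(!extent->sesparse) quoted above suggests only
non-seSparse extents have one, and I believe those always use
32-bit entries), then an assert would document that
assumption; untested sketch:

    /* Untested sketch: if only non-seSparse extents have a
     * backup table, and those always use 32-bit entries, say
     * so explicitly before converting.
     */
    assert(extent->entry_size == sizeof(uint32_t));
    for (i = 0; i < extent->l1_size; i++) {
        le32_to_cpus(&extent->l1_backup_table[i]);
    }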

thanks
-- PMM



reply via email to

[Prev in Thread] Current Thread [Next in Thread]