qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RESEND 1/3] vfio/pci: fix a null pointer reference in vfio_ro


From: Markus Armbruster
Subject: Re: [PATCH RESEND 1/3] vfio/pci: fix a null pointer reference in vfio_rom_read
Date: Wed, 11 Mar 2020 08:04:28 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Alex Williamson <address@hidden> writes:

> On Mon, 24 Feb 2020 14:42:17 +0800
> "Longpeng(Mike)" <address@hidden> wrote:
>
>> From: Longpeng <address@hidden>
>> 
>> vfio_pci_load_rom() maybe failed and then the vdev->rom is NULL in
>> some situation (though I've not encountered yet), maybe we should
>> avoid the VM abort.

What "VM abort" exactly?

>> 
>> Signed-off-by: Longpeng <address@hidden>
>> ---
>>  hw/vfio/pci.c | 13 ++++++++-----
>>  1 file changed, 8 insertions(+), 5 deletions(-)
>> 
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 5e75a95..ed798ae 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -768,7 +768,7 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)
>>      }
>>  }
>>  
>> -static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
>> +static bool vfio_pci_load_rom(VFIOPCIDevice *vdev)
>>  {
>>      struct vfio_region_info *reg_info;
>>      uint64_t size;
>> @@ -778,7 +778,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
>>      if (vfio_get_region_info(&vdev->vbasedev,
>>                               VFIO_PCI_ROM_REGION_INDEX, &reg_info)) {
>>          error_report("vfio: Error getting ROM info: %m");
>> -        return;
>> +        return false;
>>      }
>>  
>>      trace_vfio_pci_load_rom(vdev->vbasedev.name, (unsigned 
>> long)reg_info->size,
>> @@ -797,7 +797,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
>>          error_printf("Device option ROM contents are probably invalid "
>>                      "(check dmesg).\nSkip option ROM probe with rombar=0, "
>>                      "or load from file with romfile=\n");
>> -        return;
>> +        return false;
>>      }
>>  
>>      vdev->rom = g_malloc(size);
>> @@ -849,6 +849,8 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
>>              data[6] = -csum;
>>          }
>>      }
>> +
>> +    return true;
>>  }
>>  
>>  static uint64_t vfio_rom_read(void *opaque, hwaddr addr, unsigned size)
>> @@ -863,8 +865,9 @@ static uint64_t vfio_rom_read(void *opaque, hwaddr addr, 
>> unsigned size)
    {
        VFIOPCIDevice *vdev = opaque;
        union {
            uint8_t byte;
            uint16_t word;
            uint32_t dword;
            uint64_t qword;
        } val;
>>      uint64_t data = 0;
>>  
>>      /* Load the ROM lazily when the guest tries to read it */
>> -    if (unlikely(!vdev->rom && !vdev->rom_read_failed)) {
>> -        vfio_pci_load_rom(vdev);
>> +    if (unlikely(!vdev->rom && !vdev->rom_read_failed) &&
>> +        !vfio_pci_load_rom(vdev)) {
>> +        return 0;
>>      }
>>  
>>      memcpy(&val, vdev->rom + addr,
>
> Looks like an obvious bug, until you look at the rest of this memcpy():
>
> memcpy(&val, vdev->rom + addr,
>            (addr < vdev->rom_size) ? MIN(size, vdev->rom_size - addr) : 0);
>
> IOW, we'll do a zero sized memcpy() if rom_size is zero, so there's no
> risk of the concern identified in the commit log.  This patch is
> unnecessary.  Thanks,

I'm blind: why does !vdev->rom imply !vdev->rom_size?

Moreover, when MIN(size, vdev->rom_size - addr) < size, we seem to read
uninitialized data from @val:

        switch (size) {
        case 1:
            data = val.byte;
            break;
        case 2:
            data = le16_to_cpu(val.word);
            break;
        case 4:
            data = le32_to_cpu(val.dword);
            break;
        default:
            hw_error("vfio: unsupported read size, %d bytes\n", size);
            break;
        }

        trace_vfio_rom_read(vdev->vbasedev.name, addr, size, data);

        return data;
    }

Why is that okay?

Why do we initialize @data?

How can we get to the default case?  If we can get there, is hw_error()
really the right thing to do?  It almost never is...  If getting there
is the guest's fault, we need to tell it off the same way physical
hardware does.  If we should not ever get there (i.e. it's a QEMU bug),
then a plain abort() would be clearer.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]