qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [Qemu-devel] [PATCH qemu v7 2/4] vfio/pci: Relax DMA map


From: Alexey Kardashevskiy
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH qemu v7 2/4] vfio/pci: Relax DMA map errors for MMIO regions
Date: Thu, 22 Mar 2018 18:45:33 +1100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Thunderbird/59.0

On 22/3/18 2:29 am, Alex Williamson wrote:
> On Mon, 19 Mar 2018 21:49:58 +0100
> Auger Eric <address@hidden> wrote:
> 
>> Hi Alexey,
>>
>> On 09/02/18 08:55, Alexey Kardashevskiy wrote:
>>> At the moment if vfio_memory_listener is registered in the system memory
>>> address space, it maps/unmaps every RAM memory region for DMA.
>>> It expects system page size aligned memory sections so vfio_dma_map
>>> would not fail and so far this has been the case. A mapping failure
>>> would be fatal. A side effect of such behavior is that some MMIO pages
>>> would not be mapped silently.
>>>
>>> However we are going to change MSIX BAR handling so we will end having
>>> non-aligned sections in vfio_memory_listener (more details is in
>>> the next patch) and vfio_dma_map will exit QEMU.
>>>
>>> In order to avoid fatal failures on what previously was not a failure and
>>> was just silently ignored, this checks the section alignment to
>>> the smallest supported IOMMU page size and prints an error if not aligned;
>>> it also prints an error if vfio_dma_map failed despite the page size check.
>>> Both errors are not fatal; only MMIO RAM regions are checked
>>> (aka "RAM device" regions).
>>>
>>> If the amount of errors printed is overwhelming, the MSIX relocation
>>> could be used to avoid excessive error output.
>>>
>>> This is unlikely to cause any behavioral change.
>>>
>>> Signed-off-by: Alexey Kardashevskiy <address@hidden>
>>> ---
>>>  hw/vfio/common.c | 55 
>>> +++++++++++++++++++++++++++++++++++++++++++++++++------
>>>  1 file changed, 49 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index f895e3c..736f271 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -544,18 +544,40 @@ static void vfio_listener_region_add(MemoryListener 
>>> *listener,
>>>  
>>>      llsize = int128_sub(llend, int128_make64(iova));
>>>  
>>> +    if (memory_region_is_ram_device(section->mr)) {
>>> +        hwaddr pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
>>> +
>>> +        if ((iova & pgmask) || (llsize & pgmask)) {
>>> +            error_report("Region 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx
>>> +                         " is not aligned to 0x%"HWADDR_PRIx
>>> +                         " and cannot be mapped for DMA",
>>> +                         section->offset_within_region,
>>> +                         int128_getlo(section->size),  
>> Didn't you want to print the section->offset_within_address_space
>> instead of section->offset_within_region?


I do, my bad - this offset_within_region is a leftover from an earlier
version, iova is offset_within_address_space essentially.


>>
>> For an 1000e card*, with a 64kB page, it outputs:
>> "Region 0x50..0xffb0 is not aligned to 0x10000 and cannot be mapped for DMA"
>>
>> an absolute gpa range would be simpler I think.
> 
> I agree, the offset within the address space seems like it'd be much
> easier to map back to the device.  Since we're handling it as
> non-fatal, perhaps we can even add "MMIO Region" to give the user a bit
> more context where the issue occurs.  Alexey, do you want to submit a
> follow-up patch, or Eric, you may have the best resources to tune the
> message.  Thanks,


I can if somebody gives a hint about what needs to be tuned in the message.
I am also fine if Eric does this too :)


> 
> Alex
>  
>> * where      Region 3: Memory at 100e0000 (32-bit, non-prefetchable)
>> [size=16K] contains the MSIX table
>>
>>      Capabilities: [a0] MSI-X: Enable+ Count=5 Masked-
>>              Vector table: BAR=3 offset=00000000
>>              PBA: BAR=3 offset=00002000
>>
>> Thanks
>>
>> Eric
>>> +                         pgmask + 1);
>>> +            return;
>>> +        }
>>> +    }
>>> +
>>>      ret = vfio_dma_map(container, iova, int128_get64(llsize),
>>>                         vaddr, section->readonly);
>>>      if (ret) {
>>>          error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
>>>                       "0x%"HWADDR_PRIx", %p) = %d (%m)",
>>>                       container, iova, int128_get64(llsize), vaddr, ret);
>>> +        if (memory_region_is_ram_device(section->mr)) {
>>> +            /* Allow unexpected mappings not to be fatal for RAM devices */
>>> +            return;
>>> +        }
>>>          goto fail;
>>>      }
>>>  
>>>      return;
>>>  
>>>  fail:
>>> +    if (memory_region_is_ram_device(section->mr)) {
>>> +        error_report("failed to vfio_dma_map. pci p2p may not work");
>>> +        return;
>>> +    }
>>>      /*
>>>       * On the initfn path, store the first error in the container so we
>>>       * can gracefully fail.  Runtime, there's not much we can do other
>>> @@ -577,6 +599,7 @@ static void vfio_listener_region_del(MemoryListener 
>>> *listener,
>>>      hwaddr iova, end;
>>>      Int128 llend, llsize;
>>>      int ret;
>>> +    bool try_unmap = true;
>>>  
>>>      if (vfio_listener_skipped_section(section)) {
>>>          trace_vfio_listener_region_del_skip(
>>> @@ -629,13 +652,33 @@ static void vfio_listener_region_del(MemoryListener 
>>> *listener,
>>>  
>>>      trace_vfio_listener_region_del(iova, end);
>>>  
>>> -    ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
>>> +    if (memory_region_is_ram_device(section->mr)) {
>>> +        hwaddr pgmask;
>>> +        VFIOHostDMAWindow *hostwin;
>>> +        bool hostwin_found = false;
>>> +
>>> +        QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
>>> +            if (hostwin->min_iova <= iova && end <= hostwin->max_iova) {
>>> +                hostwin_found = true;
>>> +                break;
>>> +            }
>>> +        }
>>> +        assert(hostwin_found); /* or region_add() would have failed */
>>> +
>>> +        pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
>>> +        try_unmap = !((iova & pgmask) || (llsize & pgmask));
>>> +    }
>>> +
>>> +    if (try_unmap) {
>>> +        ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
>>> +        if (ret) {
>>> +            error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>>> +                         "0x%"HWADDR_PRIx") = %d (%m)",
>>> +                         container, iova, int128_get64(llsize), ret);
>>> +        }
>>> +    }
>>> +
>>>      memory_region_unref(section->mr);
>>> -    if (ret) {
>>> -        error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>>> -                     "0x%"HWADDR_PRIx") = %d (%m)",
>>> -                     container, iova, int128_get64(llsize), ret);
>>> -    }
>>>  
>>>      if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) {
>>>          vfio_spapr_remove_window(container,
>>>   
> 


-- 
Alexey



reply via email to

[Prev in Thread] Current Thread [Next in Thread]