From: Si-Wei Liu
Subject: Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init
Date: Mon, 1 Apr 2024 23:19:32 -0700
User-agent: Mozilla Thunderbird



On 2/14/2024 11:11 AM, Eugenio Perez Martin wrote:
On Wed, Feb 14, 2024 at 7:29 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
Hi Michael,

On 2/13/2024 2:22 AM, Michael S. Tsirkin wrote:
On Mon, Feb 05, 2024 at 05:10:36PM -0800, Si-Wei Liu wrote:
Hi Eugenio,

This new code looks good to me, and the original issue I saw with
x-svq=on should be gone. However, after rebasing my tree on top of this,
I found a new failure around setting up guest mappings at early boot;
please see attached the specific QEMU config and the corresponding event
traces. I haven't looked into the details yet, but thought you would
want to be aware of it ahead of time.

Regards,
-Siwei
Eugenio, were you able to reproduce? Siwei, did you have time to
look into this?
I didn't get a chance to look into the details this past week, but I
think it may have something to do with the (internals of the) iova tree
range allocation and the lookup routine. It started to fall apart at the
first vhost_vdpa_dma_unmap call showing up in the trace events, where it
should have gotten IOVA=0x2000001000, but an incorrect IOVA address
0x1000 ended up being returned from the iova tree lookup routine.

HVA                                   GPA                           IOVA
------------------------------------------------------------------------------------------------
Map
[0x7f7903e00000, 0x7f7983e00000)      [0x0, 0x80000000)             [0x1000, 0x80000000)
[0x7f7983e00000, 0x7f9903e00000)      [0x100000000, 0x2080000000)   [0x80001000, 0x2000001000)
[0x7f7903ea0000, 0x7f7903ec0000)      [0xfeda0000, 0xfedc0000)      [0x2000001000, 0x2000021000)

Unmap
[0x7f7903ea0000, 0x7f7903ec0000)      [0xfeda0000, 0xfedc0000)      [0x1000, 0x20000) ???
                                      shouldn't it be [0x2000001000, 0x2000021000) ???

It looks like the SVQ iova tree lookup routine vhost_iova_tree_find_iova(), which is called from vhost_vdpa_listener_region_del(), can't properly deal with overlapped regions. Specifically, q35's mch_realize() has the following; since open_high_smram is an alias into the same guest RAM block, its backing HVA range falls inside the RAM region's HVA range:

579     memory_region_init_alias(&mch->open_high_smram, OBJECT(mch), "smram-open-high",
580                              mch->ram_memory, MCH_HOST_BRIDGE_SMRAM_C_BASE,
581                              MCH_HOST_BRIDGE_SMRAM_C_SIZE);
582     memory_region_add_subregion_overlap(mch->system_memory, 0xfeda0000,
583                                         &mch->open_high_smram, 1);
584     memory_region_set_enabled(&mch->open_high_smram, false);

#0  0x0000564c30bf6980 in iova_tree_find_address_iterator (key=0x564c331cf8e0, value=0x564c331cf8e0, data=0x7fffb6d749b0) at ../util/iova-tree.c:96
#1  0x00007f5f66479654 in g_tree_foreach () at /lib64/libglib-2.0.so.0
#2  0x0000564c30bf6b53 in iova_tree_find_iova (tree=<optimized out>, map=map@entry=0x7fffb6d74a00) at ../util/iova-tree.c:114
#3  0x0000564c309da0a9 in vhost_iova_tree_find_iova (tree=<optimized out>, map=map@entry=0x7fffb6d74a00) at ../hw/virtio/vhost-iova-tree.c:70
#4  0x0000564c3085e49d in vhost_vdpa_listener_region_del (listener=0x564c331024c8, section=0x7fffb6d74aa0) at ../hw/virtio/vhost-vdpa.c:444
#5  0x0000564c309f4931 in address_space_update_topology_pass (as=as@entry=0x564c31ab1840 <address_space_memory>, old_view=old_view@entry=0x564c33364cc0, new_view=new_view@entry=0x564c333640f0, adding=adding@entry=false) at ../system/memory.c:977
#6  0x0000564c309f4dcd in address_space_set_flatview (as=0x564c31ab1840 <address_space_memory>) at ../system/memory.c:1079
#7  0x0000564c309f86d0 in memory_region_transaction_commit () at ../system/memory.c:1132
#8  0x0000564c309f86d0 in memory_region_transaction_commit () at ../system/memory.c:1117
#9  0x0000564c307cce64 in mch_realize (d=<optimized out>, errp=<optimized out>) at ../hw/pci-host/q35.c:584

However, it looks like iova_tree_find_address_iterator() only checks whether the translated address (HVA) falls into the range when trying to locate the desired IOVA, causing the first DMAMap that happens to overlap in translated address (HVA) space to be returned prematurely:

 89 static gboolean iova_tree_find_address_iterator(gpointer key, gpointer value,
 90                                                 gpointer data)
 91 {
 :
 :
 99     if (map->translated_addr + map->size < needle->translated_addr ||
100         needle->translated_addr + needle->size < map->translated_addr) {
101         return false;
102     }
103
104     args->result = map;
105     return true;
106 }
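
To make the failure mode concrete, below is a minimal standalone sketch (my own illustration, not QEMU code; the DMAMap layout mirrors include/qemu/iova-tree.h, where .size is inclusive, i.e. length - 1). Feeding it the two mappings from the trace shows that the needle built for the SMRAM alias already passes the HVA-only test against the big RAM map, and since the tree is walked in IOVA order, the RAM map's IOVA 0x1000 wins:

/* Hypothetical standalone illustration -- not QEMU code.  DMAMap
 * mirrors include/qemu/iova-tree.h, where .size is inclusive
 * (length - 1). */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t hwaddr;

typedef struct DMAMap {
    hwaddr iova;
    hwaddr translated_addr;   /* HVA */
    hwaddr size;              /* inclusive: length - 1 */
} DMAMap;

/* The same HVA-only overlap test used by iova_tree_find_address_iterator() */
static bool hva_overlaps(const DMAMap *map, const DMAMap *needle)
{
    return !(map->translated_addr + map->size < needle->translated_addr ||
             needle->translated_addr + needle->size < map->translated_addr);
}

int main(void)
{
    /* Guest RAM [0x0, 0x80000000) mapped at HVA 0x7f7903e00000 */
    DMAMap ram    = { .iova = 0x1000,
                      .translated_addr = 0x7f7903e00000,
                      .size = 0x80000000 - 1 };
    /* SMRAM alias: its HVA range lies inside the RAM map's HVA range */
    DMAMap smram  = { .iova = 0x2000001000,
                      .translated_addr = 0x7f7903ea0000,
                      .size = 0x20000 - 1 };
    /* Needle built by vhost_vdpa_listener_region_del() for the alias */
    DMAMap needle = { .translated_addr = 0x7f7903ea0000,
                      .size = 0x20000 - 1 };

    /* Both maps pass the HVA-only test; the tree is keyed by IOVA, so
     * the RAM map (IOVA 0x1000) is visited first and wins. */
    printf("ram overlaps needle:   %d\n", hva_overlaps(&ram, &needle));
    printf("smram overlaps needle: %d\n", hva_overlaps(&smram, &needle));
    return 0;
}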

The QEMU trace file reveals that the first DMAMap below gets returned incorrectly instead of the second one, the latter being what the actual IOVA corresponds to:

HVA                                   GPA                           IOVA
[0x7f7903e00000, 0x7f7983e00000)      [0x0, 0x80000000)             [0x1000, 0x80001000)
[0x7f7903ea0000, 0x7f7903ec0000)      [0xfeda0000, 0xfedc0000)      [0x2000001000, 0x2000021000)


Maybe, in addition to checking the HVA range, we should also match on the GPA, or at least require that the size match exactly?
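
A minimal sketch of that suggestion (my own, untested; it assumes the IOVATreeFindIOVAArgs layout from util/iova-tree.c and reuses the DMAMap sketch above): keep the HVA overlap test but reject candidates whose size differs from the needle's, since an unmap needle describes a whole mapping. Matching on GPA would additionally require DMAMap to record the GPA, which it currently doesn't:

#include <glib.h>

/* Assumed to match util/iova-tree.c (paraphrased, not copied): */
typedef struct IOVATreeFindIOVAArgs {
    const DMAMap *needle;
    const DMAMap *result;
} IOVATreeFindIOVAArgs;

static gboolean iova_tree_find_address_iterator(gpointer key, gpointer value,
                                                gpointer data)
{
    const DMAMap *map = key;
    IOVATreeFindIOVAArgs *args = data;
    const DMAMap *needle = args->needle;
    (void)value;   /* unused, as in the original iterator */

    /* Existing HVA overlap test, unchanged */
    if (map->translated_addr + map->size < needle->translated_addr ||
        needle->translated_addr + needle->size < map->translated_addr) {
        return false;
    }

    /* New: a partial HVA overlap is not the mapping being looked up;
     * require the sizes to match exactly before accepting it. */
    if (map->size != needle->size) {
        return false;
    }

    args->result = map;
    return true;
}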

Yes, I'm still not able to reproduce. In particular, I don't know how
the memory listener adds a region and then releases a region with a
different size. I'm talking about these log entries:

1706854838.154394:vhost_vdpa_listener_region_add vdpa: 0x556d45c75140 iova 0x0 llend 0x80000000 vaddr: 0x7f7903e00000 read-only: 0
452:vhost_vdpa_listener_region_del vdpa: 0x556d45c75140 iova 0x0 llend 0x7fffffff
I didn't see a different size here. If you're referring to the llend discrepancy between the two traces, the values logged by _add() and _del() have to be interpreted differently (exclusive vs. inclusive end) because of:

3d1e4d34 "vhost_vdpa: fix the input in trace_vhost_vdpa_listener_region_del()"
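
In other words (a toy demonstration of my reading of that commit, not QEMU code; treat the exact trace semantics as an assumption): _add() logs the exclusive end of the section while, after 3d1e4d34, _del() logs the inclusive end, so llend 0x80000000 and 0x7fffffff describe the same region:

/* Toy demonstration -- not QEMU code. */
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint64_t iova  = 0x0;
    uint64_t llend = 0x80000000;   /* exclusive end of the section */

    /* _add() logs the exclusive end ... */
    printf("region_add: iova 0x%" PRIx64 " llend 0x%" PRIx64 "\n", iova, llend);
    /* ... while _del(), after 3d1e4d34, logs llend - 1 (inclusive end) */
    printf("region_del: iova 0x%" PRIx64 " llend 0x%" PRIx64 "\n", iova, llend - 1);
    return 0;
}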

Regards,
-Siwei
Is it possible for you to also trace the skipped regions? We should
add a debug trace there too...

Thanks!

PS, I will be off starting today and for the next two weeks. I will try
to help look into this more closely after I get back.

-Siwei
   Can't merge patches which are known to break things ...



