From: Si-Wei Liu
Subject: Re: [PATCH v2 6/7] vdpa: move iova_tree allocation to net_vhost_vdpa_init
Date: Mon, 1 Apr 2024 23:19:32 -0700
User-agent: Mozilla Thunderbird
On 2/14/2024 11:11 AM, Eugenio Perez Martin wrote:
> On Wed, Feb 14, 2024 at 7:29 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>> Hi Michael,
>>
>> On 2/13/2024 2:22 AM, Michael S. Tsirkin wrote:
>>> On Mon, Feb 05, 2024 at 05:10:36PM -0800, Si-Wei Liu wrote:
>>>> Hi Eugenio,
>>>>
>>>> I thought this new code looked good to me, and the original issue I saw
>>>> with x-svq=on should be gone. However, after rebasing my tree on top of
>>>> this, there's a new failure I found around setting up guest mappings at
>>>> early boot; please see attached the specific QEMU config and the
>>>> corresponding event traces. I haven't checked into the details yet, but
>>>> thought you would want to be aware of this ahead of time.
>>>>
>>>> Regards,
>>>> -Siwei
>>>
>>> Eugenio, were you able to reproduce? Siwei, did you have time to look
>>> into this?
>>
>> I didn't get a chance to look into the details in the past week, but I
>> think it may have something to do with the (internals of the) iova tree
>> range allocation and the lookup routine. It started to fall apart at the
>> first vhost_vdpa_dma_unmap call showing up in the trace events, where it
>> should have gotten IOVA=0x2000001000, but an incorrect IOVA address
>> 0x1000 ended up being returned from the iova tree lookup routine.
>>
>> HVA                               GPA                          IOVA
>> ---------------------------------------------------------------------------------------
>> Map
>> [0x7f7903e00000, 0x7f7983e00000)  [0x0, 0x80000000)            [0x1000, 0x80000000)
>> [0x7f7983e00000, 0x7f9903e00000)  [0x100000000, 0x2080000000)  [0x80001000, 0x2000001000)
>> [0x7f7903ea0000, 0x7f7903ec0000)  [0xfeda0000, 0xfedc0000)     [0x2000001000, 0x2000021000)
>>
>> Unmap
>> [0x7f7903ea0000, 0x7f7903ec0000)  [0xfeda0000, 0xfedc0000)     [0x1000, 0x20000)
>> ??? shouldn't it be [0x2000001000, 0x2000021000) ???

It looks like the SVQ iova tree lookup routine vhost_iova_tree_find_iova(), which is called from vhost_vdpa_listener_region_del(), can't properly deal with overlapped regions. Specifically, q35's mch_realize() has the following:
579     memory_region_init_alias(&mch->open_high_smram, OBJECT(mch), "smram-open-high",
580                              mch->ram_memory, MCH_HOST_BRIDGE_SMRAM_C_BASE,
581                              MCH_HOST_BRIDGE_SMRAM_C_SIZE);
582     memory_region_add_subregion_overlap(mch->system_memory, 0xfeda0000,
583                                         &mch->open_high_smram, 1);
584     memory_region_set_enabled(&mch->open_high_smram, false);

#0  0x0000564c30bf6980 in iova_tree_find_address_iterator (key=0x564c331cf8e0, value=0x564c331cf8e0, data=0x7fffb6d749b0) at ../util/iova-tree.c:96
#1  0x00007f5f66479654 in g_tree_foreach () at /lib64/libglib-2.0.so.0
#2  0x0000564c30bf6b53 in iova_tree_find_iova (tree=<optimized out>, map=map@entry=0x7fffb6d74a00) at ../util/iova-tree.c:114
#3  0x0000564c309da0a9 in vhost_iova_tree_find_iova (tree=<optimized out>, map=map@entry=0x7fffb6d74a00) at ../hw/virtio/vhost-iova-tree.c:70
#4  0x0000564c3085e49d in vhost_vdpa_listener_region_del (listener=0x564c331024c8, section=0x7fffb6d74aa0) at ../hw/virtio/vhost-vdpa.c:444
#5  0x0000564c309f4931 in address_space_update_topology_pass (as=as@entry=0x564c31ab1840 <address_space_memory>, old_view=old_view@entry=0x564c33364cc0, new_view=new_view@entry=0x564c333640f0, adding=adding@entry=false) at ../system/memory.c:977
#6  0x0000564c309f4dcd in address_space_set_flatview (as=0x564c31ab1840 <address_space_memory>) at ../system/memory.c:1079
#7  0x0000564c309f86d0 in memory_region_transaction_commit () at ../system/memory.c:1132
#8  0x0000564c309f86d0 in memory_region_transaction_commit () at ../system/memory.c:1117
#9  0x0000564c307cce64 in mch_realize (d=<optimized out>, errp=<optimized out>) at ../hw/pci-host/q35.c:584
However, it looks like iova_tree_find_address_iterator() only checks whether the translated address (HVA) falls into the range when trying to locate the desired IOVA, causing the first DMAMap that happens to overlap in the translated-address (HVA) space to be returned prematurely:
 89 static gboolean iova_tree_find_address_iterator(gpointer key, gpointer value,
 90                                                 gpointer data)
 91 {
  :
  :
 99     if (map->translated_addr + map->size < needle->translated_addr ||
100         needle->translated_addr + needle->size < map->translated_addr) {
101         return false;
102     }
103
104     args->result = map;
105     return true;
106 }

The QEMU trace file reveals that the first DMAMap below gets returned incorrectly instead of the second, the latter of which is what the actual IOVA corresponds to:

HVA                               GPA                        IOVA
[0x7f7903e00000, 0x7f7983e00000)  [0x0, 0x80000000)          [0x1000, 0x80001000)
[0x7f7903ea0000, 0x7f7903ec0000)  [0xfeda0000, 0xfedc0000)   [0x2000001000, 0x2000021000)

Maybe other than checking the HVA range, we should also match the GPA, or at least require the size to match exactly?
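The premature match can be reproduced in isolation. The sketch below is a minimal, self-contained illustration of the HVA-only overlap test quoted above (the DMAMap struct and helper names are simplified stand-ins for QEMU's actual types, not the real API); fed the two maps from the trace, a lookup keyed only on the HVA range of the SMRAM alias matches the first (RAM) map and yields the bogus IOVA 0x1000:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for QEMU's DMAMap. As in util/iova-tree.c, "size"
 * behaves as length minus one (the overlap test uses strict '<'). */
typedef struct {
    uint64_t iova;
    uint64_t translated_addr; /* HVA */
    uint64_t size;
} DMAMap;

/* Mirrors the HVA-only overlap test on lines 99-100 of
 * iova_tree_find_address_iterator(): two maps "match" as soon as their
 * translated-address (HVA) ranges overlap at all. */
static bool hva_ranges_overlap(const DMAMap *map, const DMAMap *needle)
{
    return !(map->translated_addr + map->size < needle->translated_addr ||
             needle->translated_addr + needle->size < map->translated_addr);
}

/* Walk the maps in iova (tree-key) order and return the first HVA match,
 * the way g_tree_foreach() stops at the first iterator returning TRUE. */
static const DMAMap *find_first_hva_match(const DMAMap *maps, size_t n,
                                          const DMAMap *needle)
{
    for (size_t i = 0; i < n; i++) {
        if (hva_ranges_overlap(&maps[i], needle)) {
            return &maps[i];
        }
    }
    return NULL;
}
```

With the three Map entries from the trace and a needle covering the SMRAM alias's HVA range [0x7f7903ea0000, 0x7f7903ec0000), the first map [0x7f7903e00000, 0x7f7983e00000) overlaps it in HVA space and wins, so the lookup reports IOVA 0x1000 instead of 0x2000001000.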
> Yes, I'm still not able to reproduce. In particular, I don't know how the
> memory listener adds a region and then releases a region with a different
> size. I'm talking about these log entries:
>
> 1706854838.154394:vhost_vdpa_listener_region_add vdpa: 0x556d45c75140 iova 0x0 llend 0x80000000 vaddr: 0x7f7903e00000 read-only: 0
> 452:vhost_vdpa_listener_region_del vdpa: 0x556d45c75140 iova 0x0 llend 0x7fffffff

I didn't see a different size here, though if you are referring to the discrepancy in the llend values in the traces, I think the two values between _add() and _del() have to be interpreted differently, due to:
3d1e4d34 "vhost_vdpa: fix the input in trace_vhost_vdpa_listener_region_del()"
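So the two llend values in the log entries above differ by exactly one because one is an exclusive end and the other an inclusive end, not because the region sizes differ. A minimal sketch of that off-by-one, using the constants from the trace lines above (helper names are illustrative, not QEMU's):

```c
#include <stdint.h>

/* _add() traces the exclusive end of the region: iova + size. */
static uint64_t llend_exclusive(uint64_t iova, uint64_t size)
{
    return iova + size;
}

/* After commit 3d1e4d34, _del() traces the inclusive end instead,
 * i.e. the last byte covered by the region: iova + size - 1. */
static uint64_t llend_inclusive(uint64_t iova, uint64_t size)
{
    return iova + size - 1;
}
```

For the 2 GiB region at iova 0x0, llend_exclusive() gives 0x80000000 (the _add() trace) and llend_inclusive() gives 0x7fffffff (the _del() trace), matching the two log entries.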
Regards,
-Siwei
> Is it possible for you to also trace the skipped regions? We should add a
> debug trace there too...
>
> Thanks!

PS, I will be taking off from today and for the next two weeks. I will try to help out looking at this more closely after I get back.

-Siwei

> Can't merge patches which are known to break things ...