Re: [Qemu-devel] Qemu and 32 PCIe devices

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Qemu and 32 PCIe devices

From:	Laszlo Ersek
Subject:	Re: [Qemu-devel] Qemu and 32 PCIe devices
Date:	Wed, 9 Aug 2017 12:00:44 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 08/09/17 09:26, Paolo Bonzini wrote:
> On 09/08/2017 03:06, Laszlo Ersek wrote:
>>>   20.14%  qemu-system-x86_64                  [.] render_memory_region
>>>   17.14%  qemu-system-x86_64                  [.] subpage_register
>>>   10.31%  qemu-system-x86_64                  [.] int128_add
>>>    7.86%  qemu-system-x86_64                  [.] addrrange_end
>>>    7.30%  qemu-system-x86_64                  [.] int128_ge
>>>    4.89%  qemu-system-x86_64                  [.] int128_nz
>>>    3.94%  qemu-system-x86_64                  [.] phys_page_compact
>>>    2.73%  qemu-system-x86_64                  [.] phys_map_node_alloc
> 
> Yes, this is the O(n^3) thing.  An optimized build should be faster
> because int128 operations will be inlined and become much more efficient.
> 
>> With this patch, I only tested the "93 devices" case, as the slowdown
>> became visible to the naked eye from the trace messages, as the firmware
>> enabled more and more BARs / command registers (and inversely, the
>> speedup was perceivable when the firmware disabled more and more BARs /
>> command registers).
> 
> This is an interesting observation, and it's expected.  Looking at the
> O(n^3) complexity more in detail you have N operations, where the "i"th
> operates on "i" DMA address spaces, all of which have at least "i"
> memory regions (at least 1 BAR per device).

- Can you please give me a pointer to the code where the "i"th operation
works on "i" DMA address spaces? (Not that I dream about patching *that*
code, wherever it may live :) )

- You mentioned that changing this is on the ToDo list. I couldn't find
it under <https://wiki.qemu.org/index.php/ToDo>. Is it tracked somewhere
else?

(I'm not trying to urge any changes in the area, I'd just like to learn
about the code & the tracker item, if there's one.)

Thanks!
Laszlo

> 
> So the total cost is sum i=1..N i^2 = N(N+1)(2N+1)/6 = O(n^3).
> Expressing it as a sum shows why it gets slower as time progresses.
> 
> The solution is to note that those "i" address spaces are actually all
> the same, so we can get it down to sum i=1..N i = N(N+1)/2 = O(n^2).
> 
> Thanks,
> 
> Paolo
>

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] Qemu and 32 PCIe devices, Marcin Juszkiewicz, 2017/08/08
- Re: [Qemu-devel] Qemu and 32 PCIe devices, Laszlo Ersek, 2017/08/08
  - Re: [Qemu-devel] Qemu and 32 PCIe devices, Laszlo Ersek, 2017/08/08
    - Re: [Qemu-devel] Qemu and 32 PCIe devices, Paolo Bonzini, 2017/08/09
    - Re: [Qemu-devel] Qemu and 32 PCIe devices, Laszlo Ersek <=
    - Re: [Qemu-devel] Qemu and 32 PCIe devices, Paolo Bonzini, 2017/08/09
    - Re: [Qemu-devel] Qemu and 32 PCIe devices, Laszlo Ersek, 2017/08/09
    - Re: [Qemu-devel] Qemu and 32 PCIe devices, Peter Maydell, 2017/08/09
    - Re: [Qemu-devel] Qemu and 32 PCIe devices, Paolo Bonzini, 2017/08/09
    - Re: [Qemu-devel] Qemu and 32 PCIe devices, Michael S. Tsirkin, 2017/08/09

Prev by Date: [Qemu-devel] [PATCH v2 4/4] contrib/vhost-user-blk: introduce a vhost-user-blk sample application
Next by Date: [Qemu-devel] [PATCH v4 0/7] add throttle block driver filter
Previous by thread: Re: [Qemu-devel] Qemu and 32 PCIe devices
Next by thread: Re: [Qemu-devel] Qemu and 32 PCIe devices
Index(es):
- Date
- Thread