
Re: [PATCH v1 1/2] vhost: Defer filtering memory sections until building the vhost memory structure


From: David Hildenbrand
Subject: Re: [PATCH v1 1/2] vhost: Defer filtering memory sections until building the vhost memory structure
Date: Thu, 16 Feb 2023 13:39:03 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0

On 16.02.23 13:21, Michael S. Tsirkin wrote:
> On Thu, Feb 16, 2023 at 01:10:54PM +0100, David Hildenbrand wrote:
>> On 16.02.23 13:04, Michael S. Tsirkin wrote:
>>> On Thu, Feb 16, 2023 at 12:47:51PM +0100, David Hildenbrand wrote:
>>>> Having multiple devices, some filtering memslots and some not, messes
>>>> up the "used_memslot" accounting. If we had a device that filters out
>>>> fewer memory sections after a device that filters out more, we would
>>>> be in trouble, because our memslot checks stop working reliably. For
>>>> example, hotplugging a device that filters out fewer memslots might
>>>> pass the checks based on max vs. used memslots, but can still run out
>>>> of memslots when getting notified about all memory sections.
>>>>
>>>> Further, it will be helpful in the memory device context in the near
>>>> future to know that a RAM memory region section will consume a
>>>> memslot and be accounted for in the used vs. free memslots, such that
>>>> we can properly implement reservation of memslots for memory devices.
>>>> Whether a device filters this out and would theoretically still have
>>>> a free memslot is then hidden internally, making overall vhost
>>>> memslot accounting easier.
>>>>
>>>> Let's filter the memslots when creating the vhost memory array,
>>>> accounting all RAM && !ROM memory regions as "used_memslots", even if
>>>> vhost-user isn't interested in anonymous RAM regions because it needs
>>>> an fd.
>>>>
>>>> When a device actually filters out regions (which should happen
>>>> rarely in practice), we might detect a layout change although only
>>>> filtered regions changed. We won't bother optimizing that for now.
>>>
>>> That caused trouble in the past when using VGA, because it plays with
>>> mappings in weird ways. I think we have to optimize it, sorry.
>>
>> We still filter them out, just later.
>
> The issue is sending lots of unnecessary system calls to update the
> kernel, which goes through a slow RCU.

I don't think that is the case when deferring only the device-specific
filtering. As discussed, the generic filtering (ignore !ram, ignore rom,
ignore VMA) remains in place, because it is identical for all devices.


>>>> Note: we cannot simply filter out the regions and count them as
>>>> "filtered" to add them to "used", because filtered regions could get
>>>> merged, resulting in a smaller effective number of memslots. Further,
>>>> we won't touch the HMP/QMP virtio introspection output.
>>>>
>>>> Fixes: 988a27754bbb ("vhost: allow backends to filter memory sections")
>>>> Cc: Tiwei Bie <tiwei.bie@intel.com>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
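[Editor's note: the accounting the commit message describes, counting every RAM && !ROM section as used while applying the device-specific filter only when building the backend's memory table, can be sketched roughly as follows. This is a simplified toy model with hypothetical type and helper names, not the actual QEMU vhost code.]

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for QEMU's real structures. */
typedef struct MemorySection {
    bool is_ram;
    bool is_rom;
    int fd;            /* -1 if the region is not backed by an fd */
} MemorySection;

typedef struct VhostDev {
    /* Device-specific filter, e.g. vhost-user requires fd-backed RAM. */
    bool (*backend_can_use)(const MemorySection *s);
} VhostDev;

/* Generic filter: identical for all devices, applied up front. */
static bool generic_keep(const MemorySection *s)
{
    return s->is_ram && !s->is_rom;
}

/*
 * Count every RAM && !ROM section toward "used_memslots", but copy only
 * the sections the backend can actually use into the table handed to it.
 * Returns the number of used memslots; *table_len receives the number of
 * entries in the backend table.
 */
static size_t build_vhost_memory(const VhostDev *dev,
                                 const MemorySection *sections, size_t n,
                                 const MemorySection **table,
                                 size_t *table_len)
{
    size_t used = 0, out = 0;

    for (size_t i = 0; i < n; i++) {
        if (!generic_keep(&sections[i])) {
            continue;
        }
        used++;   /* accounted for even if this device filters it out */
        if (!dev->backend_can_use || dev->backend_can_use(&sections[i])) {
            table[out++] = &sections[i];
        }
    }
    *table_len = out;
    return used;
}
```

In this model, two devices with different filters report the same "used" count, which is what keeps the max vs. used memslot checks consistent across hotplug.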

>>> I didn't review this yet, but maybe you can answer: will this create
>>> more slots for the backend? Because some backends are limited in the
>>> number of slots, and breaking them is not a good idea.
>>
>> It restores the handling we had before 988a27754bbb. RAM without an fd
>> should be rare for vhost-user setups (where we actually filter), I
>> assume?
>
> Hmm, I guess so.

At least on simplistic QEMU invocations with vhost-user (and proper
shared memory as the backend) I don't see any such filtering happening,
because everything that is RAM is properly fd-based.

IMHO the chances of breaking a sane VM setup are very small.

--
Thanks,

David / dhildenb



