[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFCv2 for-4.0 4/5] virtio-balloon: Use ram_block_disca
From: |
David Gibson |
Subject: |
Re: [Qemu-devel] [RFCv2 for-4.0 4/5] virtio-balloon: Use ram_block_discard_range() instead of raw madvise() |
Date: |
Thu, 6 Dec 2018 09:56:50 +1100 |
User-agent: |
Mutt/1.10.1 (2018-07-13) |
On Wed, Dec 05, 2018 at 08:59:06AM +0100, David Hildenbrand wrote:
> On 05.12.18 06:06, David Gibson wrote:
> > Currently, virtio-balloon uses madvise() with MADV_DONTNEED to actually
> > discard RAM pages inserted into the balloon. This is basically a Linux
> > only interface (MADV_DONTNEED exists on some other platforms, but doesn't
> > always have the same semantics). It also doesn't work on hugepages and has
> > some other limitations.
> >
> > It turns out that postcopy also needs to discard chunks of memory, and uses
> > a better interface for it: ram_block_discard_range(). It doesn't cover
> > every case, but it covers more than going direct to madvise() and this
> > gives us a single place to update for more possibilities in future.
> >
> > There are some subtleties here to maintain the current balloon behaviour:
> >
> > * For now, we just ignore requests to balloon in a hugepage backed region.
> > That matches current behaviour, because MADV_DONTNEED on a hugepage would
> > simply fail, and we ignore the error.
> >
> > * If host page size is > BALLOON_PAGE_SIZE we can frequently call this on
> > non-host-page-aligned addresses. These would also fail in madvise(),
> > which we then ignored. ram_block_discard_range() error_report()s calls
> > on unaligned addresses, so we explicitly check that case to avoid
> > spamming the logs.
> >
> > * We now call ram_block_discard_range() with the *host* page size, whereas
> > we previously called madvise() with BALLOON_PAGE_SIZE. Surprisingly,
> > this also matches existing behaviour. Although the kernel fails madvise
> > on unaligned addresses, it will round unaligned sizes *up* to the host
> > page size. Yes, this means that if BALLOON_PAGE_SIZE < guest page size
> > we can incorrectly discard more memory than the guest asked us to. I'm
> > planning to address that soon.
> >
> > Errors other than the ones discussed above, will now be reported by
> > ram_block_discard_range(), rather than silently ignored, which means we
> > have a much better chance of seeing when something is going wrong.
> >
> > Signed-off-by: David Gibson <address@hidden>
> > Reviewed-by: Michael S. Tsirkin <address@hidden>
> > ---
> > hw/virtio/virtio-balloon.c | 23 ++++++++++++++++++++++-
> > 1 file changed, 22 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index c3a19aa27d..4435905c87 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> > @@ -37,8 +37,29 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
> > MemoryRegion *mr, hwaddr offset)
> > {
> > void *addr = memory_region_get_ram_ptr(mr) + offset;
> > + RAMBlock *rb;
> > + size_t rb_page_size;
> > + ram_addr_t ram_offset;
> >
> > - qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
> > + /* XXX is there a better way to get to the RAMBlock than via a
> > + * host address? */
>
> We have qemu_get_ram_block(). That one should work as long as we know
> that it is a valid guest ram address. (would have to make it !static)
>
> Then we would only have to pass to this function the original ram_addr_t
> handed over by the guest (which looks somewhat cleaner to me than going
> via memory regions)
So, I didn't use that because it's a hwaddr, not a ram_addr_t that the
guest gives us. I think they have the same value for guest RAM
addresses, but I wasn't sure if it was safe to rely on that.
>
> > + rb = qemu_ram_block_from_host(addr, false, &ram_offset);
> > + rb_page_size = qemu_ram_pagesize(rb);
> > +
> > + /* Silently ignore hugepage RAM blocks */
> > + if (rb_page_size != getpagesize()) {
> > + return;
> > + }
> > +
> > + /* Silently ignore unaligned requests */
> > + if (ram_offset & (rb_page_size - 1)) {
> > + return;
> > + }
> > +
> > + ram_block_discard_range(rb, ram_offset, rb_page_size);
> > + /* We ignore errors from ram_block_discard_range(), because it has
> > + * already reported them, and failing to discard a balloon page is
> > + * not fatal */
> > }
> >
> > static const char *balloon_stat_names[] = {
> >
>
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature
- [Qemu-devel] [RFCv2 for-4.0 0/5] Improve balloon handling of pagesizes other than 4kiB, David Gibson, 2018/12/05
- [Qemu-devel] [RFCv2 for-4.0 4/5] virtio-balloon: Use ram_block_discard_range() instead of raw madvise(), David Gibson, 2018/12/05
- [Qemu-devel] [RFCv2 for-4.0 5/5] virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size, David Gibson, 2018/12/05
- [Qemu-devel] [RFCv2 for-4.0 3/5] virtio-balloon: Rework ballon_page() interface, David Gibson, 2018/12/05
- [Qemu-devel] [RFCv2 for-4.0 2/5] virtio-balloon: Corrections to address verification, David Gibson, 2018/12/05
- [Qemu-devel] [RFCv2 for-4.0 1/5] virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate, David Gibson, 2018/12/05