qemu-arm
Re: [PATCH v2 2/2] accel: kvm: Add aligment assert for kvm_log_clear_one


From: Peter Xu
Subject: Re: [PATCH v2 2/2] accel: kvm: Add aligment assert for kvm_log_clear_one_slot
Date: Tue, 9 Mar 2021 11:08:14 -0500

On Tue, Mar 09, 2021 at 02:57:53PM +0000, Dr. David Alan Gilbert wrote:
> * Thomas Huth (thuth@redhat.com) wrote:
> > On 09/03/2021 15.05, Keqian Zhu wrote:
> > > 
> > > 
> > > On 2021/3/9 21:48, Thomas Huth wrote:
> > > > On 17/12/2020 02.49, Keqian Zhu wrote:
> > > > > The parameters start and size are transfered from QEMU memory
> > > > > emulation layer. It can promise that they are TARGET_PAGE_SIZE
> > > > > aligned. However, KVM needs they are qemu_real_page_size aligned.
> > > > > 
> > > > > Though no caller breaks this aligned requirement currently, we'd
> > > > > better add an explicit assert to avoid future breaking.
> > > > > 
> > > > > Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> > > > > ---
> > > > >    accel/kvm/kvm-all.c | 7 +++++++
> > > > >    1 file changed, 7 insertions(+)
> > > > > 
> > > > > ---
> > > > > v2
> > > > >    - Address Andrew's commment (Use assert instead of return err).
> > > > > 
> > > > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > > > > index f6b16a8df8..73b195cc41 100644
> > > > > --- a/accel/kvm/kvm-all.c
> > > > > +++ b/accel/kvm/kvm-all.c
> > > > > @@ -692,6 +692,10 @@ out:
> > > > >    #define KVM_CLEAR_LOG_ALIGN  (qemu_real_host_page_size << KVM_CLEAR_LOG_SHIFT)
> > > > >    #define KVM_CLEAR_LOG_MASK   (-KVM_CLEAR_LOG_ALIGN)
> > > > >    +/*
> > > > > + * As the granule of kvm dirty log is qemu_real_host_page_size,
> > > > > + * @start and @size are expected and restricted to align to it.
> > > > > + */
> > > > >    static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start,
> > > > >                                      uint64_t size)
> > > > >    {
> > > > > @@ -701,6 +705,9 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start,
> > > > >        unsigned long *bmap_clear = NULL, psize = qemu_real_host_page_size;
> > > > >        int ret;
> > > > >    +    /* Make sure start and size are qemu_real_host_page_size aligned */
> > > > > +    assert(QEMU_IS_ALIGNED(start | size, psize));
> > > > 
> > > > Sorry, but that was a bad idea: It triggers and kills my Centos 6 VM:
> > > > 
> > > > $ qemu-system-x86_64 -accel kvm -hda ~/virt/images/centos6.qcow2 -m 1G
> > > > qemu-system-x86_64: ../../devel/qemu/accel/kvm/kvm-all.c:690: kvm_log_clear_one_slot: Assertion `QEMU_IS_ALIGNED(start | size, psize)' failed.
> > > > Aborted (core dumped)
> > > Hi Thomas,
> > > 
> > > I think this patch is ok, maybe it trigger a potential bug?
> > 
> > Well, sure, there is either a bug somewhere else or in this new code. But 
> > it's certainly not normal that the assert() triggers, is it?
> > 
> > FWIW, here's a backtrace:
> > 
> > #0  0x00007ffff2c1584f in raise () at /lib64/libc.so.6
> > #1  0x00007ffff2bffc45 in abort () at /lib64/libc.so.6
> > #2  0x00007ffff2bffb19 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
> > #3  0x00007ffff2c0de36 in .annobin_assert.c_end () at /lib64/libc.so.6
> > #4  0x0000555555ba25f3 in kvm_log_clear_one_slot
> >     (size=6910080, start=0, as_id=0, mem=0x555556e1ee00)
> >     at ../../devel/qemu/accel/kvm/kvm-all.c:691
> > #5  0x0000555555ba25f3 in kvm_physical_log_clear
> >     (section=0x7fffffffd0b0, section=0x7fffffffd0b0, kml=0x555556dbaac0)
> >     at ../../devel/qemu/accel/kvm/kvm-all.c:843
> > #6  0x0000555555ba25f3 in kvm_log_clear (listener=0x555556dbaac0, section=0x7fffffffd0b0)
> >     at ../../devel/qemu/accel/kvm/kvm-all.c:1253
> > #7  0x0000555555b023d8 in memory_region_clear_dirty_bitmap
> >     (mr=mr@entry=0x5555573394c0, start=start@entry=0, len=len@entry=6910080)
> >     at ../../devel/qemu/softmmu/memory.c:2132
> > #8  0x0000555555b313d9 in cpu_physical_memory_snapshot_and_clear_dirty
> >     (mr=mr@entry=0x5555573394c0, offset=offset@entry=0, length=length@entry=6910080, client=client@entry=0)
> >     at ../../devel/qemu/softmmu/physmem.c:1109
> > #9  0x0000555555b02483 in memory_region_snapshot_and_clear_dirty
> >     (mr=mr@entry=0x5555573394c0, addr=addr@entry=0, size=size@entry=6910080, client=client@entry=0)
> >     at ../../devel/qemu/softmmu/memory.c:2146
> 
> Could you please figure out which memory region this is?
> WTH is that size? Is that really the problem that the size is just
> crazy?

It seems vga_draw_graphic() can call memory_region_snapshot_and_clear_dirty()
with a size that is not page-aligned.  cpu_physical_memory_snapshot_and_clear_dirty()
takes care of most of the alignment, but the "length" parameter is still
passed on without any alignment check.
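
To make the failure concrete, here is a minimal standalone sketch of the check
that the assert performs, applied to the values from the backtrace. The helper
names are made up for illustration (they only stand in for QEMU_IS_ALIGNED()),
and the 4 KiB host page size is an assumption about the x86_64 host:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU_IS_ALIGNED(); align must be a power of two. */
static bool is_aligned(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value & (align - 1)) == 0;
}

/*
 * Mirrors the check in the patch: (start | size) is aligned only if both
 * start and size are aligned to the host page size.
 */
static bool range_is_aligned(uint64_t start, uint64_t size, uint64_t psize)
{
    return is_aligned(start | size, psize);
}
```

With start=0, size=6910080 and an assumed psize of 4096, the size leaves a
remainder of 128 bytes, so range_is_aligned() returns false, which is
consistent with the assertion firing in frame #4.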

Cc Gerd too.

I'm not sure how many use cases there are like this.  If there are a lot, maybe
we can indeed drop this assert patch and instead, in kvm_log_clear_one_slot(),
ALIGN_DOWN the size to the smallest host page size.  Say, if we need to clear
the dirty bits for range (0, 0x1020), we should only clear (0, 0x1000), since
there can still be dirty data in range (0x1020, 0x2000).
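
A rough sketch of that idea, not a proposed patch: the helper names below are
hypothetical (align_down() only stands in for QEMU's ALIGN_DOWN macro), the
page size is passed in rather than read from qemu_real_host_page_size, and
start is assumed to be host-page aligned already:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ALIGN_DOWN(); align must be a power of two. */
static uint64_t align_down(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return value & ~(align - 1);
}

/*
 * Number of bytes, counted from start, that cover only whole host pages.
 * A trailing partial page is left out so that its dirty bit survives.
 * Assumes start itself is aligned to psize.
 */
static uint64_t clearable_size(uint64_t start, uint64_t size, uint64_t psize)
{
    assert((start & (psize - 1)) == 0);
    return align_down(start + size, psize) - start;
}
```

For the range above, clearable_size(0, 0x1020, 0x1000) gives 0x1000, so the
page that holds (0x1000, 0x1020) is not cleared and stays marked dirty.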

Thanks,

> 
> Dave
> 
> > #10 0x0000555555babe99 in vga_draw_graphic (full_update=0, s=0x5555573394b0)
> >     at ../../devel/qemu/hw/display/vga.c:1661
> > #11 0x0000555555babe99 in vga_update_display (opaque=0x5555573394b0)
> >     at ../../devel/qemu/hw/display/vga.c:1784
> > #12 0x0000555555babe99 in vga_update_display (opaque=0x5555573394b0)
> >     at ../../devel/qemu/hw/display/vga.c:1757
> > #13 0x00005555558ddd32 in graphic_hw_update (con=0x555556a11800)
> >     at ../../devel/qemu/ui/console.c:279
> > #14 0x00005555558dccd2 in dpy_refresh (s=0x555556c17da0)
> >     at ../../devel/qemu/ui/console.c:1742
> > #15 0x00005555558dccd2 in gui_update (opaque=opaque@entry=0x555556c17da0)
> >     at ../../devel/qemu/ui/console.c:209
> > #16 0x0000555555dbd520 in timerlist_run_timers (timer_list=0x555556937c50)
> >     at ../../devel/qemu/util/qemu-timer.c:574
> > #17 0x0000555555dbd520 in timerlist_run_timers (timer_list=0x555556937c50)
> >     at ../../devel/qemu/util/qemu-timer.c:499
> > #18 0x0000555555dbd74a in qemu_clock_run_timers (type=<optimized out>)
> >     at ../../devel/qemu/util/qemu-timer.c:670
> > #19 0x0000555555dbd74a in qemu_clock_run_all_timers ()
> >     at ../../devel/qemu/util/qemu-timer.c:670
> > 
> > Looks like something in the vga code calls this with size=6910080
> > and thus triggers the alignment assertion?
> > 
> >  Thomas
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

-- 
Peter Xu
