From: Igor Mammedov
Subject: Re: [Qemu-devel] QEMU terminates during reboot after memory unplug with vhost=on
Date: Thu, 14 Sep 2017 10:59:05 +0200
On Thu, 14 Sep 2017 13:48:26 +0530
Bharata B Rao <address@hidden> wrote:
> On Thu, Sep 14, 2017 at 10:00:11AM +0200, Igor Mammedov wrote:
> > On Thu, 14 Sep 2017 12:31:18 +0530
> > Bharata B Rao <address@hidden> wrote:
> >
> > > Hi,
> > >
> > > QEMU hits the below assert
> > >
> > > qemu-system-ppc64: used ring relocated for ring 2
> > > qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion `r >= 0' failed.
> > >
> > > in the following scenario:
> > >
> > > 1. Boot guest with vhost=on
> > > -netdev tap,id=mynet0,script=qemu-ifup,downscript=qemu-ifdown,vhost=on
> > > -device virtio-net-pci,netdev=mynet0
> > > 2. Hot add a DIMM device
> > > 3. Reboot
> > > When the guest reboots, we can see
> > > vhost_virtqueue_start:vq->used_phys getting assigned an address that
> > > falls in the hotplugged memory range.
> > > 4. Remove the DIMM device
> > > Guest refuses the removal as the hotplugged memory is under use.
> > > 5. Reboot
> >
> > > QEMU forces the removal of the DIMM device during reset and that's
> > > when we hit the above assert.
> > I don't recall implementing forced removal of DIMMs;
> > could you point me to the related code, please?
>
> This is ppc specific. We have DR Connector objects for each LMB (multiple
> LMBs make up one DIMM device) and during reset we invoke the
> release routine for these LMBs which will further invoke
> pc_dimm_memory_unplug().
>
> See hw/ppc/spapr_drc.c: spapr_drc_reset()
> hw/ppc/spapr.c: spapr_lmb_release()
>
> >
> > > Any pointers on why we are hitting this assert ? Shouldn't vhost be
> > > done with using the hotplugged memory when we hit reset ?
> >
> > From another point of view,
> > a DIMM shouldn't be removed unless the guest explicitly ejects it
> > (at least that should be so in the x86 case).
>
> While that is true for ppc also, shouldn't we start fresh from reset ?
We should.
When it aborts, vhost should print an error from vhost_verify_ring_mappings():

    if (r == -ENOMEM) {
        error_report("Unable to map %s for ring %d", part_name[j], i);
    } else if (r == -EBUSY) {
        error_report("%s relocated for ring %d", part_name[j], i);
    }

That might give a clue as to where that memory is stuck.
Michael might be able to point out where to start looking, but he's on
vacation, so ...
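
To make the two error cases concrete, here is a toy model of the kind of check involved. This is only a sketch, not the QEMU code: struct toy_section, toy_map() and toy_verify_ring_part() are hypothetical names made up for illustration, and how the real function orders its checks is not shown in this thread. The idea is that vhost remembers the host pointer it got when the ring was started, and on a memory layout change it re-translates the ring's guest-physical address and compares.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one guest memory section. */
struct toy_section {
    uint64_t gpa;   /* guest-physical start address */
    uint64_t size;  /* length in bytes */
    uint8_t *hva;   /* host buffer backing it; NULL once unplugged */
};

/* Translate a guest-physical range to a host pointer.
 * Returns NULL if no single backed section covers the whole range. */
static void *toy_map(const struct toy_section *s, size_t n,
                     uint64_t gpa, uint64_t len)
{
    assert(len > 0);
    for (size_t i = 0; i < n; i++) {
        if (s[i].hva && gpa >= s[i].gpa && len <= s[i].size &&
            gpa - s[i].gpa <= s[i].size - len) {
            return s[i].hva + (gpa - s[i].gpa);
        }
    }
    return NULL;
}

/* 'part' is the host pointer recorded when the ring was started.
 * Returns 0 if the ring still maps to the same place, -ENOMEM if it
 * cannot be mapped at all, -EBUSY if it now maps somewhere else. */
static int toy_verify_ring_part(const struct toy_section *s, size_t n,
                                void *part, uint64_t part_gpa,
                                uint64_t part_len)
{
    void *p = toy_map(s, n, part_gpa, part_len);

    if (!p) {
        return -ENOMEM;   /* "Unable to map ..." */
    }
    if (p != part) {
        return -EBUSY;    /* "... relocated for ring ..." */
    }
    return 0;
}
```

In this model, a used ring that the guest placed in hotplugged memory stops verifying as soon as the section backing it changes or goes away. Which of the two errors a removed DIMM produces in real QEMU depends on how the address resolves after removal; the report above shows the "relocated" one.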
> Related comment from hw/ppc/spapr_drc.c: spapr_drc_reset()
>
> /* immediately upon reset we can safely assume DRCs whose devices
> * are pending removal can be safely removed.
> */
> if (drc->unplug_requested) {
> spapr_drc_release(drc);
> }
>
> So essentially in the scenario I listed, the unplug request is rejected
> by the guest, but during the next reboot, QEMU exercises the above code
> and removes any devices (memory, CPU etc) that are marked for removal.
I'd drop the removal of pending devices on reset for DIMM/CPU, if that's
acceptable from the PPC HW point of view.
>
> Regards,
> Bharata.
>
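
For illustration, the suggestion for DIMM/CPU could look roughly like the toy model below. It is a sketch under assumptions, not a patch against hw/ppc/spapr_drc.c: struct toy_drc, enum toy_drc_type, toy_drc_release() and toy_drc_reset() are hypothetical names, and only the unplug_requested flag is taken from the snippet quoted above. The point is that on reset a pending request for memory or CPU is cancelled and the device stays, while other connector types keep the current release-on-reset behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connector types for this toy model. */
enum toy_drc_type { TOY_DRC_LMB, TOY_DRC_CPU, TOY_DRC_PCI };

struct toy_drc {
    enum toy_drc_type type;
    bool unplug_requested;  /* guest was asked to release the device */
    bool attached;          /* device is still present */
};

/* Stand-in for the release routine: the device goes away. */
static void toy_drc_release(struct toy_drc *drc)
{
    drc->attached = false;
    drc->unplug_requested = false;
}

static void toy_drc_reset(struct toy_drc *drc)
{
    assert(drc != NULL);
    if (!drc->unplug_requested) {
        return;
    }
    if (drc->type == TOY_DRC_LMB || drc->type == TOY_DRC_CPU) {
        /* Assumed new behaviour: forget the request, keep the device. */
        drc->unplug_requested = false;
    } else {
        /* Existing behaviour: pending removals complete on reset. */
        toy_drc_release(drc);
    }
}
```

With this, an LMB whose removal the guest rejected is still attached after reset, so nothing that was placed in that memory before the reboot is pulled out from under vhost.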