Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VM


From: David Gibson
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] ppc: add CPU IRQ state to PPC VMStateDescription
Date: Thu, 14 Sep 2017 13:30:22 +1000
User-agent: Mutt/1.8.3 (2017-05-23)

On Wed, Sep 13, 2017 at 05:44:54PM +0100, Mark Cave-Ayland wrote:
> On 13/09/17 07:02, David Gibson wrote:
> 
> >>> Alexey - do you recall from your analysis why these fields were no
> >>> longer deemed necessary, and how your TCG tests were configured?
> >>
> >> I most certainly did not do analysis (my bad. sorry) - I took the patch
> >> from David as he left the team, fixed to compile and pushed away. I am also
> >> very suspicious that we did not try migrating TCG or anything but pseries. My
> >> guess is that things did not break (if they did not, which I am not sure about,
> >> for the TCG case) because the interrupt controller (XICS) or the
> >> pseries-guest took care of resending an interrupt which does not seem to be
> >> the case for mac99.
> > 
> > Right, that's probably true.  The main point, though, is that these
> > fields were dropped a *long* time ago, when migration was barely
> > working to begin with.  In particular I'm pretty sure most of the
> > non-pseries platforms were already pretty broken for migration
> > (amongst other things).
> > 
> > Polishing the mac platforms up to working again, including migration,
> > is a reasonable goal.  But it can't be at the expense of pseries,
> > which is already working, used in production, and much better tested
> > than mac99 or g3beige ever were.
> 
> Oh I completely agree since I'm well aware pseries likely has more users
> than the Mac machines - my question was directed more at why we
> support backwards migration.
> 
> I spent several hours yesterday poking my Darwin test case with trying
> the different combinations of pending_interrupts, irq_input_state and
> access_type and could easily provoke migration failures unless all 3 of
> the fields were present so a practical test shows they are still
> required for TCG migration. I think ppc_set_irq()'s use of the interrupt
> fields in hw/ppc/ppc.c and the subsequent reference to pending
> interrupts in target/ppc may explain why I see freezes/hangs until a key
> is pressed in many cases.

Ok, I think we need to consider (pending_interrupts and irq_input_state)
separately from access_type.  The first two are pretty closely related
to each other, and I've got at least a rough idea of what the problems
there might be.  access_type I'm pretty sure has to be an unrelated
problem, and I've got much less of a handle on it.

I suspect we could work around the problems with pending_interrupts
and irq_input_state by having a post_load hook in the board level
interrupt controller to reassert its output irq line based on its
current state.  I believe the relevant irq inputs to the cpu are
effectively level triggered, so I think that will be enough.
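
To make the post_load idea concrete, here is a toy sketch in plain C.
It is not QEMU code: ToyCPU, ToyPIC and every function name are
hypothetical, and the only thing borrowed from QEMU is the shape of
the post_load callback (opaque pointer plus version id, returning 0 on
success).  It assumes the controller's own pending/mask state is
migrated while the CPU-side input level is not, and that the CPU input
is level triggered, so recomputing the line from controller state is
enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the CPU: only the external interrupt input level. */
typedef struct ToyCPU {
    bool ext_irq_level;   /* assumed NOT to be part of the migration stream */
} ToyCPU;

/* Toy stand-in for a board-level interrupt controller. */
typedef struct ToyPIC {
    uint32_t pending;     /* migrated: sources currently asserted */
    uint32_t mask;        /* migrated: sources that are enabled */
    ToyCPU *cpu;          /* wiring, rebuilt on the destination */
} ToyPIC;

/* Drive the output line purely from the controller's current state. */
static void toy_pic_update(ToyPIC *pic)
{
    pic->cpu->ext_irq_level = (pic->pending & pic->mask) != 0;
}

/* Same shape as a post_load hook: runs after the state has been loaded. */
static int toy_pic_post_load(void *opaque, int version_id)
{
    ToyPIC *pic = opaque;

    (void)version_id;     /* unused in this sketch */
    toy_pic_update(pic);  /* reassert (or deassert) the output line */
    return 0;
}

/*
 * Simulate the destination side of a migration: only pending/mask
 * arrive, the CPU input starts low, and post_load restores the level.
 */
static bool toy_migrate_and_check(uint32_t pending, uint32_t mask)
{
    ToyCPU dst_cpu = { .ext_irq_level = false };
    ToyPIC dst_pic = { .pending = pending, .mask = mask, .cpu = &dst_cpu };
    int ret = toy_pic_post_load(&dst_pic, 1);

    assert(ret == 0);
    return dst_cpu.ext_irq_level;
}
```

Calling toy_migrate_and_check(0x4, 0x4) models a source that was
pending and enabled when the state was saved: the destination CPU
input comes back high without the CPU-side field ever being
transferred.  This only works if the line really is level triggered;
an edge that was delivered and consumed before the save cannot be
recovered this way.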

access_type I don't have any good ideas for yet.  We really need to
work out what the exact race is here that's causing its state to be
lost harmfully.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
