Re: [Qemu-discuss] ppc and icount

From: Steven Seeger
Subject: Re: [Qemu-discuss] ppc and icount
Date: Wed, 10 Jan 2018 03:57:03 -0500

Sorry for another post. I ran a bisect and found the bad commit for 

044897ef4a22af89aecb8df509477beba0a2e0ce is the first bad commit
commit 044897ef4a22af89aecb8df509477beba0a2e0ce
Author: Richard Purdie <address@hidden>
Date:   Mon Dec 4 22:25:43 2017 +0000

    target/ppc: Fix system lockups caused by interrupt_request state 

    Occasionally in Linux guests on x86_64 we're seeing logs like:

    ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 

    when they should read:

    ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 

    The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls
    cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function
    just before the log message. Something is causing the HARD bit setting
    to get lost.

    The knock on effect of losing that bit is the decrementer timer interrupts
    don't get delivered which causes the guest to sit idle in its idle handler
    and 'hang'.

    The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB.
    Rather than poking directly into cs->interrupt_request, that code needs to:

    a) hold BQL
    b) use the cpu_interrupt() helper

    This patch fixes the call sites to do this, fixing the hang. The calls
    are made from a variety of contexts so a helper function is added to take
    care of the necessary locking. This can likely be improved and optimised in
    the future but it ensures the code is correct and doesn't lockup as it
    stands today.

    Signed-off-by: Richard Purdie <address@hidden>
    Signed-off-by: David Gibson <address@hidden>
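
For anyone following along, here is a rough, self-contained toy model of the race that commit message describes. It is not QEMU code: ToyCPU, toy_bql and every toy_* function are made-up stand-ins, and only the two bit values (0x2 for HARD, 0x4 for EXITTB) come from the commit message. The first replay hard-codes one unlucky interleaving of an unlocked read-modify-write; the second goes through a helper that takes the lock when the caller does not already hold it, which is my reading of points a) and b).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as quoted in the commit message. */
#define CPU_INTERRUPT_HARD   0x0002u
#define CPU_INTERRUPT_EXITTB 0x0004u

/* Hypothetical stand-in for CPUState: only the field under discussion. */
typedef struct ToyCPU {
    uint32_t interrupt_request;
} ToyCPU;

/* Hypothetical stand-in for the BQL, plus a per-thread "held" flag. */
static pthread_mutex_t toy_bql = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool toy_bql_held;

static void toy_bql_lock(void)
{
    pthread_mutex_lock(&toy_bql);
    toy_bql_held = true;
}

static void toy_bql_unlock(void)
{
    toy_bql_held = false;
    pthread_mutex_unlock(&toy_bql);
}

/* Toy model of a cpu_interrupt()-style helper: caller must hold the lock. */
static void toy_cpu_interrupt(ToyCPU *cpu, uint32_t mask)
{
    assert(toy_bql_held);
    cpu->interrupt_request |= mask;
}

/* Helper usable from any context: takes the lock only if it is not held. */
void toy_interrupt_exittb(ToyCPU *cpu)
{
    if (toy_bql_held) {
        toy_cpu_interrupt(cpu, CPU_INTERRUPT_EXITTB);
    } else {
        toy_bql_lock();
        toy_cpu_interrupt(cpu, CPU_INTERRUPT_EXITTB);
        toy_bql_unlock();
    }
}

/* Replay of the lost update: a stale copy is stored over the HARD bit. */
uint32_t toy_replay_lost_update(void)
{
    ToyCPU cpu = { 0 };
    uint32_t stale = cpu.interrupt_request;       /* unlocked writer: load */

    toy_bql_lock();
    toy_cpu_interrupt(&cpu, CPU_INTERRUPT_HARD);  /* IRQ path sets HARD */
    toy_bql_unlock();

    cpu.interrupt_request = stale | CPU_INTERRUPT_EXITTB; /* stale store */
    return cpu.interrupt_request;
}

/* Same sequence of events, but the EXITTB update goes through the helper. */
uint32_t toy_replay_fixed(void)
{
    ToyCPU cpu = { 0 };

    toy_bql_lock();
    toy_cpu_interrupt(&cpu, CPU_INTERRUPT_HARD);
    toy_bql_unlock();

    toy_interrupt_exittb(&cpu);
    return cpu.interrupt_request;
}
```

In this model toy_replay_lost_update() returns 0x4 (only EXITTB survives, which matches the "00000004" mentioned in the commit message), while toy_replay_fixed() returns 0x6 with both bits set.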

