
Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip


From: Jan Kiszka
Subject: Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip
Date: Wed, 02 Feb 2011 17:51:32 +0100

On 2011-02-02 17:39, Gleb Natapov wrote:
> On Wed, Feb 02, 2011 at 05:36:53PM +0100, Jan Kiszka wrote:
>> On 2011-02-02 17:29, Gleb Natapov wrote:
>>> On Wed, Feb 02, 2011 at 04:52:11PM +0100, Jan Kiszka wrote:
>>>> On 2011-02-02 16:46, Gleb Natapov wrote:
>>>>> On Wed, Feb 02, 2011 at 04:35:25PM +0100, Jan Kiszka wrote:
>>>>>> On 2011-02-02 16:09, Avi Kivity wrote:
>>>>>>> On 02/02/2011 04:52 PM, Jan Kiszka wrote:
>>>>>>>> On 2011-02-02 15:43, Jan Kiszka wrote:
>>>>>>>>>  On 2011-02-02 15:35, Avi Kivity wrote:
>>>>>>>>>>  On 02/02/2011 04:30 PM, Jan Kiszka wrote:
>>>>>>>>>>>  On 2011-02-02 14:05, Avi Kivity wrote:
>>>>>>>>>>>>   On 02/02/2011 02:50 PM, Jan Kiszka wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    Oops, -smp 1. With -smp 2 it boots almost completely and then
>>>>>>>>>>>>>>    hangs.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   Ah, good (or not good). With Windows 2003 Server, I actually get a
>>>>>>>>>>>>>   Blue Screen (Stop 0x000000b8).
>>>>>>>>>>>>
>>>>>>>>>>>>   Userspace APIC is broken since it may run with an outdated cr8. Does
>>>>>>>>>>>>   reverting 27a4f7976d5 help?
>>>>>>>>>>>
>>>>>>>>>>>  Can you elaborate on what is broken? The way hw/apic.c maintains the
>>>>>>>>>>>  tpr? Would it make sense to compare this against the in-kernel model?
>>>>>>>>>>>  Or do you mean something else?
>>>>>>>>>>
>>>>>>>>>>  The problem, IIRC, was that we look up the TPR but it may already have
>>>>>>>>>>  been changed by the running vcpu.  Not 100% sure.
>>>>>>>>>>
>>>>>>>>>>  If that is indeed the problem then the fix would be to process the APIC
>>>>>>>>>>  in vcpu context (which is what the kernel does - we set a bit in the IRR
>>>>>>>>>>  and all further processing is synchronous).
>>>>>>>>>
>>>>>>>>>  You mean: user space changes the tpr value while the vcpu is in KVM_RUN,
>>>>>>>>>  then we return from the kernel and overwrite the tpr in the apic with
>>>>>>>>>  the vcpu's view, right?
>>>>>>>>
>>>>>>>> Hmm, probably rather that there is a discrepancy between tpr and irr.
>>>>>>>> The latter is changed asynchronously wrt the vcpu, the former wrt
>>>>>>>> the user space device model.
>>>>>>>
>>>>>>> And yet, both are synchronized via qemu_mutex.  So we're still missing 
>>>>>>> something in this picture.
>>>>>>>
>>>>>>>> Run apic_set_irq on the vcpu?
>>>>>>>
>>>>>>> static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
>>>>>>> {
>>>>>>>      apic_irq_delivered += !get_bit(s->irr, vector_num);
>>>>>>>
>>>>>>>      trace_apic_set_irq(apic_irq_delivered);
>>>>>>>
>>>>>>>      set_bit(s->irr, vector_num);
>>>>>>>
>>>>>>> This is even more async with kernel irqchip
>>>>>>>
>>>>>>>      if (trigger_mode)
>>>>>>>          set_bit(s->tmr, vector_num);
>>>>>>>      else
>>>>>>>          reset_bit(s->tmr, vector_num);
>>>>>>>
>>>>>>> This is protected by qemu_mutex
>>>>>>>
>>>>>>>      apic_update_irq(s);
>>>>>>>
>>>>>>> This will be run the next time the vcpu exits, via apic_get_interrupt().
>>>>>>
>>>>>> The decision to pend an IRQ (and potentially kick the vcpu) takes place
>>>>>> immediately in apic_update_irq. And it is based on current irr as well
>>>>>> as tpr. But we update again when user space returns with a new value.
>>>>>>
>>>>>>>
>>>>>>> }
>>>>>>>
>>>>>>> Did you check whether reverting that commit helps?
>>>>>>>
>>>>>>
>>>>>> Just did so, and I can no longer reproduce the problem. Hmm...
>>>>>>
>>>>> If there is no problem in the logic of this commit (and I do not see
>>>>> one yet) then do we somewhere miss kicking the vcpu when an interrupt
>>>>> that should be handled arrives?
>>>>
>>>> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
>>>> serializing. If the guest raises the tpr and then signals this to the
>>>> other vcpus with a subsequent, non-vm-exiting instruction, one of those
>>>> could inject an interrupt with a higher priority than the previous tpr,
>>>> but a lower one than the current tpr. QEMU user space would accept this
>>>> interrupt - and would likely surprise the guest. Am I missing something?
>>>>
>>> Injection happens by vcpu thread on cpu entry:
>>> run->request_interrupt_window = kvm_arch_try_push_interrupts(env);
>>> and tpr is synced on vcpu exit, so I do not yet see how what you describe
>>> above may happen since during injection vcpu should see correct tpr.
>>
>> Hmm, maybe this is the key: Once we call into apic_get_interrupt
>> (because CPU_INTERRUPT_HARD was set as described above) and we find a
>> pending irq below the tpr, we inject a spurious vector instead.
>>
> That should be easy to verify. I expect Windows to BSOD upon receiving
> spurious vector though.

I hacked spurious irq injection away, but the issue remains. At the same
time, Windows is receiving tons of spurious interrupts without any
complaints, even without that tpr optimization in the kernel. So this is
obviously not yet the key.
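
To make the path discussed above concrete, here is a minimal stand-alone
sketch of the priority check that produces a spurious vector. It is a toy
model under stated assumptions, not the hw/apic.c code: ToyApic, the
toy_apic_* helpers and TOY_SPURIOUS_VECTOR are made-up names, the spurious
vector value is fixed instead of being read from a register, and the
in-service register is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed value; a real APIC takes it from the
 * spurious-interrupt vector register. */
#define TOY_SPURIOUS_VECTOR 0xff

/* Toy APIC state: 256 IRR bits plus the task priority register. */
typedef struct {
    uint32_t irr[8];
    uint8_t tpr;
} ToyApic;

/* Mark a vector as pending; this is the part that may run asynchronously
 * to the vcpu. */
static void toy_apic_set_irq(ToyApic *s, int vector)
{
    assert(vector >= 0 && vector < 256);
    s->irr[vector >> 5] |= UINT32_C(1) << (vector & 31);
}

/* Return the highest pending vector, or -1 if the IRR is empty. */
static int toy_apic_highest_pending(const ToyApic *s)
{
    for (int v = 255; v >= 0; v--) {
        if (s->irr[v >> 5] & (UINT32_C(1) << (v & 31))) {
            return v;
        }
    }
    return -1;
}

/*
 * Called in vcpu context after an interrupt was flagged as pending.
 * If the tpr masks the pending vector by now, the spurious vector is
 * returned and the IRR bit stays set for a later delivery attempt.
 */
static int toy_apic_get_interrupt(ToyApic *s)
{
    int v = toy_apic_highest_pending(s);

    if (v < 0) {
        return -1;
    }
    /* Compare priority classes (upper four bits of vector and tpr). */
    if ((v >> 4) <= (s->tpr >> 4)) {
        return TOY_SPURIOUS_VECTOR;
    }
    s->irr[v >> 5] &= ~(UINT32_C(1) << (v & 31));
    return v;
}
```

In this model, with tpr = 0x80 and vector 0x61 pending,
toy_apic_get_interrupt() yields the spurious vector and leaves the IRR bit
set; after lowering the tpr to 0x50 the same call returns 0x61.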

Let's try your idea that we miss a wakeup.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


