Re: [Qemu-devel] Help on TLB Flush

From:	Mark Burton
Subject:	Re: [Qemu-devel] Help on TLB Flush
Date:	Thu, 12 Feb 2015 16:11:23 +0100

OK - Alex - your implication is that it has to be atomic, we need the sync…

        :-(

I have a horrid feeling that the atomicity of global flush can’t be causing the 
(almost, but not quite reproducible) errors we’re seeing - but… anyway ;-)

Cheers

Mark.

> On 12 Feb 2015, at 15:45, Alexander Graf <address@hidden> wrote:
> 
> 
>> On 12.02.2015, at 15:35, Mark Burton <address@hidden> wrote:
>> 
>> 
>> TLB Flush:
>> 
>> We have spent a few days on this issue, and still haven’t resolved the best 
>> path.
>> 
>> Our solution seems to work, most of the time, but we still have some strange 
>> issues - so I want to check that what we are proposing has a chance of 
>> working.
>> 
>> 
>> Our plan is to allow all CPUs to continue. Potentially one CPU will want to 
>> write to the TLBs. Subsequent to the write, it requests a TLB flush.
> 
> Local or global? For local TLB flushes you don't notify the other CPUs at 
> all. For global ones, the semantics of the call usually dictate atomicity.
> 
>> We are proposing to implement this by signalling all other CPUs to exit 
>> (and requesting they flush before re-starting). In other words, this would 
>> happen asynchronously.
> 
> For global flushes, give them a pointer payload along with the flush request 
> and tell all CPUs to increment it atomically. In your main thread, wait until 
> *ptr == nKickedCpus.
> 
> FWIW TLBs are always CPU local. When there's a "global TLB flush" 
> instruction, it pretty much does stall the CPU, notifies the others to also 
> flush their TLBs, waits and then continues.
> 
> If this really does become a performance bottleneck (which I doubt, since 
> almost nobody except x86 does global flushes), you can also do some nasty 
> hacky tricks, such as (atomically) changing the valid bit in remote CPUs' TLB 
> entries. But really only do this as a last resort if the clean version 
> doesn't perform well.
> 
> 
> Alex
> 
>> This means there is a theoretical period of time when one CPU is writing 
>> to the TLBs while other CPUs are executing. Our belief is that this has to 
>> be handled by software anyway, and this should not be an issue from Qemu’s 
>> point of view. 
>> The alternative would be to force all other CPUs to exit before writing the 
>> TLBs - this is both expensive and very painful to organise (as we get into 
>> horrid deadlocks whichever way we turn)…
>> 
>> We’d appreciate some thoughts on this...
>> 
>> Cheers
>> 
>> Mark.
>> 
>> 
>> 
>>       +44 (0)20 7100 3485 x 210
>> +33 (0)5 33 52 01 77x 210
>> 
>>      +33 (0)603762104
>>      mark.burton
>> 
>> 
> 
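
For reference, the counter-based global flush Alex describes above could look roughly like the following. This is only a minimal sketch in C with pthreads and C11 atomics, not QEMU code: the VCpu struct, the flush_ack field, tlb_flush_local(), tlb_flush_global() and the demo helper are all hypothetical names made up for illustration, and the "TLB" is just an array of valid flags. It also assumes that only one global flush is in flight at a time and that the requester is not itself one of the kicked threads.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CPUS 8
#define TLB_SIZE 16

/* Hypothetical per-vCPU state; these names are not QEMU's. */
typedef struct {
    int tlb_valid[TLB_SIZE];          /* stand-in for real TLB entries */
    _Atomic(atomic_int *) flush_ack;  /* non-NULL: flush, then ack here */
    atomic_bool stop;
    pthread_t thread;
} VCpu;

static void tlb_flush_local(VCpu *cpu)
{
    memset(cpu->tlb_valid, 0, sizeof(cpu->tlb_valid));
}

/* vCPU thread: service a pending flush request between "guest" steps. */
static void *vcpu_run(void *opaque)
{
    VCpu *cpu = opaque;
    while (!atomic_load(&cpu->stop)) {
        atomic_int *ack = atomic_exchange(&cpu->flush_ack, NULL);
        if (ack) {
            tlb_flush_local(cpu);
            atomic_fetch_add(ack, 1);  /* tell the requester we are done */
        }
        sched_yield();                 /* guest code would execute here */
    }
    return NULL;
}

/* Requester: flush locally, kick everyone else, wait for all acks. */
static void tlb_flush_global(VCpu *cpus, int n_cpus, int self)
{
    atomic_int acks;                   /* the "pointer payload" */
    int n_kicked = 0;

    atomic_init(&acks, 0);
    tlb_flush_local(&cpus[self]);
    for (int i = 0; i < n_cpus; i++) {
        if (i != self) {
            atomic_store(&cpus[i].flush_ack, &acks);
            n_kicked++;
        }
    }
    while (atomic_load(&acks) != n_kicked) {
        sched_yield();                 /* wait until *ptr == nKickedCpus */
    }
}

/* Demo: the caller acts as CPU 0; returns valid entries left anywhere. */
static int demo_entries_left_after_flush(int n_cpus)
{
    static VCpu cpus[MAX_CPUS];
    int left = 0;

    assert(n_cpus >= 1 && n_cpus <= MAX_CPUS);
    for (int i = 0; i < n_cpus; i++) {
        for (int e = 0; e < TLB_SIZE; e++) {
            cpus[i].tlb_valid[e] = 1;
        }
        atomic_init(&cpus[i].flush_ack, NULL);
        atomic_init(&cpus[i].stop, false);
    }
    for (int i = 1; i < n_cpus; i++) {
        pthread_create(&cpus[i].thread, NULL, vcpu_run, &cpus[i]);
    }

    tlb_flush_global(cpus, n_cpus, 0); /* returns only after every ack */
    for (int i = 0; i < n_cpus; i++) {
        for (int e = 0; e < TLB_SIZE; e++) {
            left += cpus[i].tlb_valid[e];
        }
    }

    for (int i = 1; i < n_cpus; i++) {
        atomic_store(&cpus[i].stop, true);
        pthread_join(cpus[i].thread, NULL);
    }
    return left;
}
```

Calling demo_entries_left_after_flush(4) should return 0, since the requester only continues once every kicked CPU has flushed and incremented the counter. If two CPUs could request a global flush at the same time, the wait loop would also have to service incoming requests (and the single flush_ack slot would no longer be enough), which is probably where the deadlocks mentioned above come from.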


