Re: Using All Cores of CPU on Snapdragon Processor during x86-to-ARM User Space Emulation

qemu-discuss


From:	Jakob Bohm
Subject:	Re: Using All Cores of CPU on Snapdragon Processor during x86-to-ARM User Space Emulation
Date:	Thu, 14 May 2020 12:31:15 +0200
User-agent:	Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0

On 13/05/2020 12:02, Alex Bennée wrote:

Vijay Daita <address@hidden> writes:

Hello

It is my understanding that one would be unable to do x86-to-ARM user space
emulation while utilizing all cores because of x86 barriers.

Actually the utilisation of multiple cores (often referred to as MTTCG)
is a function of system emulation and you are correct for x86-on-ARM we
don't enable MTTCG because we don't currently add barrier instructions
to fully emulate the x86 memory model. However for linux-user we have
always followed the guest threading model because the guest clone() is
passed down to the host. However because the memory modelling isn't
perfect you can run into problems because of the mismatch.

I wanted to
know if there is a difference between what QEMU aims to do and using an
interpreter of sorts to convert x86 instructions directly to ARM
instructions so that when run on the system directly, the system can
decide, itself, how to apportion the task.

This is what the TCG does - it translates guest instructions into groups
of host instructions. We could insert the extra barriers for all loads
and stores but the effect would be to cripple performance. In an ideal
world we would only do these for the load/store instructions involved in
inter-thread synchronisation operations but that's a fairly tricky
problem to solve.

Especially because the x86 memory model traditionally has
barrier/synchronization instructions automatically push through their
ordering to all other cores/CPUs, and as a result, "barrier load" wasn't
really a thing until CMPXCHG was introduced in a later CPU generation
than the basic sync instructions (8086) and cache coherency mechanisms
(80486). In fact, the LOCK barrier prefix triggers a #UD exception with
most load instructions.
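
To make the CMPXCHG point concrete, here is a minimal sketch in C11 of a
compare-and-swap loop. The helper name cas_fetch_add is purely
illustrative (it is not a QEMU function). On x86, compilers typically
implement this kind of operation with LOCK CMPXCHG; on AArch64 it usually
becomes either an exclusive load/store retry loop or a CAS instruction,
depending on the target options.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically add `delta` to *counter using a
 * compare-and-swap loop. Returns the value seen before the addition. */
int cas_fetch_add(atomic_int *counter, int delta)
{
    int seen = atomic_load_explicit(counter, memory_order_relaxed);

    /* If another thread changed *counter in the meantime, the exchange
     * fails, `seen` is refreshed with the current value, and we retry. */
    while (!atomic_compare_exchange_weak_explicit(
               counter, &seen, seen + delta,
               memory_order_seq_cst,     /* ordering on success */
               memory_order_relaxed)) {  /* ordering on failure */
    }
    return seen;
}
```

Calling cas_fetch_add(&c, 3) on a counter holding 5 returns 5 and leaves
8 in the counter. The read-modify-write is where the ordering guarantee
lives, which matches LOCK only being accepted on instructions that both
read and write memory.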

The one exception to this lack was instruction decoding, where certain
commonly used branch instructions were defined as implicitly picking up
any changes in instruction memory. This of course corresponds to the TCG
checking for needed retranslation of buffers at those points.

Additionally, x86 barriers generally guarantee total ordering, relative
to the barrier operation, of all memory accesses that occur before or
after it in program order, which some other CPU families do not.
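
As a rough illustration of the mismatch discussed above, the following
C11 sketch shows the classic "message passing" pattern. All names here
(payload, ready, run_message_passing) are made up for the example. On
x86 hardware two ordinary stores are seen in program order by other
cores, so compiled x86 code often gets away with plain MOVs for this; on
ARM the same pattern needs release/acquire ordering (or explicit
barriers), which is the sort of thing a translator would have to add.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* ordinary, non-atomic data */
static atomic_int ready;   /* flag that publishes the payload */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release: the payload write cannot be reordered after this store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;
    /* Acquire: the payload read cannot be reordered before this load. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* spin until the producer publishes */
    }
    *out = payload;
    return NULL;
}

/* Runs one producer/consumer pair. Returns the payload the consumer
 * observed, or -1 if a thread could not be started. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = -1;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&c, NULL, consumer, &seen) != 0) {
        return -1;
    }
    if (pthread_create(&p, NULL, producer, NULL) != 0) {
        atomic_store(&ready, 1);   /* unblock the consumer before bailing out */
        pthread_join(c, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With the release/acquire pair in place the consumer is guaranteed to
read 42. If both accesses to the flag were relaxed, a weakly ordered
host would be allowed to let the consumer see the flag set and still
read a stale payload.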

I am new to this, so sorry if
this doesn't make very much sense.

Thank you



Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
Transformervej 29, 2860 Soborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded


Current Thread


Using All Cores of CPU on Snapdragon Processor during x86-to-ARM User Space Emulation, Vijay Daita, 2020/05/11
- Re: Using All Cores of CPU on Snapdragon Processor during x86-to-ARM User Space Emulation, Alex Bennée, 2020/05/13
  - Re: Using All Cores of CPU on Snapdragon Processor during x86-to-ARM User Space Emulation, Jakob Bohm <=
    - Re: Using All Cores of CPU on Snapdragon Processor during x86-to-ARM User Space Emulation, Peter Maydell, 2020/05/14
