Re: An issue with x86 tcg and MMIO


From: Jonathan Cameron
Subject: Re: An issue with x86 tcg and MMIO
Date: Thu, 2 Feb 2023 09:39:11 +0000

On Wed, 1 Feb 2023 11:50:50 -1000
Richard Henderson <richard.henderson@linaro.org> wrote:

> On 2/1/23 05:24, Jørgen Hansen wrote:
> > Hello Richard,
> > 
> > We are using x86 qemu to test some CXL stuff, and in that process we are
> > running into an issue with tcg. In qemu, CXL memory is mapped as an MMIO
> > region, so when using CXL memory as part of the system memory,
> > application code and data can be spread out across a combination of DRAM
> > and CXL memory (we are using the Linux tiered memory numa balancing,
> > that will migrate individual pages between DRAM and CXL memory based on
> > access patterns). When we are running memory intensive applications, we
> > hit the following assert in translator_access:
> > 
> >                /* We cannot handle MMIO as second page. */
> >                assert(tb->page_addr[1] != -1);
> > 
> > introduced in your commit 50627f1b. This is using existing applications
> > and standard Linux. We discussed this with Alistair Francis and he
> > mentioned that it looks like a load across a page boundary is happening,
> > and it so happens that the first page is DRAM and second page MMIO. We
> > tried - as a workaround - to return NULL instead of the assert to
> > trigger the slow path processing, and that allows the system to make
> > forward progress, but we aren't familiar with tcg, and as such aren't
> > sure if that is a correct fix.
> > 
> > So we'd like to get your input on this - and understand whether the
> > above usage isn't supported at all or if there are other possible
> > workarounds.  
> 
> Well, this may answer my question in
> 
> https://lore.kernel.org/qemu-devel/1d6b1894-9c45-2d70-abde-9c10c1b3b93f@linaro.org/
> 
> as to how this could occur.
> 
> Until relatively recently, TCG would refuse to execute out of MMIO *at all*.
> This was relaxed to support Arm m-profile, which needs to execute a few
> instructions out of MMIO during the boot process, before jumping into flash.
> 
> This works by reading one instruction, translating it, executing it, and
> immediately discarding the translation.  It could be possible to adjust the
> translator to allow the second half of an instruction to be in MMIO, such
> that we execute and discard, however...
> 
> What is it about CXL that requires modeling with MMIO?  If it is intended to
> be used interchangeably with RAM by the guest, then you really won't like the
> performance you will see with TCG executing out of these regions.

To be honest I wasn't aware of this restriction.

I 'thought' it was necessary to support interleaving, as we need some callbacks
in the read / write paths to map through to the right address space.
So performance will suck even if we can model it as memory (I'm not sure how to
do that whilst maintaining the interleaving code).

To have anything approaching accurate modeling we need to apply memory
decoders in the host, host-bridge and switches.  We could have faked all this
and mapped directly through to a single memory backend, but that would then
have been useless for actually testing the interleaving.

Specifically what happens in cxl_cfmws_find_device()
https://elixir.bootlin.com/qemu/latest/source/hw/cxl/cxl-host.c#L129

If we can do that for other types of region then I'm fine with changing it.
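
As a rough illustration of why a callback is needed on every access, here is
a minimal sketch of picking an interleave target from an address. It is not
the code in cxl_cfmws_find_device(); the InterleaveWindow type, its fields
and interleave_target() are hypothetical, and a simple modulo scheme is
assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one decoded window; not a QEMU structure. */
typedef struct {
    uint64_t base;        /* start of the host physical window */
    uint64_t size;        /* window size in bytes */
    uint64_t granularity; /* interleave granularity in bytes */
    unsigned ways;        /* number of interleave targets */
} InterleaveWindow;

/*
 * Return the index of the target that owns addr, or -1 if addr lies
 * outside the window.  Consecutive granularity-sized chunks rotate
 * across the targets.
 */
static int interleave_target(const InterleaveWindow *w, uint64_t addr)
{
    assert(w->ways > 0 && w->granularity > 0);

    if (addr < w->base || addr - w->base >= w->size) {
        return -1;
    }
    return (int)(((addr - w->base) / w->granularity) % w->ways);
}
```

In a real topology a lookup like this would be repeated at the host, the host
bridge and each switch, which is why a single flat memory backend cannot
stand in for it.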

> 
> Could memory across the CXL link be modeled as a ROM device, similar to
> flash?  This does not have the same restrictions as MMIO.

Not sure - if we can do the handling above then sure we could make that change.
I can see there is a path to register the callbacks but I'd kind of assumed
ROM meant read only...

Jonathan

> 
> 
> r~
> 



