Re: [Qemu-discuss] Possible Qemu inconsistency when emulating TLB for rpi3 machine
Mon, 4 Feb 2019 16:40:46 +0000
On Mon, 4 Feb 2019 at 16:10, Mattia Maldini <address@hidden> wrote:
> Everything was running smoothly until I started using the MMU: I find
> myself in a situation where the same code yields different results on Qemu
> and on RPi3, and I believe only the real hardware is behaving correctly. I
> set up a very simplified example to showcase my problem.
> Basically I have two page tables (identical mapping, just to test the
> virtual memory): one for the "kernel" running at EL1 and one for a "user
> process" running at EL0.
> When the process is running TTBR0 must point to its page table because of
> permissions: page entries in the EL0 table must include EL0 read-write
> permissions, while page entries in the EL1 table must forbid them. Each
> actor (kernel & process) MUST then run on its own set of page tables.
> All the kernel does is set up the MMU and launch the process at EL0;
> while doing so, it also sets TTBR0 to the correct page table for
> the process. Note that at this point page entries are configured as global
> and I'm not using any ASID.
> This situation works on Qemu but not on real hardware, and I think the
> emulator is wrong. Both kernel and process have their code in the same
> block of memory, so they share MMU entries. When the kernel starts
> executing its code pages are loaded and saved in the TLB; when the process
> starts it should ask for the same entries and receive the kernel pages
> saved in the TLB, failing because they are configured to prevent EL0
> access. This is what happens on the real RPi3.
If your guest code is relying on entries staying in the TLB then it
is not correct -- there is no architectural guarantee that an entry
is ever kept in the TLB. An implementation is free to throw out a
TLB entry any time it likes (it's just a cache, effectively).
It's hard to say what's actually going on here without a test case,
but QEMU only promises to run architecturally correct code the
way the architecture says it should run. It doesn't guarantee to
run incorrect code in the same way the hardware happens to run it.
QEMU aims to be an architecturally valid implementation, not
an implementation that matches the real hardware CPU. (That is, we are
free to behave differently for things which the architecture manual
defines as IMPLEMENTATION DEFINED or UNPREDICTABLE.)
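
To make the "it's just a cache" point concrete, here is a minimal toy model in Python. It is only a sketch: `ToyMMU`, `run_guest`, the page number 0x80 and the permission strings are all invented for illustration, and none of it is QEMU or Arm code. It models one implementation that keeps translations cached and one that drops them immediately, and a guest that switches page tables with or without invalidating the cache.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMMU:
    """Toy MMU: a page-table walk plus an optional translation cache."""
    keep_entries: bool                         # False models a TLB that evicts at once
    table: dict = field(default_factory=dict)  # current table: page -> permission
    tlb: dict = field(default_factory=dict)    # cached page -> permission

    def switch_table(self, new_table, invalidate):
        self.table = new_table
        if invalidate:
            self.tlb.clear()                   # stands in for a TLB invalidate

    def lookup(self, page):
        if page in self.tlb:
            return self.tlb[page]              # hit: may be stale
        perm = self.table[page]                # miss: walk the current table
        if self.keep_entries:
            self.tlb[page] = perm
        return perm


# Hypothetical tables: same page, different permissions.
KERNEL_TABLE = {0x80: "el1-only"}
USER_TABLE = {0x80: "el0-rw"}


def run_guest(keep_entries, invalidate):
    """Return the permission the EL0 process sees for the shared page."""
    mmu = ToyMMU(keep_entries=keep_entries)
    mmu.switch_table(KERNEL_TABLE, invalidate=False)
    mmu.lookup(0x80)                           # kernel touches the shared page
    mmu.switch_table(USER_TABLE, invalidate)
    return mmu.lookup(0x80)
```

Without invalidation, `run_guest(True, False)` and `run_guest(False, False)` disagree, so the guest's result depends on caching behaviour that the architecture does not promise. With invalidation, both caching policies give the same answer.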
> To solve my problem I started using an ASID for the process: I put some
> value into the two most significant bytes of the TTBR0 register and set all
> page entries as Non-Global. This way when the process asks for a table
> entry the ones from the kernel found in the TLB cache are discarded because
> of the difference in ASID.
> This solution works as expected on the Raspberry Pi 3 but results in an
> abort exception (permission fault) on Qemu.
Again, if you're relying on entries being present or not in the
TLB your code isn't correct. You need to make sure you get the
right answer whether the CPU has cached a TLB entry or whether
it does a complete page table walk to get the answer.
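
For reference, the ASID field of TTBR0_EL1 is bits [63:48], which matches the "two most significant bytes" in the quoted description. The Python sketch below only illustrates the bit layout; `make_ttbr0` and `asid_of` are hypothetical helper names, and the table base is assumed to fit below bit 48. Whether all 16 ASID bits are used depends on TCR_EL1.AS, and TCR_EL1.A1 selects whether TTBR0 or TTBR1 supplies the ASID, so both settings are worth checking in the guest.

```python
ASID_SHIFT = 48                  # TTBR0_EL1.ASID occupies bits [63:48]
ASID16_MASK = 0xFFFF
BADDR_MASK = (1 << 48) - 1       # assumption: table base fits below bit 48


def make_ttbr0(table_base, asid):
    """Combine a translation table base address and an ASID into one value."""
    if asid & ~ASID16_MASK:
        raise ValueError("ASID does not fit in 16 bits")
    if table_base & ~BADDR_MASK:
        raise ValueError("table base overlaps the ASID field")
    return (asid << ASID_SHIFT) | table_base


def asid_of(ttbr0, sixteen_bit=True):
    """Extract the ASID; with 8-bit ASIDs only the low byte of the field counts."""
    asid = (ttbr0 >> ASID_SHIFT) & ASID16_MASK
    return asid if sixteen_bit else asid & 0xFF
```

For example, `make_ttbr0(0x80000, 1)` gives `0x0001000000080000`, and an ASID of `0x1234` is seen as `0x34` when only 8-bit ASIDs are in effect.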
> Furthermore, it seems that Qemu aborts every time it looks for a different
> ASID in the TLB (this is a conjecture, I'm not a Qemu developer myself). I
> have studied the ARMv8 MMU thoroughly and I think my configuration is
We don't abort (unless the guest page table entry indicates that
we should abort). We will take the slow path for a TLB miss
(and then do a page table walk); ASID changes will mean we
get TLB misses (see below).
> Despite being confused about the situation, the fact holds that the real
> hardware behaves as expected while Qemu does not. What I want to ask at
> this point is:
> 1. Does Qemu emulate the ASID-TLB interaction for AArch64? Is this
> behaviour expected? Do you see an obvious mistake on my side?
> 2. Can this be a bug in the rpi3 machine Qemu implementation? How could
> I investigate further?
It's certainly possible that there's a QEMU bug, but the TLB/MMU
code has been around a while. A QEMU bug is more likely if you're
doing something odd that Linux guests don't do with the MMU.
We don't store ASIDs in our TLB; we only keep a note of the current
ASID. If the current ASID changes, we flush the entire TLB. This
produces architecturally correct behaviour (you never get a TLB hit
for the wrong ASID because there's never an entry in the TLB for
the wrong ASID), but it's not exactly what the hardware does.
If your guest code is correctly holding up its end of the architectural
contract it should not care.
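
The difference between the two strategies can be sketched with a toy Python model. This is an illustration under simplifying assumptions, not QEMU's actual implementation: the class names, `TABLES` and `ACCESSES` are made up, only non-global entries are modelled, and each ASID is assumed to select exactly one page table.

```python
class TaggedTLB:
    """Hardware-style model: each entry is tagged with the ASID that loaded it."""

    def __init__(self):
        self.entries = {}
        self.walks = 0                     # number of page table walks performed

    def translate(self, asid, page, tables):
        key = (asid, page)
        if key not in self.entries:
            self.walks += 1
            self.entries[key] = tables[asid][page]
        return self.entries[key]


class FlushOnSwitchTLB:
    """Untagged entries, all discarded whenever the current ASID changes."""

    def __init__(self):
        self.entries = {}
        self.current_asid = None
        self.walks = 0

    def translate(self, asid, page, tables):
        if asid != self.current_asid:
            self.entries.clear()           # flush everything on ASID change
            self.current_asid = asid
        if page not in self.entries:
            self.walks += 1
            self.entries[page] = tables[asid][page]
        return self.entries[page]


# Hypothetical page tables, keyed by ASID, and an access pattern that
# alternates between the two address spaces.
TABLES = {1: {0x80: "el1-only"}, 2: {0x80: "el0-rw"}}
ACCESSES = [(1, 0x80), (2, 0x80), (1, 0x80), (2, 0x80)]


def replay(tlb):
    return [tlb.translate(asid, page, TABLES) for asid, page in ACCESSES]
```

Replaying `ACCESSES` through both models yields the same list of permissions. Only the `walks` counters differ (2 for the tagged model, 4 for the flushing one), which is a performance difference and not something correct guest code can observe in its results.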
Investigating this sort of bug is a bit painful because you need
to be familiar with all of (a) the guest code, (b) the Arm architecture,
and (c) QEMU's internals.