On Sat, 21 Jun 2014, BALATON Zoltan wrote:
On Fri, 20 Jun 2014, Mark Cave-Ayland wrote:
As for the code that generates the ISIs, is this in MorphOS as
opposed to OpenBIOS? I guess something must have previously accessed
an entry on the same page before the registers were updated, or
maybe there is some kind of hardware readahead?
The code is in the MorphOS boot loader, and what it does is try to
take over memory management. Unfortunately there seems to be a period
when it has already replaced the vectors but has not yet set up the
TLB hash table, so it cannot actually handle exceptions. I could
prevent DSIs, but running the code during this period generates ISIs.
If the code runs in the same order on real hardware, then it's
unlikely that the page is accessed there and not on QEMU. A readahead
could explain it, but I don't know if that happens. I have no better
idea now than manually generating faults for all pages where the
client code is loaded before calling it. I'll try to implement that
unless someone can suggest a better solution.
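
A minimal sketch of the "fault in every page first" idea, assuming the
firmware's fault handler fills the same hash table for data and
instruction faults, so a plain load is enough to create the entry. The
helper name touch_pages and its parameters are made up for
illustration and are not existing OpenBIOS code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one byte from every page that overlaps
 * [start, start + len) so each page takes its translation fault now,
 * while the firmware can still handle it. page_size would be 4096 on
 * 32-bit PowerPC; it is a parameter only to keep the sketch generic.
 * Returns the number of pages touched. */
size_t touch_pages(const volatile uint8_t *start, size_t len, size_t page_size)
{
    size_t touched = 0;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    if (len == 0)
        return 0;

    uintptr_t base = (uintptr_t)start;
    uintptr_t mask = ~(uintptr_t)(page_size - 1);
    uintptr_t first = base & mask;             /* page holding the first byte */
    uintptr_t last = (base + len - 1) & mask;  /* page holding the last byte */

    for (uintptr_t page = first; ; page += page_size) {
        /* Never read below the start of the range. */
        uintptr_t addr = page < base ? base : page;
        (void)*(const volatile uint8_t *)addr; /* the load that may fault */
        touched++;
        if (page == last)
            break;
    }
    return touched;
}
```

It would be called with the load address and size of the client image
just before jumping to its entry point.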
Experimenting with it some more, I could not make it work that way,
probably because by the time the ISI happens the vectors are already
replaced and the sr0 register is overwritten. So even if I manage to
put translations in our hash table for that code, they won't be
correct any more by the time the ISI that causes the crash is handled.
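
To illustrate why a changed sr0 invalidates entries that were put in
the hash table earlier, here is a rough sketch of the 32-bit PowerPC
hashing as I understand it from the architecture documents. The
function names are mine, and any VSID values used with it are made-up
examples, not what OpenBIOS or MorphOS actually use:

```c
#include <stdint.h>

/* Segment register number that supplies the VSID for an effective
 * address: the top four bits of the address. */
unsigned segment_of(uint32_t ea)
{
    return ea >> 28;
}

/* Primary hash: low 19 bits of the 24-bit VSID XOR the 16-bit page
 * index taken from the effective address. */
uint32_t primary_hash(uint32_t vsid, uint32_t ea)
{
    uint32_t page_index = (ea >> 12) & 0xffffu;
    return (vsid & 0x7ffffu) ^ page_index;
}

/* First word of a page table entry: valid bit, 24-bit VSID, hash
 * function id (0 = primary, 1 = secondary) and the abbreviated page
 * index (top 6 bits of the page index). */
uint32_t pte_word0(uint32_t vsid, uint32_t ea, int secondary)
{
    uint32_t api = (ea >> 22) & 0x3fu;
    return 0x80000000u | ((vsid & 0xffffffu) << 7)
           | ((secondary ? 1u : 0u) << 6) | api;
}
```

Both the PTEG that is searched and the word that is compared depend on
the VSID, so an entry created under the old sr0 value is not found
once sr0 holds a different VSID, even though the effective address is
the same.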
So the only things that might work are using IBATs or disabling the
MMU bits when MorphOS overwrites the vectors. I think the latter
option is a bit cleaner, but how can I get an interrupt when a memory
address is written to? (Apart from setting up a watchpoint for it.)
In the OpenBIOS code there are comments stating that page 0 is not
mapped in order to catch NULL pointer dereferences, but this does not
seem to work, as MorphOS can write over page 0 and only gets an
exception when reaching the next page. (Also, one version of the PPC
programming environments document states that 0x00-0xff is reserved
for operating system use, so they can legally write there.) I'm sure
I'm missing something here again.
It could possibly work on real hardware because of caches, and that's
where the readahead might happen during the critical part. As far as
I understand, this is what happens during MorphOS's MMU takeover:
1. memcpy to 0, len=0x2000
2. fixup jumps and write base address to 0x80 (this is what's zeroed at
the earlier write at the beginning)
3. Set MSR_BE and tweak HID0 (this may cause code to be preloaded in
cache?)
4. Set sr0-15 from stack variables
5. Values for IBAT and DBAT registers are loaded from the stack and SDR1
is set to 0 dropping the hash table
6. MSR_DR and MSR_IR are cleared
7. BAT registers are set up as shown in this message:
http://www.openfirmware.info/pipermail/openbios/2014-June/008419.html
which means that the stack should be within the first 256MB
8. TLB entries are invalidated then MMU bits are re-enabled (but
hashed page tables are not there yet so it relies on BATs at this
point)
9. After doing some other stuff eventually a function is called that sets
up the hash table
10. Vectors are replaced again later during boot after the microkernel
has started its servers
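
Steps 6 to 8 can be sketched roughly as follows, to show what "the
stack should be within the first 256MB" means in terms of BAT
matching. The bit layouts are the 32-bit PowerPC ones as I remember
them; the helper names are hypothetical, and the actual BAT values
MorphOS uses are the ones in the message linked in step 7, not the
ones assumed here:

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IR       0x00000020u  /* instruction address translation */
#define MSR_DR       0x00000010u  /* data address translation */

#define BAT_BL_256MB 0x7ffu       /* block length field for 256MB */
#define BAT_VS       0x2u         /* valid in supervisor mode */
#define BAT_VP       0x1u         /* valid in user mode */

/* Step 6: MSR value with both translation bits cleared. */
uint32_t msr_translation_off(uint32_t msr)
{
    return msr & ~(MSR_IR | MSR_DR);
}

/* Build an upper BAT word from a block start address, a block length
 * field and the Vs/Vp bits. The start must be aligned to the block
 * size. */
uint32_t make_batu(uint32_t bepi_addr, uint32_t bl, uint32_t valid_bits)
{
    assert(bl <= 0x7ffu);
    assert((bepi_addr & ((bl << 17) | 0x1ffffu)) == 0);
    return (bepi_addr & 0xfffe0000u) | (bl << 2) | (valid_bits & 0x3u);
}

/* Does effective address ea hit this BAT? Address bits covered by the
 * block length field are ignored in the comparison. This assumes a
 * correctly aligned BEPI, as make_batu enforces. */
int bat_matches(uint32_t batu, uint32_t ea, int supervisor)
{
    uint32_t bl = (batu >> 2) & 0x7ffu;
    uint32_t mask = 0xfffe0000u & ~(bl << 17);
    uint32_t valid = supervisor ? (batu & BAT_VS) : (batu & BAT_VP);

    return valid != 0 && (ea & mask) == (batu & mask);
}
```

With a single supervisor-only 256MB block at address 0, everything
from 0 to 0x0fffffff is translated by the BAT without needing the hash
table, and the first address above that falls through to the (not yet
existing) page tables.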