Re: [Qemu-devel] QEMU, self-modifying code, and Windows 7 64-bit (no KVM)

From: Hulin, Patrick - 0559 - MITLL
Subject: Re: [Qemu-devel] QEMU, self-modifying code, and Windows 7 64-bit (no KVM)
Date: Thu, 14 Aug 2014 13:53:22 +0000

I suppose I should probably add a tl;dr.

I have a diagnosis of the reason Windows 7 64-bit won’t run without KVM, as 
well as a hack to fix it, but I’d like input on a real fix.

Details below.

On Aug 13, 2014, at 2:36 PM, Hulin, Patrick - 0559 - MITLL <address@hidden> wrote:

> Hi QEMU devs,
> QEMU 2.1.0 does not currently run Windows 7 64-bit without KVM. There have 
> been a few threads about this over the past few years (such as 
> https://bugs.launchpad.net/qemu/+bug/921208 and 
> http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg02603.html), but the 
> problem was never resolved. I think I've identified the cause, but I am not 
> sure what the correct way to fix it is. I'm working on PANDA, a set of 
> analysis extensions to QEMU (github.com/moyix/panda) and I'd really like to 
> be able to use our analyses on Windows 7 64-bit.
> There are two issues right now. The first is that QEMU is missing a CPUID bit 
> (for debug extensions, CPUID_DE) because the feature isn't implemented in 
> QEMU. This can easily be hacked around by just enabling the bit, but I 
> imagine you all aren't excited about advertising features that don't exist. 
> The second issue is that both the installer and the OS itself fail with blue 
> screens (due to illegal instruction). This is a little trickier.
> One of the major differences between Windows 7 x86 and x64 is that the 64-bit 
> version has Microsoft's Kernel Patch Protection, aka PatchGuard. In order to 
> protect itself, PatchGuard lives encrypted in memory and follows a two-stage 
> decryption process. The process begins with a series of xor's which 
> successively decrypt the PatchGuard code. This is self-modifying code (in 
> particular, the first xor overwrites itself and the next instruction).
> For the uninitiated, as I understand it, QEMU's self-modifying code support 
> works in the following way. Before executing a translation block, QEMU 
> write-protects (using host MMU features) the _host_ page that contains the 
> section of guest memory on which the guest TB code lives. When self-modifying 
> code attempts to write to that page, it triggers a host segmentation fault. 
> QEMU then catches this segmentation fault using standard POSIX signal 
> infrastructure. Once caught it walks into the software MMU code. If the write 
> intersects the current TB, QEMU splits the TB into two: the single 
> instruction that is being executed and the rest of the block, which is 
> invalidated so it will be retranslated as soon as QEMU tries to run it. QEMU 
> then restores the pre-write CPU state (cpu_restore_state) and longjmp's out 
> (cpu_resume_from_signal). The instruction then executes again, and this time 
> it actually makes the write to QEMU's memory state. QEMU translates the new 
> code, which is now in its own TB, and continues from there.
> In this case, the write is 8 bytes and unaligned, so it gets split into 8 
> single-byte writes. In stock QEMU, these writes are done in reverse order 
> (see the loop in softmmu_template.h, line 402). The third decryption xor from 
> Kernel Patch Protection should hit 4 bytes that are in the current TB and 4 
> bytes in the TB afterwards in linear order. Since this happens in reverse 
> order, and the last 4 bytes of the write do not intersect the current TB, 
> those writes happen successfully and QEMU's memory is modified. The 4th byte 
> in linear order (the 5th in temporal order) then triggers the 
> current_tb_modified flag and cpu_restore_state, longjmp'ing out. However, 
> cpu_restore_state only goes back to right before that byte is written, so the 
> last 4 bytes—the ones off the current TB—have been modified. QEMU then 
> invalidates, retranslates, and runs the xor again. This successfully decrypts 
> the 4 bytes inside the current TB, but because the write to the last 4 bytes 
> was not reversed as it should have been, those bytes get xor'd a second time. 
> Effectively, QEMU mistakenly re-encrypts those bytes. Once the code is 
> incorrect, inaccuracies build up until something blue screen-able happens (in 
> this case, an illegal instruction or various kinds of bad accesses).
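
To make the double-xor failure described above concrete, here is a toy C model of it. This is only a sketch and not QEMU code: the 16-byte memory, the TB covering bytes 0..7, and every name in it (store_byte, store8, xor_insn, corrupted_bytes) are made up for illustration. It assumes the xor is a load/xor/store that is re-executed from the load after an aborted store, and that byte stores completed before the abort are not rolled back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the current TB covers guest bytes [0, 8) and the
 * 8-byte store starts at byte 4, so it straddles the end of the TB. */
enum { MEM_SIZE = 16, TB_START = 0, TB_END = 8, STORE_ADDR = 4 };

static uint8_t mem[MEM_SIZE];   /* toy guest memory */
static int tb_valid;            /* 1 while the current TB's translation is live */

/* Byte store; returns -1 (standing in for the longjmp out) if it hits the live TB. */
static int store_byte(int addr, uint8_t val)
{
    if (tb_valid && addr >= TB_START && addr < TB_END) {
        tb_valid = 0;           /* TB invalidated; this byte is NOT written */
        return -1;
    }
    mem[addr] = val;
    return 0;
}

/* 8-byte store split into byte stores, last byte first or first byte first. */
static int store8(int addr, const uint8_t val[8], int reverse)
{
    for (int n = 0; n < 8; n++) {
        int i = reverse ? 7 - n : n;
        if (store_byte(addr + i, val[i]) < 0)
            return -1;          /* earlier byte stores are not undone */
    }
    return 0;
}

/* "xor [addr], key": load, xor, store; restarted from the load on abort. */
static void xor_insn(int addr, const uint8_t key[8], int reverse)
{
    uint8_t val[8];

    assert(addr >= 0 && addr + 8 <= MEM_SIZE);
    do {
        memcpy(val, &mem[addr], 8);
        for (int i = 0; i < 8; i++)
            val[i] ^= key[i];
    } while (store8(addr, val, reverse) < 0);
}

/* Runs one xor and returns how many of the 8 target bytes end up wrong. */
int corrupted_bytes(int reverse)
{
    static const uint8_t key[8] = { 0x11, 0x22, 0x33, 0x44,
                                    0x55, 0x66, 0x77, 0x88 };
    uint8_t expected[8];
    int bad = 0;

    for (int i = 0; i < MEM_SIZE; i++)
        mem[i] = (uint8_t)(0xA0 + i);
    for (int i = 0; i < 8; i++)
        expected[i] = (uint8_t)(mem[STORE_ADDR + i] ^ key[i]);
    tb_valid = 1;

    xor_insn(STORE_ADDR, key, reverse);

    for (int i = 0; i < 8; i++)
        bad += (mem[STORE_ADDR + i] != expected[i]);
    return bad;
}
```

In this model corrupted_bytes(1) (reverse order, as in the stock loop) returns 4: the four bytes past the end of the TB are xor'd twice and end up back at their original values. corrupted_bytes(0) (forward order) returns 0 for this layout, because the very first byte store aborts before anything has been written.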
> I am not sure how to fix this issue. For now, in our tool, PANDA, we have 
> just reversed the order of the loop so that the bytes are written in forward 
> order. This modification enables Windows 7 x64 to run successfully without 
> KVM, which is all we really need for our purposes. But that change will fail 
> in any situation in which the write happens off the front end of the TB and 
> then the self-modifying code loops back to the previous TB.
> I looked back in the commit history for this area of the code. It looks like 
> the order of the loop was changed from forwards to backwards back in 2007 by 
> the following two commits:
>
> commit 6c41b2723f5cac6e62e68925e7a73f30b11a7a06
> Author: balrog <address@hidden>
> Date:   Sat Nov 17 12:12:29 2007 +0000
>
>     Don't compare '\0' against pointers.
>     Add a note from Fabrice in slow_st template.
>
>     git-svn-id: svn://svn.savannah.nongnu.org/qemu/address@hidden c046a42c-6fe2-441c-8c8c-71466251a162
>
> commit 7221fa98d381a19b8809979934554644381fb88c
> Author: balrog <address@hidden>
> Date:   Sat Nov 17 09:53:42 2007 +0000
>
>     Check permissions for the last byte first in unaligned slow_st accesses
>     (patch from TeLeMan).
>
>     git-svn-id: svn://svn.savannah.nongnu.org/qemu/address@hidden c046a42c-6fe2-441c-8c8c-71466251a162
>
> The relevant qemu-devel thread is here: 
> https://lists.gnu.org/archive/html/qemu-devel/2007-10/msg00646.html. It looks 
> like the author was trying to fix a page boundary bug where the write was off 
> the front of the write-protected page and would happen twice, just as in this 
> case. Unfortunately, the "fix" just moved the problem to a different case. 
> Fabrice commented on that patch in this thread: 
> https://lists.gnu.org/archive/html/qemu-devel/2007-11/msg00538.html, saying 
> that the reverse-order code would work across forward page boundaries, 
> essentially by chance. Unfortunately, it caused the code to fail on forward 
> TB boundaries.
> If it's not too complicated, I'd like to contribute an actual fix back 
> upstream. I don't understand the MMU code completely, so if I've gotten 
> anything wrong please correct me. As I see it, there are two options, neither 
> of which seems easy under the current control flow:
> - Make sure cpu_restore_state goes all the way back to the beginning of the 
> stq, and not just the most recent stb.
> - Specifically check to see if an stq intersects the current TB before 
> splitting it into the 8 stb's. 
> There are probably others though. Thoughts? Questions? It would be really 
> awesome to get a real fix for this bug.
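
For what it's worth, the second option can be sketched in a toy C model. This is not a patch against softmmu_template.h, and all names here (struct toy_cpu, store_hits_tb, store8_checked, corrupted_bytes_checked) are hypothetical. The assumption is that the whole 8-byte range can be tested against the current TB before any byte is written, so that an abort never leaves a partial write behind.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 16, STORE_ADDR = 4 };

struct toy_cpu {
    uint8_t mem[MEM_SIZE];  /* toy guest memory */
    int tb_start, tb_end;   /* current TB covers [tb_start, tb_end) */
    int tb_valid;           /* 1 while that TB's translation is live */
};

/* Does the whole store [addr, addr + len) overlap the live TB? */
static int store_hits_tb(const struct toy_cpu *cpu, int addr, int len)
{
    return cpu->tb_valid && addr < cpu->tb_end && addr + len > cpu->tb_start;
}

/* Check the full range first; abort with memory untouched on overlap. */
static int store8_checked(struct toy_cpu *cpu, int addr, const uint8_t val[8])
{
    if (store_hits_tb(cpu, addr, 8)) {
        cpu->tb_valid = 0;  /* invalidate the TB, then restart the insn */
        return -1;
    }
    for (int i = 7; i >= 0; i--)    /* byte order no longer matters here */
        cpu->mem[addr + i] = val[i];
    return 0;
}

/* "xor [addr], key": load, xor, store; restarted from the load on abort. */
static void xor_insn(struct toy_cpu *cpu, int addr, const uint8_t key[8])
{
    uint8_t val[8];

    assert(addr >= 0 && addr + 8 <= MEM_SIZE);
    do {
        memcpy(val, &cpu->mem[addr], 8);
        for (int i = 0; i < 8; i++)
            val[i] ^= key[i];
    } while (store8_checked(cpu, addr, val) < 0);
}

/* Runs one xor at STORE_ADDR and returns how many target bytes are wrong. */
int corrupted_bytes_checked(int tb_start, int tb_end)
{
    static const uint8_t key[8] = { 0x11, 0x22, 0x33, 0x44,
                                    0x55, 0x66, 0x77, 0x88 };
    struct toy_cpu cpu = { .tb_start = tb_start, .tb_end = tb_end,
                           .tb_valid = 1 };
    uint8_t expected[8];
    int bad = 0;

    for (int i = 0; i < MEM_SIZE; i++)
        cpu.mem[i] = (uint8_t)(0xA0 + i);
    for (int i = 0; i < 8; i++)
        expected[i] = (uint8_t)(cpu.mem[STORE_ADDR + i] ^ key[i]);

    xor_insn(&cpu, STORE_ADDR, key);

    for (int i = 0; i < 8; i++)
        bad += (cpu.mem[STORE_ADDR + i] != expected[i]);
    return bad;
}
```

With the range check in front, the model gives zero corrupted bytes both when the store runs off the back of the TB (corrupted_bytes_checked(0, 8)) and when it starts before the front of it (corrupted_bytes_checked(8, 16)). Whether the real slow path can do this check cheaply, and across page boundaries, is exactly the part I am unsure about.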
> P.S. Windows 8 x64 still fails, even after my forward-loop patch. I'm working 
> on debugging that too.
