
Re: [Qemu-ppc] [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support


From: Aravinda Prasad
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests
Date: Tue, 01 Sep 2015 11:51:47 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130510 Thunderbird/17.0.6


On Friday 07 August 2015 09:07 AM, Sam Bobroff wrote:
> Hello Aravinda and all,
> 
> On Wed, Jul 08, 2015 at 01:58:13PM +0530, Aravinda Prasad wrote:
>> On Friday 03 July 2015 11:31 AM, David Gibson wrote:
>>> On Thu, Jul 02, 2015 at 07:11:52PM +1000, Alexey Kardashevskiy wrote:
>>>> On 04/02/2015 03:46 PM, David Gibson wrote:
>>>>> On Thu, Apr 02, 2015 at 03:28:11PM +1100, Alexey Kardashevskiy wrote:
>>>>>> On 11/19/2014 04:48 PM, Aravinda Prasad wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
>>>>>>>
>>>>>>> [..]
>>>>>>>
>>>>>>>>
>>>>>>>> So, this may not still be possible depending on whether the KVM side
>>>>>>>> of this is already merged, but it occurs to me that there's a simpler
>>>>>>>> way.
>>>>>>>>
>>>>>>>> Rather than mucking about with having to update the hypervisor on the
>>>>>>>> RTAS location, then having qemu copy the code out of RTAS, patch it and
>>>>>>>> copy it back into the vector, you could instead do this:
>>>>>>>>
>>>>>>>>   1. Make KVM, instead of immediately delivering a 0x200 for a guest
>>>>>>>> machine check, cause a special exit to qemu.
>>>>>>>>
>>>>>>>>   2. Have the register-nmi RTAS call store the guest side MC handler
>>>>>>>> address in the spapr structure, but perform no actual guest code
>>>>>>>> patching.
>>>>>>>>
>>>>>>>>   3. Allocate the error log buffer independently from the RTAS blob,
>>>>>>>> so qemu always knows where it is.
>>>>>>>>
>>>>>>>>   4. When qemu gets the MC exit condition, instead of going via a
>>>>>>>> patched 0x200 vector, just directly set the guest register state and
>>>>>>>> jump straight into the guest side MC handler.
>>>>>>>>
>>>>>>>
>>>>>>> Before I proceed further I would like to know what others think about
>>>>>>> the approach proposed above (except for step 3 - as per PAPR the error
>>>>>>> log buffer should be part of RTAS blob and hence we cannot have error
>>>>>>> log buffer independent of RTAS blob).
>>>>>>>
>>>>>>> Alex, Alexey, Ben: Any thoughts?
>>>>>>
>>>>>>
>>>>>> Any updates about FWNMI? Thanks
>>>>>
>>>>> Huh.. I'd completely forgotten about this.  Aravinda, can you repost
>>>>> your latest work on this?
>>>>
>>>>
>>>> Aravinda disappeared...
>>>
>>> Ok, well someone who cares about FWNMI is going to have to start
>>> sending something, or it won't happen.
>>
>> I am yet to work on the new approach proposed above. I will start
>> looking into that this week.
> 
> The RTAS call being discussed in this thread actually has two vectors to patch
> (System Reset and Machine Check), and the patches so far only address the
> Machine Check part. I've been looking at filling in the System Reset part and
> that will mean basing my code on top of this set.  I would like to keep the
> same style of solution for both vectors, so I'd like to get the discussion
> started again :-)
> 
> So (1) do we use a trampoline in guest memory, and if so (2) how is the
> trampoline code handled?
> 
> (1) It does seem simpler to me to deliver directly to the handler, but I'm
> worried about a few things:
> 
> If a guest were to call ibm,nmi-register and then kexec to a new kernel that
> does not call ibm,nmi-register, would the exception cause a jump to a stale
> address?

If a kexec kernel does not call ibm,nmi-register, then an exception can
lead to a jump to a stale address in the kexec'd kernel. This can happen
with the v3 patches as well, i.e., it can happen even if we don't take
the approach of delivering directly to the handler. Or is there
something else I am missing?

> 
> Because we're adding a new exit condition, presumably an upgraded KVM would
> require an upgraded QEMU: is this much of a problem?
> 
> From some investigation it looks like the current upstream KVM already
> forwards (some) host machine checks to the guest by sending them directly to
> 0x200, and that Linux guests expect this, regardless of support in the host
> for ibm,nmi-register (although they do call ibm,nmi-register if it's present).

Upstream KVM was modified to route MCEs to the guest's 0x200 vector as
part of the machine check handling work. AFAIR, earlier, MCE errors were
delivered directly to QEMU.

Regards,
Aravinda

> 
> (2) If we are using trampolines:
> 
> About the trampoline code in the v3 patches: I like producing the code using
> the assembler, but I'm not sure that the spapr-rtas blob is the right place to
> store it. The spapr-rtas blob is loaded into guest memory but it's only QEMU
> that needs it. It seems messy to me and means that the guest could corrupt it.
> 
> Some other other options might be:
> 
> (a) Create a new blob (spapr-rtas-trampoline?) just like the spapr-rtas one 
> but
> only load it when ibm,nmi-register is called, and only into QEMU not the guest
> memory. There would be another "BIOS" blob to install, and it wouldn't really
> be BIOS, but it seems like it would work easily.  Since we need a
> second, different, trampoline for System Reset, I would then need to add yet
> another blob for that... Still, this doesn't seem so bad. I suppose we could
> add some structure to the blob (e.g. a table of contents at the start) and fit
> both trampolines in, but that's inventing yet another file format... ugh.
> 
> (b) As above but assemble the trampoline code into an ELF dynamic library
> rather than stripping it down to a raw binary: we could use known symbols to
> find the trampolines, even the patch locations, so at least we wouldn't be
> inventing our own format (using dlopen()/dlsym()... I wonder if this would be
> OK for all platforms...).
> 
> (c) Assemble it (as above) but include it directly in the QEMU binary by
> objcopying it in or hexdumping into a C string or something similar. This 
> seems
> fairly neat but I'm not sure how people would feel about including "binaries"
> into QEMU this way.  Although it would take some work in the build system, it
> seems like a fairly neat solution to me.
> 
> Cheers,
> Sam.
> 

-- 
Regards,
Aravinda



