qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 15/40] i386/tdx: Add property sept-ve-disable for tdx-gues


From: Gerd Hoffmann
Subject: Re: [PATCH v1 15/40] i386/tdx: Add property sept-ve-disable for tdx-guest object
Date: Fri, 2 Sep 2022 07:46:21 +0200

On Fri, Sep 02, 2022 at 02:52:25AM +0000, Sean Christopherson wrote:
> On Fri, Sep 02, 2022, Xiaoyao Li wrote:
> > On 8/26/2022 1:57 PM, Gerd Hoffmann wrote:
> > >    Hi,
> > > > For TD guest kernel, it has its own reason to turn SEPT_VE on or off. 
> > > > E.g.,
> > > > linux TD guest requires SEPT_VE to be disabled to avoid #VE on syscall 
> > > > gap
> > > > [1].
> > > 
> > > Why is that a problem for a TD guest kernel?  Installing exception
> > > handlers is done quite early in the boot process, certainly before any
> > > userspace code runs.  So I think we should never see a syscall without
> > > a #VE handler being installed.  /me is confused.
> > > 
> > > Or do you want tell me linux has no #VE handler?
> > 
> > The problem is not "no #VE handler" and Linux does have #VE handler. The
> > problem is Linux doesn't want any (or certain) exception occurrence in
> > syscall gap, it's not specific to #VE. Frankly, I don't understand the
> > reason clearly, it's something related to IST used in x86 Linux kernel.
> 
> The SYSCALL gap issue is that because SYSCALL doesn't load RSP, the first 
> instruction
> at the SYSCALL entry point runs with a userspaced-controlled RSP.  With TDX, a
> malicious hypervisor can induce a #VE on the SYSCALL page and thus get the 
> kernel
> to run the #VE handler with a userspace stack.
> 
> The "fix" is to use an IST for #VE so that a kernel-controlled RSP is loaded 
> on #VE,
> but ISTs are terrible because they don't play nice with re-entrancy (among 
> other
> reasons).  The RSP used for IST-based handlers is hardcoded, and so if a #VE
> handler triggers another #VE at any point before IRET, the second #VE will 
> clobber
> the stack and hose the kernel.
> v
> It's possible to workaround this, e.g. change the IST entry at the very 
> beginning
> of the handler, but it's a maintenance burden.  Since the only reason to use 
> an IST
> is to guard against a malicious hypervisor, Linux decided it would be just as 
> easy
> and more beneficial to avoid unexpected #VEs due to unaccepted private pages 
> entirely.

Hmm, ok, but shouldn't the SEPT_VE bit *really* controlled by the guest then?

Having a hypervisor-controlled config bit to protect against a malicious
hypervisor looks pointless to me ...

take care,
  Gerd




reply via email to

[Prev in Thread] Current Thread [Next in Thread]