From: Steven Price
Subject: Re: [PATCH v2 0/2] MTE support for KVM guest
Date: Thu, 10 Sep 2020 10:21:04 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
On 10/09/2020 07:29, Andrew Jones wrote:
> On Wed, Sep 09, 2020 at 05:04:15PM +0100, Steven Price wrote:
>> On 09/09/2020 16:25, Andrew Jones wrote:
>>> On Fri, Sep 04, 2020 at 05:00:16PM +0100, Steven Price wrote:
>>>> 2. Automatically promotes (normal host) memory given to the guest to be tag enabled (sets PG_mte_tagged), if any VCPU has MTE enabled. The tags are cleared if the memory wasn't previously MTE enabled.
>>>
>>> Shouldn't this be up to the guest? Or is this required in order for the guest to use tagging at all - something like making the guest IPAs memtag capable, but with no guest impact if the guest doesn't enable tagging? In any case, shouldn't userspace be the one that adds PROT_MTE to the memory regions it wants the guest to be able to use tagging with, rather than KVM adding the attribute page by page?
>>
>> I think I've probably explained this badly. The guest can choose how to populate the stage 1 mapping - so it can choose which parts of memory are accessed tagged or not. However, the hypervisor cannot restrict this at stage 2 (except by e.g. making the memory uncached, but that's obviously not great - devices forwarded to the guest can be handled like this, however). Because the hypervisor cannot restrict the guest's access to the tags, it must assume that all memory given to the guest could have its tags accessed. So it must (a) clear any stale data from the tags, and (b) ensure that the tags are preserved (e.g. when swapping pages out).
>
> Yes, this is how I understood it.
Ok, I've obviously misunderstood your comment instead ;)
>> Because of the above the current series automatically sets PG_mte_tagged on the pages. Note that this doesn't change the mappings that the VMM has (a non-PROT_MTE mapping will still not have access to the tags).
>
> But if userspace created the memslots with memory already set with PROT_MTE, then this wouldn't be necessary, right? And, as long as there's still a way to access the memory with tag checking disabled, then it shouldn't be a problem.
Yes, so one option would be to attempt to validate that the VMM has provided memory pages with the PG_mte_tagged bit set (e.g. by mapping them with PROT_MTE). The tricky part here is that we support KVM_CAP_SYNC_MMU, which means that the VMM can change the memory backing at any time - so we could end up in user_mem_abort() discovering that a page doesn't have PG_mte_tagged set. At that point there's no nice way of handling it (other than silently upgrading the page), so the VM is dead.
So since enforcing that PG_mte_tagged is set isn't easy and hands the VMM a hard-to-debug foot gun, I decided the better option was to let the kernel set the bit automatically.
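Roughly, the fault-path idea looks like the sketch below. This is illustrative only, not the actual patch: kvm_has_mte() is a made-up per-VM check, and the sketch assumes kernel-internal context (PG_mte_tagged and mte_clear_page_tags() as already provided by the arm64 MTE support).

/* Illustrative sketch (not the actual patch): when stage 2 maps a page
 * for an MTE-enabled guest, wipe any stale tag data and mark the page
 * as carrying tags so they get preserved from then on.
 */
static void sanitise_mte_page(struct kvm *kvm, struct page *page)
{
	if (!kvm_has_mte(kvm))			/* made-up per-VM check */
		return;

	if (!test_bit(PG_mte_tagged, &page->flags)) {
		mte_clear_page_tags(page_address(page));
		set_bit(PG_mte_tagged, &page->flags);
	}
}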
>>> If userspace needs to write to guest memory then it should be due to a device DMA or other specific hardware emulation. Those accesses can be done with tag checking disabled.
>>
>> Yes, the question is whether the VMM can (sensibly) wrap the accesses in a disable/re-enable tag checking sequence for the process. The alternative at the moment is to maintain a separate (untagged) mapping for the purpose, which might present its own problems.
>
> Hmm, so there's no easy way to disable tag checking when necessary? If we don't map the guest RAM with PROT_MTE and continue setting the attribute in KVM, as this series does, then we don't need to worry about tag checking when accessing the memory, but then we can't access the tags for migration.
There's a "TCO" (Tag Check Override) bit in PSTATE which allows disabling tag checking, so if it's reasonable to wrap accesses to the memory you can simply set the TCO bit, perform the memory access and then unset TCO. That would mean a single mapping with MTE enabled would work fine. What I don't have a clue about is whether it's practical in the VMM to wrap guest accesses like this.
>>>> If it's not practical to either disable tag checking in the VMM or maintain multiple mappings then the alternatives I'm aware of are:
>>>>  * Provide a KVM-specific method to extract the tags from guest memory. This might also have benefits in terms of providing an easy way to read bulk tag data from guest memory (since the LDGM instruction isn't available at EL0).
>>>
>>> Maybe we need a new version of KVM_GET_DIRTY_LOG that also provides the tags for all addresses of each dirty page.
>>
>> Certainly possible, although it seems to conflate two operations: "get list of dirty pages" and "get tags from page". It would also require a lot of return space (size of slot / 32).
>
> It would require num-set-bits * host-page-size / 16 / 2, right?
Yes, where the worst case is all bits set, which is size/32. Since you don't know at the time of the call how many bits are going to be set, I'm not sure how you would design an API that doesn't require preallocating for the worst case.
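To spell the arithmetic out (illustrative helper only, not a proposed API): MTE stores one 4-bit tag per 16-byte granule, i.e. two tags per byte, so the worst case for a region is:

#include <stddef.h>

/* Worst-case bytes of tag data for a region of guest memory:
 * one 4-bit tag per 16-byte granule, two tags packed per byte.
 */
static size_t worst_case_tag_bytes(size_t region_bytes)
{
	return region_bytes / 16 / 2;	/* == region_bytes / 32 */
}

e.g. 4096 / 16 / 2 = 128 bytes of tags per 4K page.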
>>>>  * Provide support for user space setting the TCMA0 or TCMA1 bits in TCR_EL1. These would allow the VMM to generate pointers which are not tag checked.
>>>
>>> So this is necessary to allow the VMM to keep tag checking enabled for itself, plus map guest memory as PROT_MTE, and write to that memory when needed?
>>
>> This is certainly one option. The architecture provides two "magic" values (all-0s and all-1s) which can be configured using TCMAx to be treated differently. The VMM could therefore construct pointers to otherwise tagged memory which would be treated as untagged. However, Catalin's user space series doesn't at the moment expose this functionality.
>
> So if I understand correctly this would allow us to map the guest memory with PAGE_MTE and still access the memory when needed. If so, then this sounds interesting.
Yes - you could derive a pointer through which accesses aren't tag checked. Note that this also requires the rest of user space to play along (i.e. understand that the tag value is reserved). I believe for user space we would have to use the all-0s value, which means a standard pointer (top byte 0) would be unchecked.
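As a rough illustration of the VMM side, assuming TCMA0 semantics were exposed to user space (which, as noted, they currently are not; the helper name is made up):

#include <stdint.h>

/* Derive an alias of a tagged pointer with logical tag 0 (bits 59:56
 * cleared). With TCMA0 set, accesses through a tag-0 pointer are not
 * tag checked, so this alias could be used for guest memory accesses.
 */
static inline void *untagged_alias(void *tagged)
{
	return (void *)((uintptr_t)tagged & ~((uintptr_t)0xf << 56));
}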
Steve