From: Scott Wood
Subject: [Qemu-devel] Re: RFC: New API for PPC for vcpu mmu access
Date: Mon, 7 Feb 2011 14:15:47 -0600

On Mon, 7 Feb 2011 16:43:02 +0100
Alexander Graf <address@hidden> wrote:

> On 04.02.2011, at 23:33, Scott Wood wrote:
> 
> > On Thu, 3 Feb 2011 10:19:06 +0100
> > Alexander Graf <address@hidden> wrote:
> > 
> >> Makes sense. So we basically need an ioctl that tells KVM the MMU type and 
> >> TLB size. Remember, the userspace tool is the place for policies :).
> > 
> > Maybe, though keeping it in KVM means we can change it whenever we want
> > without having to sync up Qemu and worry about backward compatibility.
> 
> Quite the contrary - you have to worry more about backward compatibility. If 
> we implement a new feature that doesn't work on old kernels, we can just tell 
> qemu to not work on those old versions. For the kernel interfaces, we have to 
> keep supporting old userspace.

If you're talking about actual interface changes, yes.  But a change in
how KVM implements things behind the scenes shouldn't break an old
Qemu, unless it's buggy and makes assumptions not permitted by the API.

> > How's that different from backing the void pointer up with a different
> > struct depending on the MMU type?  We weren't proposing unions.
> > 
> > A fixed array does mean you wouldn't have to worry about whether qemu
> > supports the more advanced struct format if fields are added --
> > you can just unconditionally write it, as long as it's backwards
> > compatible.  Unless you hit the limit of the pre-determined array size,
> > that is.  And if that gets made higher for future expansion, that's
> > even more data that has to get transferred, before it's really needed.
> 
> Yes, it is. And I don't see how we could easily avoid it. Maybe just pass in 
> a random __user pointer that we directly write to from kernel space and tell 
> qemu how big and what type a tlb entry is?
> 
> struct request_ppc_tlb {
>     int tlb_type;
>     int tlb_entries;
>     uint64_t __user *tlb_data;
> };

That's pretty much what the proposed API does -- except it uses a void
pointer instead of uint64_t *.

> Would you really want to loop through 16k entries, doing an ioctl for each? 

Not really.  The API was modeled after something we did on Topaz where
it's just a function call.  But something array-based would have been
awkward without constraining the geometry.

Now that we're going to constrain the geometry, providing an array-based
get/set would be easy and should definitely be a part of the API.

> Then performance really would always be an issue.

For cases where you really need to do a full get/set, yes.

> I would really prefer we tackle it with a full-on tlb get/set first and then 
> put the very flexible one on top, because to me the full-on approach feels 
> like the more generic one. I'm very open to adding an individual tlb get/set 
> and maybe even a "kvm, please translate EA x to RA y" ioctl. But those should 
> come after we cover the big hammer that just copies everything.

If we add everything at once, it minimizes the possibilities that Qemu
has to deal with -- either the full MMU API is there, or it's not.

BTW, I wonder if we should leave PPC out of the name.  It seems like
any arch with a software-visible TLB could use this, since the hw
details are hidden behind the MMU type.

How about:

struct kvmppc_booke_tlb_entry {
        union {
                __u64 mas0_1;
                struct {
                        __u32 mas0;
                        __u32 mas1;
                };
        };
        __u64 mas2;
        union {
                __u64 mas7_3;
                struct {
                        __u32 mas7;
                        __u32 mas3;
                };
        };
        __u32 mas8;
        __u32 pad;
};

struct kvmppc_booke_tlb_params {
        /*
         * book3e defines 4 TLBs.  Individual implementations may have
         * fewer.  TLBs that do not exist on the target must be configured
         * with a size of zero.  KVM will adjust TLBnCFG based on the sizes
         * configured here, though arrays greater than 2048 entries will
         * have TLBnCFG[NENTRY] set to zero.
         */
        __u32 tlb_sizes[4];
};

struct kvmppc_booke_tlb_search {
        struct kvmppc_booke_tlb_entry entry;
        union {
                __u64 mas5_6;
                struct {
                        __u32 mas5;
                        __u32 mas6;
                };
        };
};

For an MMU type of PPC_BOOKE_NOHV, the mas5 field in
kvmppc_booke_tlb_search and the mas8 field in kvmppc_booke_tlb_entry
are present but not supported.

For an MMU type of PPC_BOOKE_NOHV or PPC_BOOKE_HV:
 - TLB entries in get/set arrays may be provided in any order, and all 
   TLBs are get/set at once.
 - An entry with MAS1[V] = 0 terminates the list early (but there will
   be no terminating entry if the full array is valid).  On a call to
   KVM_GET_TLB, the contents of elements after the terminator are undefined.
   On a call to KVM_SET_TLB, elements beyond the terminating entry will
   not be accessed by KVM.

[Note: Once we implement sregs, Qemu can determine which TLBs are
implemented by reading MMUCFG/TLBnCFG -- but in no case should a TLB be
unsupported by KVM if its existence is implied by the target CPU.]
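
To make that concrete, here is a purely illustrative configuration for
an e500v2-like target with a 512-entry TLB0 and a 16-entry TLB1 (the
sizes are just an example, nothing above mandates them):

struct kvmppc_booke_tlb_params params = {
        /* TLB2 and TLB3 don't exist on this target, so size zero */
        .tlb_sizes = { 512, 16, 0, 0 },
};

and a partially filled entry array is terminated by clearing MAS1[V] in
the first unused element (entries/num_valid are just local names here):

entries[num_valid].mas1 = 0;    /* MAS1[V] = 0 ends the list */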

KVM_SET_TLB
-----------

Capability: KVM_CAP_SW_TLB
Type: vcpu ioctl
Parameters: struct kvm_set_tlb (in)
Returns: 0 on success
         -1 on error

struct kvm_set_tlb {
        __u64 params;
        __u64 array;
        __u32 mmu_type;
};

[Note: I used __u64 rather than void * to avoid the need for special
compat handling with 32-bit userspace on a 64-bit kernel -- if the other
way is preferred, that's fine with me]

Configures and sets the virtual CPU's TLB array.  The "params" and
"array" fields are userspace addresses of mmu-type-specific data
structures.

For MMU types PPC_BOOKE_NOHV and PPC_BOOKE_HV, the "params" field is of
type "struct kvmppc_booke_tlb_params", and the "array" field points to
an array of type "struct kvmppc_booke_tlb_entry".

[Note: KVM_SET_TLB with early array termination makes a separate
invalidate call unnecessary.]
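
Usage from Qemu's side might look roughly like this -- a sketch only:
vcpu_fd is assumed to be an already-open vcpu file descriptor,
KVM_MMU_PPC_BOOKE_NOHV stands in for whatever constant name we settle
on, and the sizes are the illustrative ones from earlier:

struct kvmppc_booke_tlb_entry entry[512 + 16];
struct kvmppc_booke_tlb_params params = {
        .tlb_sizes = { 512, 16, 0, 0 },
};
struct kvm_set_tlb cfg = {
        .params   = (__u64)(unsigned long)&params,
        .array    = (__u64)(unsigned long)entry,
        .mmu_type = KVM_MMU_PPC_BOOKE_NOHV,     /* placeholder name */
};

/* entry[0] has MAS1[V] clear, so this configures the geometry and
 * leaves the guest TLB empty. */
memset(entry, 0, sizeof(entry));
if (ioctl(vcpu_fd, KVM_SET_TLB, &cfg) < 0)
        perror("KVM_SET_TLB");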

KVM_GET_TLB
-----------

Capability: KVM_CAP_SW_TLB
Type: vcpu ioctl
Parameters: void pointer (out)
Returns: 0 on success
         -1 on error

Reads the TLB array from a virtual CPU.  A successful call to
KVM_SET_TLB must have been previously made on this vcpu.  The argument
must point to space for an array of the size and type of TLB entry
structs configured by the most recent successful call to KVM_SET_TLB.

For MMU types PPC_BOOKE_NOHV and PPC_BOOKE_HV, the array is of type
"struct kvmppc_booke_tlb_entry", and must hold a number of entries equal
to the sum of the elements of tlb_sizes in the most recent successful
TLB configuration call.
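
With the illustrative geometry from the KVM_SET_TLB sketch above, that
would be:

/* Space for the full geometry configured by the last KVM_SET_TLB; an
 * entry with MAS1[V] = 0 may terminate the array early. */
struct kvmppc_booke_tlb_entry entry[512 + 16];

if (ioctl(vcpu_fd, KVM_GET_TLB, entry) < 0)
        perror("KVM_GET_TLB");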


KVM_SEARCH_TLB
--------------

Capability: KVM_CAP_SW_TLB
Type: vcpu ioctl
Parameters: void pointer (in/out)
Returns: 0 on success
         -1 on error

Searches the TLB array of a virtual CPU for an entry matching
mmu-type-specific parameters.  A successful call to KVM_SET_TLB must
have been previously made on this vcpu.

For MMU types PPC_BOOKE_NOHV and PPC_BOOKE_HV, the argument must point
to a struct of type "kvmppc_booke_tlb_search".  The search behaves like
the tlbsx instruction.
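
For instance, to look up an effective address the way tlbsx would --
MAS bit positions as in the e500 manuals, written out as raw masks to
keep the sketch self-contained; ea, pid, and as are just local names:

struct kvmppc_booke_tlb_search search;

memset(&search, 0, sizeof(search));
/* The EA to search for goes in MAS2[EPN]; the search PID and address
 * space go in MAS6[SPID0] and MAS6[SAS]. */
search.entry.mas2 = ea & ~0xfffULL;
search.mas6 = ((__u32)pid << 16) | (as & 1);

if (ioctl(vcpu_fd, KVM_SEARCH_TLB, &search) == 0 &&
    (search.entry.mas1 & 0x80000000)) {         /* MAS1[V] set: hit */
        /* search.entry now describes the matching TLB entry */
}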

[Note: there currently exists a BookE implementation of KVM_TRANSLATE,
but the way it interprets the address is broken on 64-bit, and it seems
to confuse PPC's notion of a virtual address with what was most likely
intended by the x86ish API.  If nothing uses it yet that would be broken
by a change, we may want to reimplement it to treat the address input as
just an effective address, interpreted according to the current state of
the vcpu.  This would provide a non-hw-specific mechanism for simple
virtual->physical translation for debuggers and such, as long as the
caller doesn't care about the x86ish attributes.]



