Re: [Qemu-devel] [PATCH -V5] target-ppc: Fix page table lookup with kvm enabled


From: Aneesh Kumar K.V
Subject: Re: [Qemu-devel] [PATCH -V5] target-ppc: Fix page table lookup with kvm enabled
Date: Fri, 11 Oct 2013 22:28:28 +0530
User-agent: Notmuch/0.16+84~gcbbf53e (http://notmuchmail.org) Emacs/24.2.1 (x86_64-pc-linux-gnu)

Alexander Graf <address@hidden> writes:

> On 11.10.2013, at 13:13, Aneesh Kumar K.V <address@hidden> wrote:
>
>> From: "Aneesh Kumar K.V" <address@hidden>
>> 
>> With kvm enabled, we store the hash page table information in the hypervisor.
>> Use ioctl to read the htab contents. Without this we get the below error when
>> trying to read the guest address
>> 
>> (gdb) x/10 do_fork
>> 0xc000000000098660 <do_fork>:   Cannot access memory at address 0xc000000000098660
>> (gdb)
>> 
>> Signed-off-by: Aneesh Kumar K.V <address@hidden>
>> ---
>> Changes from V4:
>> * Rewrite to avoid two code paths for doing hash lookups

....

>> +
>> +    i = 0;
>> +    hpte = pte_index * HASH_PTE_SIZE_64;
>>     if (likely((flags & H_EXACT) == 0)) {
>>         pte_index &= ~7ULL;
>> -        hpte = pte_index * HASH_PTE_SIZE_64;
>> -        for (i = 0; ; ++i) {
>> +        token = ppc_hash64_start_access(ppc_env_get_cpu(env), pte_index);
>> +        do {
>
> Why convert this into a while loop?

I am moving i = 0 outside the loop; hence I found while () better than for (;; ++i).
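
For reference, a standalone sketch of the loop shape in question
(simplified, not the exact patch hunk; note the guard must
pre-increment, since a bare while (i++) tests the old value of i,
which is 0 on the first pass, and would exit immediately):

    int i = 0;
    do {
        if ((ppc_hash64_load_hpte0(token, i) & HPTE64_V_VALID) == 0) {
            break;              /* found a free slot in the PTEG */
        }
    } while (++i < HPTES_PER_GROUP);

    if (i == HPTES_PER_GROUP) {
        ppc_hash64_stop_access(token);
        return H_PTEG_FULL;
    }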

>
>>             if (i == 8) {
>> +                ppc_hash64_stop_access(token);
>>                 return H_PTEG_FULL;
>>             }
>> -            if ((ppc_hash64_load_hpte0(env, hpte) & HPTE64_V_VALID) == 0) {
>> +            if ((ppc_hash64_load_hpte0(token, i) & HPTE64_V_VALID) == 0) {
>>                 break;
>>             }
>> -            hpte += HASH_PTE_SIZE_64;
>> -        }
>> +        } while (i++);
>> +        ppc_hash64_stop_access(token);

....


>> +
>> +int kvm_ppc_hash64_start_access(PowerPCCPU *cpu, unsigned long pte_index,
>> +                                struct ppc_hash64_hpte_token *token)
>> +{
>> +    int htab_fd;
>> +    int hpte_group_size;
>> +    struct kvm_get_htab_fd ghf;
>> +    struct kvm_get_htab_buf {
>> +        struct kvm_get_htab_header header;
>> +        /*
>> +         * We required one extra byte for read
>> +         */
>> +        unsigned long hpte[(HPTES_PER_GROUP * 2) + 1];
>> +    } hpte_buf;;
>
> Double semicolon?

Will fix

>
>> +
>> +    ghf.flags = 0;
>> +    ghf.start_index = pte_index;
>> +    htab_fd = kvm_vm_ioctl(kvm_state, KVM_PPC_GET_HTAB_FD, &ghf);
>> +    if (htab_fd < 0) {
>> +        goto error_out;
>> +    }
>> +    memset(&hpte_buf, 0, sizeof(hpte_buf));

....

>> diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
>> index 67fc1b5..aeb4593 100644
>> --- a/target-ppc/mmu-hash64.c
>> +++ b/target-ppc/mmu-hash64.c
>> @@ -302,29 +302,73 @@ static int ppc_hash64_amr_prot(CPUPPCState *env, ppc_hash_pte64_t pte)
>>     return prot;
>> }
>> 
>> -static hwaddr ppc_hash64_pteg_search(CPUPPCState *env, hwaddr pteg_off,
>> +struct ppc_hash64_hpte_token *ppc_hash64_start_access(PowerPCCPU *cpu,
>> +                                                      unsigned long pte_index)
>
> How about you also pass in the number of PTEs you want to access?
> Let's call it "pte_num" for now. Then if you only care about one PTE
> you can indicate so, otherwise it's clear that you want to access 8
> PTEs beginning from the one you're pointing at.

So if we want to pass pte_num, then it can be any number: 1, 8, 10. That
would make the code complex, because the buffer passed to read() would
then need to be of variable size. I would also need another allocation
for the return buffer. I can do tricks like making the token carry the
pointer to the actual buffer, skipping the header, but
ppc_hash64_stop_access() would then have to know about the kvm htab read
header, which I found not nice. We can possibly update the function name
to indicate that it will always read an HPTE group starting from
pte_index. Something like ppc64_start_hpteg_access()?
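
For context, each record read() returns on the KVM_PPC_GET_HTAB_FD
descriptor starts with an 8-byte header followed by n_valid HPTEs of
16 bytes each (see the kernel's Documentation/virtual/kvm/api.txt), so
a bare pointer into the buffer cannot simply be handed back and freed:

    /* From the kernel uapi: one record as returned by read() on the
     * htab fd; n_valid 16-byte HPTEs follow this header. */
    struct kvm_get_htab_header {
        __u32 index;      /* index of the first HPTE in the record */
        __u16 n_valid;    /* valid HPTEs following the header */
        __u16 n_invalid;  /* invalid HPTEs skipped after those */
    };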

>
>> +{
>> +    hwaddr pte_offset;
>> +    struct ppc_hash64_hpte_token *token;
>
> void *token = NULL;
>
> if (kvmppc_uses_htab_fd(cpu)) {
>     /* HTAB is controlled by KVM. Fetch the PTEG into a new buffer. */
>
>     int hpte_group_size = sizeof(unsigned long) * 2 * pte_num;
>     token = g_malloc(hpte_group_size);
>     if (kvm_ppc_hash64_read_pteg(cpu, pte_index, token)) {

That is the tricky part: the read buffer needs to have a header at the
beginning. Maybe I can make kvm_ppc_hash64_stop_access(void *token) do
the pointer math to get back to the head of the token and free it. Will
try that.
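
A rough sketch of that pointer-math idea (assuming the kvm_get_htab_buf
struct from the patch is moved somewhere visible to this helper; names
are illustrative, not from the series):

    /* Hypothetical: the token handed out points at hpte[0], just past
     * the header, so step back to the allocation start before freeing. */
    static void kvm_ppc_hash64_stop_access(void *token)
    {
        struct kvm_get_htab_buf *buf;

        buf = (struct kvm_get_htab_buf *)
            ((char *)token - offsetof(struct kvm_get_htab_buf, hpte));
        g_free(buf);
    }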

>         free(token);
>         return NULL;
>     }
> } else {
>     /* HTAB is controlled by QEMU. Just point to the internally accessible PTEG. */
>     hwaddr pte_offset;
>
>     pte_offset = pte_index * HASH_PTE_SIZE_64;
>     if (cpu->env.external_htab) {
>         token = cpu->env.external_htab + pte_offset;
>     } else {
>         token = (uint8_t *) cpu->env.htab_base + pte_offset;
>     }
> }
>
> return token;
>
> This way it's more obvious which path the "normal code flow" would be. We 
> also only clearly choose what to do depending on in-kernel HTAB or not. As a
> big plus we don't need a struct that we need to dynamically allocate even in 
> the TCG case (malloc is slow).
>
> You also want to cache kvmppc_uses_htab_fd(cpu) so it doesn't have to
> call an ioctl every time we try to read a PTE. I guess the best spot
> for it would be to put it into env.

I didn't get that. We already cache that value:

bool kvmppc_has_cap_htab_fd(void)
{
    return cap_htab_fd;
}

Looking at the kernel capability check, I do see an issue: with both HV
and PR loaded as modules, that would always return true, which is not
what we want here. Ideally we should have all of these as VM ioctls and
not device ioctls. I had raised that as a FIXME in the HV PR patchset.


>
>
>> +
>> +    token = g_malloc(sizeof(struct ppc_hash64_hpte_token));
>> +
>> +    pte_offset = pte_index * HASH_PTE_SIZE_64;
>> +    if (!kvmppc_has_cap_htab_fd()) {
>> +        token->free_hpte_group = 0;
>> +        token->external = !!cpu->env.external_htab;
>> +        if (token->external) {
>> +            token->hpte_group = cpu->env.external_htab + pte_offset;
>> +        } else {
>> +            token->hpte_group = (uint8_t *) cpu->env.htab_base + pte_offset;
>> +        }
>> +        return token;
>> +    }
>> +
>> +    if (!kvm_ppc_hash64_start_access(cpu, pte_index, token)) {
>> +        return token;
>> +    }
>> +    free(token);
>> +    return NULL;
>> +}
>> +
>> +void ppc_hash64_stop_access(struct ppc_hash64_hpte_token *token)
>> +{
>
> Pass in cpu as well, then simply only do
>
> if (kvmppc_uses_htab_fd(cpu)) {
>     g_free(token);
> }
>
>> +    if (token->free_hpte_group) {
>> +        free(token->hpte_group);
>> +    }
>> +    free(token);
>> +    return;
>> +}


......


>> +
>> +struct ppc_hash64_hpte_token {
>> +    bool free_hpte_group;
>> +    bool external;
>> +    uint8_t *hpte_group;
>> +};
>> +
>> +struct ppc_hash64_hpte_token *ppc_hash64_start_access(PowerPCCPU *cpu,
>> +                                                      unsigned long pte_index);
>> +void ppc_hash64_stop_access(struct ppc_hash64_hpte_token *token);
>> +
>> +static inline target_ulong ppc_hash64_load_hpte0(struct ppc_hash64_hpte_token *token,
>> +                                                 int index)
>
> If you pass in env or cpu in here as well you can do the decision
> whether to use ldq_p or ldq_phys based on
> cpu->env.external_htab. Whenever the HTAB is controlled by KVM,
> external_htab is true as well.

Will do
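
A sketch of what that could look like, following the suggestion rather
than the posted patch (here the token degenerates to an address, and
the external_htab test moves into the accessor):

    static inline target_ulong ppc_hash64_load_hpte0(CPUPPCState *env,
                                                     uint64_t token,
                                                     int index)
    {
        uint64_t addr = token + index * HASH_PTE_SIZE_64;

        /* With an external (including KVM-managed) HTAB the token is a
         * host pointer; otherwise it is a guest physical address. */
        if (env->external_htab) {
            return ldq_p((const void *)(uintptr_t)addr);
        } else {
            return ldq_phys(addr);
        }
    }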

>
>
> Alex
>
>> {
>> -    if (env->external_htab) {
>> -        return  ldq_p(env->external_htab + pte_offset);
>> +    index *= HASH_PTE_SIZE_64;
>> +    if (token->external) {
>> +        return  ldq_p(token->hpte_group + index);
>>     } else {
>> -        return ldq_phys(env->htab_base + pte_offset);
>> +        return ldq_phys((uint64_t)(token->hpte_group + index));
>>     }
>> }


-aneesh



