qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/7] target/mips: Fix <ld|st>.<b|h|w|d> MSA i


From: Aleksandar Markovic
Subject: Re: [Qemu-devel] [PATCH v2 1/7] target/mips: Fix <ld|st>.<b|h|w|d> MSA instructions for MIPS big endian host
Date: Tue, 26 Mar 2019 11:26:55 +0000

> From: Aleksandar Markovic
> Subject: Re: [PATCH v2 1/7] target/mips: Fix <ld|st>.<b|h|w|d> MSA 
> instructions for MIPS big endian host
> 
> > From: Mateja Marjanovic <address@hidden>
> > Subject: Re: [PATCH v2 1/7] target/mips: Fix <ld|st>.<b|h|w|d> MSA 
> > instructions for MIPS big endian host
> > On 25.3.19. 22:21, Aleksandar Markovic wrote:
> >> From: Mateja Marjanovic <address@hidden>
> >> Subject: [PATCH v2 1/7] target/mips: Fix <ld|st>.<b|h|w|d> MSA 
> >> instructions for MIPS big endian host
> >> Please split this patch into one for load instructions and another for
> >> store instructions.
> >
> > I will do that.
> >
> >> I don't think the variable "big_endian" is needed, you should better 
> >> resolve
> >> all differences wrt endian via "#if defined(HOST_WORDS_BIGENDIAN)",
> >> like you did in other patches. It is important to be consistent.
> >
> > I think you can't have a preprocessing statement inside a macro,
> > so I did the thing that seemed most similar to that.
> >
> 
> There should be some equivalent way of doing that involving an argument
> to the macro, however, in the light of your recent work on MSA optimization,
> let's split/demacro and unroll loops in load and store helpers.
> 
> After splitting helper for LD.B should look something like this:
> (the code is just for demonstration purposes, doublecheck it)
> 
> void helper_msa_ld_b(CPUMIPSState *env, uint32_t wd, target_ulong addr)
> {
>     wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
>     wr_t wx;
>     int i;
> 
>     MEMOP_IDX(DF_BYTE)
>     for (i = 0; i < DF_ELEMENTS(DF_BYTE); i++) {
> #if !defined(CONFIG_USER_ONLY)
>         wx.b[i] = helper_ret_ldub_mmu(env, addr + (i << DF_BYTE), oi, 
> GETPC());
> #else
>         wx.b[i] = cpu_ldub_data(env, addr + (i << DF_BYTE));
> #endif
>     }
>     memcpy(pwd, &wx, sizeof(wr_t));
> }
> 
> After unrolling:
> 
> void helper_msa_ld_b(CPUMIPSState *env, uint32_t wd, target_ulong addr)
> {
>     wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
>     wr_t wx;
>     int i;
> 
>     MEMOP_IDX(DF_BYTE)
> 
> #if !defined(CONFIG_USER_ONLY)
>     wx.b[0]  = helper_ret_ldub_mmu(env, addr, oi, GETPC());
>     wx.b[1]  = helper_ret_ldub_mmu(env, addr + 1, oi, GETPC());
>     wx.b[2]  = helper_ret_ldub_mmu(env, addr + 2, oi, GETPC());
>     wx.b[3]  = helper_ret_ldub_mmu(env, addr + 3, oi, GETPC());
>     wx.b[4]  = helper_ret_ldub_mmu(env, addr + 4, oi, GETPC());
>     wx.b[5]  = helper_ret_ldub_mmu(env, addr + 5, oi, GETPC());
>     wx.b[6]  = helper_ret_ldub_mmu(env, addr + 6, oi, GETPC());
>     wx.b[7]  = helper_ret_ldub_mmu(env, addr + 7, oi, GETPC());
>     wx.b[8]  = helper_ret_ldub_mmu(env, addr + 8, oi, GETPC());
>     wx.b[9]  = helper_ret_ldub_mmu(env, addr + 9, oi, GETPC());
>     wx.b[10] = helper_ret_ldub_mmu(env, addr + 10, oi, GETPC());
>     wx.b[11] = helper_ret_ldub_mmu(env, addr + 11, oi, GETPC());
>     wx.b[12] = helper_ret_ldub_mmu(env, addr + 12, oi, GETPC());
>     wx.b[13] = helper_ret_ldub_mmu(env, addr + 13, oi, GETPC());
>     wx.b[14] = helper_ret_ldub_mmu(env, addr + 14, oi, GETPC());
>     wx.b[15] = helper_ret_ldub_mmu(env, addr + 15, oi, GETPC());
> #else
>     wx.b[0]  = cpu_ldub_data(env, addr);
>     wx.b[1]  = cpu_ldub_data(env, addr + 1);
>     wx.b[2]  = cpu_ldub_data(env, addr + 2);
>     wx.b[3]  = cpu_ldub_data(env, addr + 3);
>     wx.b[4]  = cpu_ldub_data(env, addr + 4);
>     wx.b[5]  = cpu_ldub_data(env, addr + 5);
>     wx.b[6]  = cpu_ldub_data(env, addr + 6);
>     wx.b[7]  = cpu_ldub_data(env, addr + 7);
>     wx.b[8]  = cpu_ldub_data(env, addr + 8);
>     wx.b[9]  = cpu_ldub_data(env, addr + 9);
>     wx.b[10] = cpu_ldub_data(env, addr + 10);
>     wx.b[11] = cpu_ldub_data(env, addr + 11);
>     wx.b[12] = cpu_ldub_data(env, addr + 12);
>     wx.b[13] = cpu_ldub_data(env, addr + 13);
>     wx.b[14] = cpu_ldub_data(env, addr + 14);
>     wx.b[15] = cpu_ldub_data(env, addr + 15);
> #endif
>     }
>     memcpy(pwd, &wx, sizeof(wr_t));
> }
> 
> 
> And then apply big-endian-related changes.
> 
> I think that would be more in the spirit of other MSA work.

The same approach (splitting helpers/unrolling loops) should be
good for COPY_S, COPY_U, INSERT as well.

Thanks,
Aleksandar






reply via email to

[Prev in Thread] Current Thread [Next in Thread]