Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translation
From: Michael Rolnik
Subject: Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translation
Date: Mon, 6 Jun 2016 00:47:05 +0300
Please see my answers inline.
On Sun, Jun 5, 2016 at 6:27 AM, Richard Henderson <address@hidden> wrote:
> On 06/02/2016 01:07 PM, Michael Rolnik wrote:
>
>> Signed-off-by: Michael Rolnik <address@hidden>
>> ---
>> target-avr/translate-inst.c | 2443
>> +++++++++++++++++++++++++++++++++++++++++++
>>
>
> Is there a reason this code isn't going into translate.c?
> You wouldn't need the declarations in translate-inst.h or translate.h.
I see two levels of logic here:
a. instruction translation
b. general flow of program translation.
>
>
> +/*
>> + NOTE: all registers are assumed to hold 8 bit values.
>> + so all operations done on registers should preserve this
>> property
>> +*/
>> +
>> +/*
>> + NOTE: the flags C,H,V,N,S have either 0 or 1 values
>> + NOTE: the flag Z has inverse logic: when the value of Zf is 0 the
>> flag is assumed to be set, when non-zero - not set
>> +*/
>>
>
> This documentation belongs with the declarations in cpu.h.
moved.
>
>
> +void gen_add_CHf(TCGv R, TCGv Rd, TCGv Rr);
>> +void gen_add_Vf(TCGv R, TCGv Rd, TCGv Rr);
>> +void gen_sub_CHf(TCGv R, TCGv Rd, TCGv Rr);
>> +void gen_sub_Vf(TCGv R, TCGv Rd, TCGv Rr);
>> +void gen_ZNSf(TCGv R);
>> +void gen_push_ret(CPUAVRState *env, int ret);
>> +void gen_pop_ret(CPUAVRState *env, TCGv ret);
>> +void gen_jmp_ez(void);
>> +void gen_jmp_z(void);
>> +
>> +void gen_set_addr(TCGv addr, TCGv H, TCGv M, TCGv l); /* H:M:L =
>> addr */
>> +void gen_set_xaddr(TCGv addr);
>> +void gen_set_yaddr(TCGv addr);
>> +void gen_set_zaddr(TCGv addr);
>> +
>> +TCGv gen_get_addr(TCGv H, TCGv M, TCGv L); /* addr =
>> H:M:L */
>> +TCGv gen_get_xaddr(void);
>> +TCGv gen_get_yaddr(void);
>> +TCGv gen_get_zaddr(void);
>> +int sex(int Imm, unsigned bits);
>>
>
> Order these functions properly and you don't need forward declarations.
Is it a requirement? This way it looks cleaner.
>
>
> +
>> +void gen_add_CHf(TCGv R, TCGv Rd, TCGv Rr)
>> +{
>> + TCGv t1 = tcg_temp_new_i32();
>> + TCGv t2 = tcg_temp_new_i32();
>> + TCGv t3 = tcg_temp_new_i32();
>> +
>> + tcg_gen_and_tl(t1, Rd, Rr); /* t1 = Rd & Rr */
>> + tcg_gen_not_tl(t2, R); /* t2 = Rd & ~R */
>> + tcg_gen_and_tl(t2, Rd, t2);
>>
>
> tcg_gen_andc_tl.
thanks
>
>
> + tcg_gen_not_tl(t3, R); /* t3 = Rr & ~R */
>> + tcg_gen_and_tl(t3, Rr, t3);
>>
>
> Likewise.
thanks
>
>
> + tcg_gen_or_tl(t1, t1, t2); /* t1 = t1 | t2 | t3 */
>> + tcg_gen_or_tl(t1, t1, t3);
>>
>
> While this is exactly the formula in the manual, it's also equal to
>
> ((Rd ^ Rr) ^ R) & 16
>
Please explain. I don't see it.
http://www.wolframalpha.com/input/?i=A+and+B+or+A+and+not+C+or+B+and+not+C,+A+xor+B+xor+C
>
>
> where we examine the difference between the non-carry addition (Rd ^ Rr)
> and the carry addition (R) to find the carry out of bit 3. This reduces
> the operation count to 2 from 5.
>
> + tcg_gen_shri_tl(cpu_Cf, t1, 7); /* Cf = t1(7) */
>>
>
> Of course, Cf is also bit 8 of R, at least before you masked that off.
>
> + tcg_gen_shri_tl(cpu_Hf, t1, 3); /* Hf = t1(3) */
>> + tcg_gen_andi_tl(cpu_Hf, cpu_Hf, 1);
>>
>
> Consider leaving Hf in bit 3 (or 4, as above) instead of shifting and
> masking to bit 0. This makes it (slightly) more complicated to read the
> full SREG value, but it saves two instructions per arithmetic instruction.
> This will add up quickly.
>
yes you are right. next version.
>
> One can make the same argument for most of the other flags.
>
> Compare how this is handled in target-arm for cpu_[NZCV]F.
>
> +void gen_add_Vf(TCGv R, TCGv Rd, TCGv Rr)
>> +{
>> + TCGv t1 = tcg_temp_new_i32();
>> + TCGv t2 = tcg_temp_new_i32();
>> +
>> + tcg_gen_not_tl(t1, Rd); /* t1 = ~Rd & ~Rr & R */
>> + tcg_gen_not_tl(t2, Rr);
>> + tcg_gen_and_tl(t1, t1, t2);
>> + tcg_gen_and_tl(t1, t1, R);
>> +
>> + tcg_gen_not_tl(t2, R); /* t2 = Rd & Rr & ~R */
>> + tcg_gen_and_tl(t2, t2, Rd);
>> + tcg_gen_and_tl(t2, t2, Rr);
>> +
>> + tcg_gen_or_tl(t1, t1, t2); /* t1 = Rd & Rr & ~R | ~Rd & ~Rr &
>> R */
>>
>
> While this is indeed the formula in the manual, a better expression is
>
> (R ^ Rd) & ~(Rd ^ Rr)
>
right, thanks.
>
> which is 3 operations instead of 5. As alluded above, it might be best to
> keep Vf in bit 7 instead of bit 0.
>
> Also note that if you use bit 4 to compute Hf, as above, then one can
> share a sub-expression with the computation of Vf.
>
> +void gen_sub_CHf(TCGv R, TCGv Rd, TCGv Rr)
>> +{
>> + TCGv t1 = tcg_temp_new_i32();
>> + TCGv t2 = tcg_temp_new_i32();
>> + TCGv t3 = tcg_temp_new_i32();
>> +
>> + /* Cf & Hf */
>> + tcg_gen_not_tl(t1, Rd); /* t1 = ~Rd */
>> + tcg_gen_and_tl(t2, t1, Rr); /* t2 = ~Rd & Rr */
>> + tcg_gen_or_tl(t3, t1, Rr); /* t3 = (~Rd | Rr) & R */
>> + tcg_gen_and_tl(t3, t3, R);
>> + tcg_gen_or_tl(t2, t2, t3); /* t2 = ~Rd & Rr | ~Rd & R | R & Rr
>> */
>> + tcg_gen_shri_tl(cpu_Cf, t2, 7); /* Cf = t2(7) */
>> + tcg_gen_shri_tl(cpu_Hf, t2, 3); /* Hf = t2(3) */
>> + tcg_gen_andi_tl(cpu_Hf, cpu_Hf, 1);
>>
>
> Note that carry and borrow are related, and thus this is *also* computable
> via ((Rd ^ Rr) ^ R) on bit 4.
please explain, I don't see it
http://www.wolframalpha.com/input/?i=not+A+and+B+or+not+A+and+C+or++C+and+B,+A+xor+B+xor+C
>
>
> +void gen_sub_Vf(TCGv R, TCGv Rd, TCGv Rr)
>> +{
>> + TCGv t1 = tcg_temp_new_i32();
>> + TCGv t2 = tcg_temp_new_i32();
>> +
>> + /* Vf */
>> + tcg_gen_and_tl(t1, Rr, R); /* t1 = Rd & ~Rr & ~R */
>> + tcg_gen_not_tl(t1, t1);
>> + tcg_gen_and_tl(t1, t1, Rd);
>> + tcg_gen_not_tl(t2, Rd); /* t2 = ~Rd & Rr & R */
>> + tcg_gen_and_tl(t2, t2, Rr);
>> + tcg_gen_and_tl(t2, t2, R);
>> + tcg_gen_or_tl(t1, t1, t2); /* t1 = Rd & ~Rr & ~R | ~Rd & Rr &
>> R */
>> + tcg_gen_shri_tl(cpu_Vf, t1, 7); /* Vf = t1(7) */
>>
>
> This is equal to
>
> (R ^ Rd) & (Rd ^ Rr)
>
> +void gen_ZNSf(TCGv R)
>> +{
>> + tcg_gen_mov_tl(cpu_Zf, R); /* Zf = R */
>> + tcg_gen_shri_tl(cpu_Nf, R, 7); /* Nf = R(7) */
>> + tcg_gen_and_tl(cpu_Sf, cpu_Nf, cpu_Vf);/* Sf = Nf & Vf */
>>
>
> This should be xor.
thanks. fixed
>
>
> +void gen_push_ret(CPUAVRState *env, int ret)
>> +{
>> + tcg_gen_qemu_st8(tcg_const_local_i32((ret & 0x0000ff)), cpu_sp,
>> DATA_INDEX);
>>
>
> You don't need a local constant, and you should be freeing these 3 temps.
done.
>
>
> I'll also note that the piece-wise store is big-endian, so you could
> perform this in 1 store for 2_BYTE and 2 stores for 3_BYTE.
I got an impression that the platform is little endian.
>
>
> +void gen_jmp_ez()
>> +{
>> + tcg_gen_mov_tl(cpu_pc, cpu_eind);
>> + tcg_gen_shli_tl(cpu_pc, cpu_pc, 8);
>> + tcg_gen_or_tl(cpu_pc, cpu_pc, cpu_r[31]);
>> + tcg_gen_shli_tl(cpu_pc, cpu_pc, 8);
>> + tcg_gen_or_tl(cpu_pc, cpu_pc, cpu_r[30]);
>> + tcg_gen_andi_tl(cpu_pc, cpu_pc, 0xffffff);
>> + tcg_gen_exit_tb(0);
>> +}
>> +
>> +void gen_jmp_z()
>> +{
>> + tcg_gen_mov_tl(cpu_pc, cpu_r[31]);
>> + tcg_gen_shli_tl(cpu_pc, cpu_pc, 8);
>> + tcg_gen_or_tl(cpu_pc, cpu_pc, cpu_r[30]);
>> + tcg_gen_andi_tl(cpu_pc, cpu_pc, 0xffff);
>>
>
> Note this is
>
> tcg_gen_deposit_tl(cpu_pc, cpu_r[30], cpu_r[31], 8, 8);
>
> Also consider storing EIND (and perhaps already shifted so that you can
> just OR it in.
>
> +void gen_set_addr(TCGv addr, TCGv H, TCGv M, TCGv L)
>> +{
>> + tcg_gen_andi_tl(L, addr, 0xff);
>> +
>> + tcg_gen_shri_tl(addr, addr, 8);
>> + tcg_gen_andi_tl(M, addr, 0xff);
>> +
>> + tcg_gen_shri_tl(addr, addr, 8);
>> + tcg_gen_andi_tl(H, addr, 0xff);
>> +}
>> +
>> +void gen_set_xaddr(TCGv addr)
>> +{
>> + gen_set_addr(addr, cpu_rampX, cpu_r[27], cpu_r[26]);
>> +}
>> +
>> +void gen_set_yaddr(TCGv addr)
>> +{
>> + gen_set_addr(addr, cpu_rampY, cpu_r[29], cpu_r[28]);
>> +}
>> +
>> +void gen_set_zaddr(TCGv addr)
>> +{
>> + gen_set_addr(addr, cpu_rampZ, cpu_r[31], cpu_r[30]);
>> +}
>> +
>> +TCGv gen_get_addr(TCGv H, TCGv M, TCGv L)
>> +{
>> + TCGv addr= tcg_temp_new_i32();
>> +
>> + tcg_gen_mov_tl(addr, H); /* addr = H:M:L */
>> + tcg_gen_shli_tl(addr, addr, 8);
>> + tcg_gen_or_tl(addr, addr, M);
>> + tcg_gen_shli_tl(addr, addr, 8);
>> + tcg_gen_or_tl(addr, addr, L);
>> +
>> + return addr;
>> +}
>> +
>> +TCGv gen_get_xaddr()
>> +{
>> + return gen_get_addr(cpu_rampX, cpu_r[27], cpu_r[26]);
>> +}
>> +
>> +TCGv gen_get_yaddr()
>> +{
>> + return gen_get_addr(cpu_rampY, cpu_r[29], cpu_r[28]);
>> +}
>> +
>> +TCGv gen_get_zaddr()
>> +{
>> + return gen_get_addr(cpu_rampZ, cpu_r[31], cpu_r[30]);
>> +}
>>
>
> Wow. Um... Surely it would be better to store X and Y internally as whole
> 24-bit quantities, and Z as a 16-bit quantity (to be extended with rampz,
> rampd, or eind as needed).
>
rampX/Y/Z are now represented as 0x00ff0000 masks.
X/Y/Z could be represented as 16-bit registers; however, I do not know if and
when r26-r31 are used as 8-bit registers, so if X/Y/Z were represented as 16
bits it would be hard to use r26-r31 in arithmetic.
>
> This would allow normal loads, stores, and autoincrement to operate
> efficiently. When you see byte accesses to R[26-31], you extract/deposit
> the bytes as required. I assume that happens (significantly) less often
> than operating on X/Y/Z as a unit...
>
> One might make the same argument for R[24-25] as it pertains to adiw et al.
I don't think so; it would make simple arithmetic complex.
>
>
> +int sex(int Imm, unsigned bits)
>> +{
>> + Imm <<= 32 - bits;
>> + Imm >>= 32 - bits;
>> +
>> + return Imm;
>> +}
>>
>
> This is sextract32(imm, 0, bits).
done. thanks.
>
>
> +int avr_translate_ADIW(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + if (avr_feature(env, AVR_FEATURE_ADIW_SBIW) == false) {
>> + gen_helper_unsupported(cpu_env);
>> +
>> + return BS_EXCP;
>> + }
>> +
>> + TCGv RdL = cpu_r[24 + 2 * ADIW_Rd(opcode)];
>> + TCGv RdH = cpu_r[25 + 2 * ADIW_Rd(opcode)];
>> + int Imm = (ADIW_hImm(opcode) << 4) | (ADIW_lImm(opcode));
>> + TCGv R = tcg_temp_new_i32();
>> + TCGv Rd = tcg_temp_new_i32();
>> + TCGv t0 = tcg_temp_new_i32();
>> +
>> + /* op */
>> + tcg_gen_shli_tl(Rd, RdH, 8); /* R = ((H << 8) | L) + Imm */
>> + tcg_gen_or_tl(Rd, RdL, Rd);
>> + tcg_gen_addi_tl(R, Rd, Imm);
>> + tcg_gen_andi_tl(R, R, 0xffff);/* make it 16 bits */
>> +
>> + /* Cf */
>> + tcg_gen_not_tl(t0, R); /* t0 = Rd & ~R */
>> + tcg_gen_and_tl(t0, Rd, t0);
>> + tcg_gen_shri_tl(cpu_Cf, t0, 15); /* Cf = t0(15) */
>>
>
> Or simply bit 16.
>
> + /* Sf */
>> + tcg_gen_and_tl(cpu_Sf, cpu_Nf, cpu_Vf);/* Sf = Nf & Vf */
>>
>
> xor.
fixed.
>
>
> +int avr_translate_ASR(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + TCGv Rd = cpu_r[ASR_Rd(opcode)];
>> + TCGv t1 = tcg_temp_new_i32();
>> + TCGv t2 = tcg_temp_new_i32();
>> +
>> + /* op */
>> + tcg_gen_andi_tl(t1, Rd, 0x80); /* t1 = (Rd & 0x80) | (Rd >>
>> 1) */
>> + tcg_gen_shri_tl(t2, Rd, 1);
>> + tcg_gen_or_tl(t1, t1, t2);
>> +
>> + /* Cf */
>> + tcg_gen_andi_tl(cpu_Cf, Rd, 1); /* Cf = Rd(0) */
>> +
>> + /* Vf */
>> + tcg_gen_and_tl(cpu_Vf, cpu_Nf, cpu_Cf);/* Vf = Nf & Cf */
>>
>
> xor.
>
> +/*
>> + Copies the T Flag in the SREG (Status Register) to bit b in register
>> Rd.
>> +*/
>> +int avr_translate_BLD(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + TCGv Rd = cpu_r[BLD_Rd(opcode)];
>> + TCGv t1 = tcg_temp_new_i32();
>> +
>> + tcg_gen_shli_tl(t1, cpu_Tf, BLD_Bit(opcode));
>> + tcg_gen_or_tl(Rd, Rd, t1);
>> + tcg_gen_and_tl(Rd, Rd, t1);
>>
>
> Err... extra and? I'm pretty sure this insn simply sets bit B, and
> doesn't clear all of the others.
it was a bug. fixed.
>
>
> +/*
>> + Clears the specified bits in register Rd. Performs the logical AND
>> between the contents of register Rd and the
>> + complement of the constant mask K. The result will be placed in
>> register Rd.
>> +*/
>> +int avr_translate_COM(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + TCGv Rd = cpu_r[COM_Rd(opcode)];
>> + TCGv R = tcg_temp_new_i32();
>> +
>> + tcg_gen_movi_tl(R, 0xff);
>> + tcg_gen_sub_tl(Rd, R, Rd);
>>
>
> Better as xor with 0xff.
>
done
>
> +/*
>> + This instruction performs a compare between two registers Rd and Rr,
>> and skips the next instruction if Rd = Rr.
>> +*/
>> +int avr_translate_CPSE(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + TCGv Rd = cpu_r[CPSE_Rd(opcode)];
>> + TCGv Rr = cpu_r[(CPSE_hRr(opcode) << 4) |
>> (CPSE_lRr(opcode))];
>> + TCGLabel *skip= gen_new_label();
>> +
>> + tcg_gen_movi_tl(cpu_pc, ctx->inst[1].npc); /* PC if next inst
>> is skipped */
>> + tcg_gen_brcond_i32(TCG_COND_EQ, Rd, Rr, skip);
>> + tcg_gen_movi_tl(cpu_pc, ctx->inst[0].npc); /* PC if next inst
>> is not skipped */
>> + gen_set_label(skip);
>>
>
> Any reason you're not using goto_tb?
I think this place can be optimized by not exiting the TB on skip. I will do
it in the next version.
>
>
> +/*
>> + Subtracts one -1- from the contents of register Rd and places the
>> result in the destination register Rd.
>> + The C Flag in SREG is not affected by the operation, thus allowing
>> the DEC instruction to be used on a loop
>> + counter in multiple-precision computations.
>> + When operating on unsigned values, only BREQ and BRNE branches can
>> be expected to perform consistently.
>> + When operating on two’s complement values, all signed branches are
>> available.
>> +*/
>> +int avr_translate_DEC(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + TCGv Rd = cpu_r[DEC_Rd(opcode)];
>> +
>> + tcg_gen_subi_tl(Rd, Rd, 1); /* Rd = Rd - 1 */
>> + tcg_gen_andi_tl(Rd, Rd, 0xff); /* make it 8 bits */
>> +
>> + tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Vf, Rd, 0x7f); /* cpu_Vf =
>> Rd == 0x7f */
>>
>
> This is INC overflow.
please explain, I don't see a problem here
>
>
> +int avr_translate_INC(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + TCGv Rd = cpu_r[INC_Rd(opcode)];
>> +
>> + tcg_gen_addi_tl(Rd, Rd, 1);
>> + tcg_gen_andi_tl(Rd, Rd, 0xff);
>> +
>> + tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Vf, Rd, 0x80); /* cpu_Vf =
>> Rd == 0x80 */
>>
>
> This is DEC overflow.
please explain, I don't see a problem here
>
> +/*
>> + Load one byte indirect from data space to register and stores and
>> clears the bits in data space specified by the
>> + register. The instruction can only be used towards internal SRAM.
>> + The data location is pointed to by the Z (16 bits) Pointer Register
>> in the Register File. Memory access is limited
>> + to the current data segment of 64KB. To access another data segment
>> in devices with more than 64KB data
>> + space, the RAMPZ in register in the I/O area has to be changed.
>> + The Z-pointer Register is left unchanged by the operation. This
>> instruction is especially suited for clearing status
>> + bits stored in SRAM.
>> +*/
>> +int avr_translate_LAC(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + if (avr_feature(env, AVR_FEATURE_RMW) == false) {
>> + gen_helper_unsupported(cpu_env);
>> +
>> + return BS_EXCP;
>> + }
>> +
>> + TCGv Rr = cpu_r[LAC_Rr(opcode)];
>> + TCGv addr= gen_get_zaddr();
>> + TCGv t0 = tcg_temp_new_i32();
>> + TCGv t1 = tcg_temp_new_i32();
>> +
>> + tcg_gen_qemu_ld8u(
>> + t0, addr, DATA_INDEX); /* t0 = mem[addr] */
>> + tcg_gen_movi_tl(t1, 0xff); /* t1 = t0 & (0xff
>> - Rr) */
>> + tcg_gen_sub_tl(t1, t1, Rr);
>> + tcg_gen_and_tl(t1, t0, t1);
>>
>
> These last 3 are
>
fixed
>
> tcg_gen_andc_tl(t1, t0, Rr);
>
> +/*
>> + This instruction performs 8-bit x 8-bit -> 16-bit signed
>> multiplication.
>> +*/
>> +int avr_translate_MULS(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + if (avr_feature(env, AVR_FEATURE_MUL) == false) {
>> + gen_helper_unsupported(cpu_env);
>> +
>> + return BS_EXCP;
>> + }
>> +
>> + TCGv R0 = cpu_r[0];
>> + TCGv R1 = cpu_r[1];
>> + TCGv Rd = cpu_r[16 + MULS_Rd(opcode)];
>> + TCGv Rr = cpu_r[16 + MULS_Rr(opcode)];
>> + TCGv R = tcg_temp_new_i32();
>> +
>> + tcg_gen_ext8s_tl(Rd, Rd); /* make Rd full 32 bit signed
>> */
>> + tcg_gen_ext8s_tl(Rr, Rr); /* make Rr full 32 bit signed
>> */
>>
>
> You need to either zero-extend these again afterward, or you need to
> perform the extensions into a temporary.
oops
>
>
> +/*
>> + Replaces the contents of register Rd with its two’s complement; the
>> value $80 is left unchanged.
>> +*/
>> +int avr_translate_NEG(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + TCGv Rd = cpu_r[NEG_Rd(opcode)];
>> + TCGv R = tcg_temp_new_i32();
>> +
>> + tcg_gen_neg_tl(R, Rd);
>> +
>> + tcg_gen_setcondi_tl(TCG_COND_NE, cpu_Cf, R, 0x00); /* Cf = R !=
>> 0x00 */
>> + tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Vf, R, 0x80); /* Vf = R ==
>> 0x80 */
>>
>
> Correct, but surely it's better to re-use the normal subtraction
> infrastructure. In particular, setcond is at minimum two operations. The
> tcg optimizer should do approximately as well folding away the zero from
> the subtract.
>
> + tcg_gen_not_tl(cpu_Hf, Rd); /* Hf =
>> (~Rd | R)(3) */
>> + tcg_gen_or_tl(cpu_Hf, cpu_Hf, R);
>> + tcg_gen_shri_tl(cpu_Hf, cpu_Hf, 3);
>> + tcg_gen_andi_tl(cpu_Hf, cpu_Hf, 1);
>> + gen_ZNSf(R);
>> +
>> + tcg_temp_free_i32(R);
>>
>
> You failed to write back the result.
>
done.
>
> +int avr_translate_XCH(CPUAVRState *env, DisasContext *ctx, uint32_t
>> opcode)
>> +{
>> + if (avr_feature(env, AVR_FEATURE_RMW) == false) {
>> + gen_helper_unsupported(cpu_env);
>> +
>> + return BS_EXCP;
>> + }
>> +
>> + TCGv Rd = cpu_r[XCH_Rd(opcode)];
>> + TCGv t0 = tcg_temp_new_i32();
>> + TCGv addr = gen_get_zaddr();
>> +
>> + tcg_gen_qemu_ld8u(
>> + addr, t0, DATA_INDEX);
>> + tcg_gen_qemu_st8(
>> + addr, Rd, DATA_INDEX);
>>
>
> The address and data temporaries are swapped.
>
fixed.
>
> r~
>
--
Best Regards,
Michael Rolnik