qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translatio


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translation
Date: Sun, 5 Jun 2016 16:34:28 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0

On 06/05/2016 02:47 PM, Michael Rolnik wrote:
    Is there a reason this code isn't going into translate.c?
    You wouldn't need the declarations in translate-inst.h or translate.h.

I see here two levels of logic
a. instruction translation
b. general flow of program translation.

FWIW, static functions can be automatically inlined by the compiler, whereas external function calls can't. In theory, the compiler could auto-inline the entire translator into a single function.

    Order these functions properly and you don't need forward declarations.

is it a requirements? this way it look cleaner.

Does it? In my experience it just means you've got to edit two places when one changes things.

    While this is exactly the formula in the manual, it's also equal to

        ((Rd ^ Rr) ^ R) & 16

Please explain. I don't see it.

http://www.wolframalpha.com/input/?i=A+and+B+or+A+and+not+C+or+B+and+not+C,+A+xor+B+xor+C

I did explain:

    where we examine the difference between the non-carry addition (Rd ^ Rr)
    and the carry addition (R) to find the carry out of bit 3.  This reduces
    the operation count to 2 from 5.

It's not a manipulation of the original expression, but a different way of looking at the problem.

You want to compute carry-out of bit 3. Given the *result* of the addition, it's easier to examine bit 4, into which carry-in has happened, rather than examine bit 3 and re-compute the carry-out.

The AVR hardware probably computes it exactly as described in the manual, because that can be done in parallel with the addition, and has a lower total gate latency. This is fairly common in the industry, where the documentation follows the implementation more closely than it perhaps should.

    Note that carry and borrow are related, and thus this is *also* computable
    via ((Rd ^ Rr) ^ R) on bit 4.

please explain, I don't see it
http://www.wolframalpha.com/input/?i=not+A+and+B+or+not+A+and+C+or++C+and+B,+A+xor+B+xor+C

As above, given the *result* of the subtraction, examining bit 4 into which borrow-in has happened.

Once you accept that, you'll note that the same expression can be used to re-create both carry-in and borrow-in.

    I'll also note that the piece-wise store is big-endian, so you could
    perform this in 1 store for 2_BYTE and 2 stores for 3_BYTE.

I got an expression that the platform is little endian.

Then you've got the order of the stores wrong. Your code pushes the LSB before pushing the MSB, which results in the MSB at the lower address, which means big-endian.

    Wow.  Um... Surely it would be better to store X and Y internally as whole
    24-bit quantities, and Z as a 16-bit quantity (to be extended with rampz,
    rampd, or eind as needed).

rampX/Y/Z are represented now as 0x00ff0000.
X/Y/Z can be represented as 16 bits registers, however I do not know if and
when r26-r31 are used as 8 bits, so if X/Y/Z are represented as 16 bits it
would be hard to use r26-r31 in arithmetics

You would use a setup like the following, and use these functions instead of other direct accesses to the cpu registers. This setup requires similar functions in cpu.h for use by e.g. gdbstub.c.


TCGv cpu_rb[24];
TCGv cpu_rw[4];

TCGv read_byte(unsigned rb)
{
    TCGv byte = tcg_temp_new();
    if (rb < 24) {
        tcg_gen_mov_tl(byte, cpu_rb[rb]);
    } else {
        unsigned rw = (rb - 24) / 2;
        if (rb & 1) {
            tcg_gen_shri_tl(byte, cpu_rw[rw]);
        } else {
            tcg_gen_ext8u_tl(byte, cpu_rw[rw]);
        }
    }
    return byte;
}

void write_byte(unsigned rb, TCGv val)
{
    if (rb < 24) {
        tcg_gen_mov_tl(cpu_rb[rb], val);
    } else {
        unsigned rw = (rb - 24) / 2;
        tcg_gen_deposit_tl(cpu_rw[rw], cpu_rw[rw], val, (rb & 1) * 8, 8);
    }
}

/* Return RB+1:RB.  */
TCGv read_word(unsigned rb)
{
    TCGv word = tcg_temp_new();
    if (rb < 24) {
        tcg_gen_deposit_tl(word, cpu_rb[rb], cpu_rb[rb + 1], 8, 8);
    } else {
        unsigned rw = (rb - 24) / 2;
        tcg_gen_mov_tl(word, cpu_rw[rw]);
    }
    return word;
}

void write_word(unsigned rb, TCGv val)
{
    if (rb < 24) {
        tcg_gen_ext8u_tl(cpu_rb[rb], val);
        tcg_gen_shri_tl(cpu_rb[rb + 1], val, 8);
    } else {
        unsigned rw = (rb - 24) / 2;
        tcg_gen_mov_tl(cpu_rw[rw], val);
    }
}

        +int    avr_translate_DEC(CPUAVRState *env, DisasContext *ctx, uint32_t
...
        +    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Vf, Rd, 0x7f);  /* cpu_Vf   =
        Rd == 0x7f  */


    This is INC overflow.

please explain, I don't see a problem here

You have swapped the overflow conditions for INC and DEC.

    127 + 1 -> -128
    -128 - 1 -> 127


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]