Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translatio

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translatio

From:	Michael Rolnik
Subject:	Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translation
Date:	Mon, 6 Jun 2016 09:52:44 +0300

On Mon, Jun 6, 2016 at 2:34 AM, Richard Henderson <address@hidden> wrote:

> On 06/05/2016 02:47 PM, Michael Rolnik wrote:
>
>>     Is there a reason this code isn't going into translate.c?
>>     You wouldn't need the declarations in translate-inst.h or translate.h.
>>
>> I see here two levels of logic
>> a. instruction translation
>> b. general flow of program translation.
>>
>
> FWIW, static functions can be automatically inlined by the compiler,
> whereas external function calls can't.  In theory, the compiler could
> auto-inline the entire translator into a single function.

these functions are called through pointer so there is no way for these
functions to be inlined.

>
>
>     Order these functions properly and you don't need forward declarations.
>>
>> is it a requirements? this way it look cleaner.
>>
>
> Does it?  In my experience it just means you've got to edit two places
> when one changes things.

It still one place because it's either instruction translation or a program
translation.
from my point of view translate.c will have almost no bugs in very close
feature whereas translate-inst.c will hove bug fixes and modification.
having it in two files will isolate translate.c from erroneous
modifications.

>
>
>     While this is exactly the formula in the manual, it's also equal to
>>
>>         ((Rd ^ Rr) ^ R) & 16
>>
>> Please explain. I don't see it.
>>
>>
>> http://www.wolframalpha.com/input/?i=A+and+B+or+A+and+not+C+or+B+and+not+C,+A+xor+B+xor+C
>>
>
> I did explain:

truth table shows that these computations are different.

>
>
>     where we examine the difference between the non-carry addition (Rd ^
>> Rr)
>>     and the carry addition (R) to find the carry out of bit 3.  This
>> reduces
>>     the operation count to 2 from 5.
>>
>
> It's not a manipulation of the original expression, but a different way of
> looking at the problem.
>
> You want to compute carry-out of bit 3.  Given the *result* of the
> addition, it's easier to examine bit 4, into which carry-in has happened,
> rather than examine bit 3 and re-compute the carry-out.
>
> The AVR hardware probably computes it exactly as described in the manual,
> because that can be done in parallel with the addition, and has a lower
> total gate latency.  This is fairly common in the industry, where the
> documentation follows the implementation more closely than it perhaps
> should.

you can't look onto 4th bit because 4th bits in the input were not 0s.

>
>
>     Note that carry and borrow are related, and thus this is *also*
>> computable
>>     via ((Rd ^ Rr) ^ R) on bit 4.
>>
>> please explain, I don't see it
>>
>> http://www.wolframalpha.com/input/?i=not+A+and+B+or+not+A+and+C+or++C+and+B,+A+xor+B+xor+C
>>
>
> As above, given the *result* of the subtraction, examining bit 4 into
> which borrow-in has happened.
>
> Once you accept that, you'll note that the same expression can be used to
> re-create both carry-in and borrow-in.
>
>     I'll also note that the piece-wise store is big-endian, so you could
>>     perform this in 1 store for 2_BYTE and 2 stores for 3_BYTE.
>>
>> I got an expression that the platform is little endian.
>>
>
> Then you've got the order of the stores wrong.  Your code pushes the LSB
> before pushing the MSB, which results in the MSB at the lower address,
> which means big-endian.

this is right. However as far as I understand AVR is neither little nor big
endian because there it's 8 bit architecture (see here
http://www.avrfreaks.net/forum/endian-issue). for time being I defined the
platform to be little endian with ret address exception

>
>
>     Wow.  Um... Surely it would be better to store X and Y internally as
>> whole
>>     24-bit quantities, and Z as a 16-bit quantity (to be extended with
>> rampz,
>>     rampd, or eind as needed).
>>
>> rampX/Y/Z are represented now as 0x00ff0000.
>> X/Y/Z can be represented as 16 bits registers, however I do not know if
>> and
>> when r26-r31 are used as 8 bits, so if X/Y/Z are represented as 16 bits it
>> would be hard to use r26-r31 in arithmetics
>>
>
> You would use a setup like the following, and use these functions instead
> of other direct accesses to the cpu registers.  This setup requires similar
> functions in cpu.h for use by e.g. gdbstub.c.
>
I will have to think about it.

>
>
> TCGv cpu_rb[24];
> TCGv cpu_rw[4];
>
> TCGv read_byte(unsigned rb)
> {
>     TCGv byte = tcg_temp_new();
>     if (rb < 24) {
>         tcg_gen_mov_tl(byte, cpu_rb[rb]);
>     } else {
>         unsigned rw = (rb - 24) / 2;
>         if (rb & 1) {
>             tcg_gen_shri_tl(byte, cpu_rw[rw]);
>         } else {
>             tcg_gen_ext8u_tl(byte, cpu_rw[rw]);
>         }
>     }
>     return byte;
> }
>
> void write_byte(unsigned rb, TCGv val)
> {
>     if (rb < 24) {
>         tcg_gen_mov_tl(cpu_rb[rb], val);
>     } else {
>         unsigned rw = (rb - 24) / 2;
>         tcg_gen_deposit_tl(cpu_rw[rw], cpu_rw[rw], val, (rb & 1) * 8, 8);
>     }
> }
>
> /* Return RB+1:RB.  */
> TCGv read_word(unsigned rb)
> {
>     TCGv word = tcg_temp_new();
>     if (rb < 24) {
>         tcg_gen_deposit_tl(word, cpu_rb[rb], cpu_rb[rb + 1], 8, 8);
>     } else {
>         unsigned rw = (rb - 24) / 2;
>         tcg_gen_mov_tl(word, cpu_rw[rw]);
>     }
>     return word;
> }
>
> void write_word(unsigned rb, TCGv val)
> {
>     if (rb < 24) {
>         tcg_gen_ext8u_tl(cpu_rb[rb], val);
>         tcg_gen_shri_tl(cpu_rb[rb + 1], val, 8);
>     } else {
>         unsigned rw = (rb - 24) / 2;
>         tcg_gen_mov_tl(cpu_rw[rw], val);
>     }
> }
>
>         +int    avr_translate_DEC(CPUAVRState *env, DisasContext *ctx,
>> uint32_t
>>
> ...
>
>>         +    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Vf, Rd, 0x7f);  /*
>> cpu_Vf   =
>>         Rd == 0x7f  */
>>
>>
>>     This is INC overflow.
>>
>> please explain, I don't see a problem here
>>
>
> You have swapped the overflow conditions for INC and DEC.
>
>     127 + 1 -> -128
>     -128 - 1 -> 127

this is how it's defined in the document.

>
>
>
> r~
>



-- 
Best Regards,
Michael Rolnik

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCH 04/10] target-avr: adding instructions encodings, (continued)

Prev by Date: Re: [Qemu-devel] [PATCH qemu v17 10/12] vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)
Next by Date: Re: [Qemu-devel] [RFC 14/14] Add virtio-gpu vhost-user backend
Previous by thread: Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translation
Next by thread: Re: [Qemu-devel] [PATCH 08/10] target-avr: adding instruction translation
Index(es):
- Date
- Thread