Re: QEMU for Qualcomm Hexagon - KVM Forum talk and code available


From: Alex Bennée
Subject: Re: QEMU for Qualcomm Hexagon - KVM Forum talk and code available
Date: Wed, 13 Nov 2019 10:31:16 +0000
User-agent: mu4e 1.3.5; emacs 27.0.50

Taylor Simpson <address@hidden> writes:

> I had discussions with several people at the KVM Forum, and I’ve been 
> thinking about how to divide up the code for community review.  Here is my 
> proposal for the steps.
>
>   1.  linux-user changes + linux-user/hexagon + skeleton of target/hexagon
>       This is the minimum amount to build and run a very simple program.  I
>       have an assembly program that prints “Hello” and exits.  It is
>       constructed to use very few instructions that can be added brute
>       force in the Hexagon back end.

I'm hoping most of the linux-user changes are in the hexagon runloop?
There has been quite a bit of work splitting up and cleaning up the
#ifdef mess in linux-user over the last few years.

>   2.  Add the code that is imported from the Hexagon simulator and the qemu
>       helper generator
>       This will allow the scalar ISA to be executed.  This will grow the set
>       of programs that could execute, but there will still be limitations.
>       In particular, there can be no packets, which means the C library won’t
>       work.  We have to build with -nostdlib.

You could run -nostdlib system TCG tests (hello and memory), but that
would require modelling some sort of hardware and assumes you have a
simple serial port or semihosting solution. That said, a bunch of the
MIPS tests are linux-user and -nostdlib, so that isn't a major problem in
getting some of the tests running.

When you say code imported from the hexagon simulator I was under the
impression you were generating code from the instruction description.
Otherwise you'll need to be very clear about your licensing grants.

>   3.  Add support for packet semantics
>       At this point, we will be able to execute full programs linked with
>       the C library.  This will include the check-tcg tests.

I think the interesting question is whether the roll-back semantics of
Hexagon are something we might need for other emulated architectures or
are a solution specific to Hexagon (I'm guessing the latter).

>   4.  Add support for the wide vector extensions
>   5.  Add the helper overrides for performance optimization
>       Some of these will be written by hand, and we’ll work with rev.ng to
>       integrate their flex/bison generator.

One thing to nail down is whether we include the generated code in the
source tree with a tool to regenerate it (much like we do for
linux-headers) or add the dependency and regenerate from scratch each
time. I don't see including flex/bison as a dependency being a major
issue (in fact we have it in our docker images, so I guess something
uses it). However, it might be trickier depending on libclang, which was
also being discussed.

>
> I would love some feedback on this proposal.  Hopefully, that is enough 
> detail so that people can comment.  If anything isn’t clear, please ask 
> questions.
>
>
> Thanks,
> Taylor
>
>
> From: Qemu-devel <qemu-devel-bounces+tsimpson=address@hidden> On Behalf Of 
> Taylor Simpson
> Sent: Tuesday, November 5, 2019 10:33 AM
> To: Aleksandar Markovic <address@hidden>
> Cc: Alessandro Di Federico <address@hidden>; address@hidden; address@hidden; 
> Niccolò Izzo <address@hidden>
> Subject: RE: QEMU for Qualcomm Hexagon - KVM Forum talk and code available
>
> Hi Aleksandar,
>
> Thank you – We’re glad you enjoyed the talk.
>
> One point of clarification on SIMD in Hexagon.  What we refer to as the 
> “scalar” core does have some SIMD operations.  Register pairs are 8 bytes, 
> and there are several SIMD instructions.  The example we showed in the talk 
> included a VADDH instruction.  It treats the register pair as 4 half-words 
> and does a vector add.  Then there are the Hexagon Vector eXtensions (HVX) 
> instructions that operate on 128-byte vectors.  There is a wide variety of 
> instructions in this set.  As you mentioned, some of them are pure SIMD and 
> others are very complex.
>
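As a plain-C illustration of the "register pair as 4 half-words" description
above, a reference model might look like the sketch below. The function name
vaddh_ref is made up for this example, and it assumes the wrapping
(non-saturating) form of the add; it is not code from the Hexagon tree.

```c
#include <stdint.h>

/*
 * Hypothetical reference model: treat each 64-bit register pair as four
 * 16-bit lanes and add lane by lane. Each lane wraps on overflow, so a
 * carry never spills into the neighbouring lane.
 */
static uint64_t vaddh_ref(uint64_t rss, uint64_t rtt)
{
    uint64_t rdd = 0;

    for (int i = 0; i < 4; i++) {
        uint16_t a = (uint16_t)(rss >> (16 * i));
        uint16_t b = (uint16_t)(rtt >> (16 * i));
        uint16_t sum = (uint16_t)(a + b);   /* truncate to 16 bits */

        rdd |= (uint64_t)sum << (16 * i);
    }
    return rdd;
}
```

For example, adding 0xFFFF000000000001 and 0x0001000000000001 gives
0x0000000000000002: the top lane wraps to zero without disturbing the others.
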
> For the helper generator, the vast majority of these are implemented with 
> helpers.  There are only 2 vector instructions in the scalar core that have a 
> TCG override, and all of the HVX instructions are implemented with helpers.  
> If you are interested in a deeper dive, see below.
>
> Alessandro and Niccolo can comment on the flex/bison implementation.
>
> Thanks,
> Taylor
>
>
> Now for the deeper dive in case anyone is interested.  Look at the genptr.c 
> file in target/hexagon.
>
> The first vector instruction with an override is A6_vminub_RdP.  It does a
> byte-wise comparison of two register pairs, writes the smaller byte of each
> pair to the destination, and sets a predicate register indicating whether
> the byte in the left or right operand is greater.  Here is the TCG code.
> #define fWRAP_A6_vminub_RdP(GENHLPR, SHORTCODE) \
> { \
>     TCGv BYTE = tcg_temp_new(); \
>     TCGv left = tcg_temp_new(); \
>     TCGv right = tcg_temp_new(); \
>     TCGv tmp = tcg_temp_new(); \
>     int i; \
>     tcg_gen_movi_tl(PeV, 0); \
>     tcg_gen_movi_i64(RddV, 0); \
>     for (i = 0; i < 8; i++) { \
>         fGETUBYTE(i, RttV); \
>         tcg_gen_mov_tl(left, BYTE); \
>         fGETUBYTE(i, RssV); \
>         tcg_gen_mov_tl(right, BYTE); \
>         tcg_gen_setcond_tl(TCG_COND_GT, tmp, left, right); \
>         fSETBIT(i, PeV, tmp); \
>         fMIN(tmp, left, right); \
>         fSETBYTE(i, RddV, tmp); \
>     } \
>     tcg_temp_free(BYTE); \
>     tcg_temp_free(left); \
>     tcg_temp_free(right); \
>     tcg_temp_free(tmp); \
> }
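
For readers who want to check the macro above against something simpler, here
is a rough host-side C model of what it appears to compute. The names
vminub_ref and struct vminub_result are invented for this sketch; the operand
order (left from Rtt, right from Rss) is read off the macro, not from the
architecture manual.

```c
#include <stdint.h>

/* Hypothetical result pair: the minimum bytes plus the predicate bits. */
struct vminub_result {
    uint64_t rdd;  /* per-byte unsigned minimum */
    uint8_t pe;    /* bit i set when byte i of rtt is greater than rss */
};

static struct vminub_result vminub_ref(uint64_t rtt, uint64_t rss)
{
    struct vminub_result r = { 0, 0 };

    for (int i = 0; i < 8; i++) {
        uint8_t left = (uint8_t)(rtt >> (8 * i));
        uint8_t right = (uint8_t)(rss >> (8 * i));
        uint8_t min = left < right ? left : right;

        if (left > right) {
            r.pe |= (uint8_t)(1u << i);   /* mirrors setcond GT + fSETBIT */
        }
        r.rdd |= (uint64_t)min << (8 * i); /* mirrors fMIN + fSETBYTE */
    }
    return r;
}
```

A model like this could serve as an oracle when comparing the TCG override
against the helper, assuming the helper is expected to produce the same two
outputs.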
>
> The second instruction is S2_vsplatrb.  It takes the byte from the operand 
> and replicates it 4 times into the destination register.  Here is the TCG 
> code.
> #define fWRAP_S2_vsplatrb(GENHLPR, SHORTCODE) \
> { \
>     TCGv tmp = tcg_temp_new(); \
>     int i; \
>     tcg_gen_movi_tl(RdV, 0); \
>     tcg_gen_andi_tl(tmp, RsV, 0xff); \
>     for (i = 0; i < 4; i++) { \
>         tcg_gen_shli_tl(RdV, RdV, 8); \
>         tcg_gen_or_tl(RdV, RdV, tmp); \
>     } \
>     tcg_temp_free(tmp); \
> }
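
The same splat can be modelled in plain C, which also makes it easy to see
that the four-iteration shift/or loop is equivalent to a single multiply.
Both function names below are made up for this sketch and neither is code
from the tree.

```c
#include <stdint.h>

/* Mirrors the macro: shift the result left by 8 and OR in the byte, 4 times. */
static uint32_t vsplatrb_loop(uint32_t rs)
{
    uint32_t byte = rs & 0xff;
    uint32_t rd = 0;

    for (int i = 0; i < 4; i++) {
        rd = (rd << 8) | byte;
    }
    return rd;
}

/* Equivalent closed form: multiplying by 0x01010101 copies the byte 4 times. */
static uint32_t vsplatrb_mul(uint32_t rs)
{
    return (rs & 0xff) * 0x01010101u;
}
```

Both return 0x78787878 for an input of 0x12345678. Whether a multiply-based
TCG sequence would actually be faster than the unrolled loop is untested.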
>
>
> From: Aleksandar Markovic <address@hidden>
> Sent: Monday, November 4, 2019 6:05 PM
> To: Taylor Simpson <address@hidden>
> Cc: address@hidden; Alessandro Di Federico <address@hidden>; address@hidden;
> Niccolò Izzo <address@hidden>
> Subject: Re: QEMU for Qualcomm Hexagon - KVM Forum talk and code available
>
>
> On Friday, October 25, 2019, Taylor Simpson <address@hidden> wrote:
> We would like to inform you that we will be doing a talk at the KVM Forum 
> next week on QEMU for Qualcomm Hexagon.  Alessandro Di Federico, Niccolo 
> Izzo, and I have been working independently on implementations of the Hexagon 
> target.  We plan to merge the implementations, have a community review, and 
> ultimately have Hexagon be an official target in QEMU.  Our code is available 
> at the links below.
> https://github.com/revng/qemu-hexagon
> https://github.com/quic/qemu
> If anyone has any feedback on the code as it stands today or guidance on how 
> best to prepare it for review, please let us know.
>
>
> Hi, Taylor, Niccolo (and Alessandro too).
>
> I didn't have a chance to take a look at either the code or the docs, but I 
> did attend your presentation at KVM Forum, and I found it superb and 
> attractive, one of the best at the conference, if not the very best.
>
> I just have a couple of general questions:
>
> - Regarding the code you plan to upstream, are all SIMD instructions 
> implemented via the TCG API, or are some of them still implemented using 
> helpers?
>
> - Most SIMD instructions can be viewed simply as several parallel 
> elementary operations. However, for a given SIMD instruction set, usually not 
> all of them fit into this pattern. For example, "horizontal add" (adding data 
> elements from the same SIMD register), various "pack/unpack/interleave/merge" 
> operations, and more general "shuffle/permute" operations as well (here I am 
> not sure which of these are included in the Hexagon SIMD set, but there must 
> be some). How did you deal with them?
>
> - What were the most challenging Hexagon SIMD instructions you came across 
> while developing your solution?
>
> Sincerely,
> Aleksandar
>
>
>
>
> Thanks,
> Taylor


--
Alex Bennée
