avr-gcc-list

Re: [avr-gcc-list] >4.5.1 better than this at register-structure (xmega)


From: David Brown
Subject: Re: [avr-gcc-list] >4.5.1 better than this at register-structure (xmega) access?
Date: Thu, 11 Oct 2012 09:15:52 +0200
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20120907 Thunderbird/15.0.1

On 10/10/2012 23:13, Erik Walthinsen wrote:
> On 10/10/2012 01:43 PM, Georg-Johann Lay wrote:
>> Such a bug has never been reported to the GCC bug tracker.
>
> Well then, I guess I need to file one.  I just have no idea how to
> actually fix it, and no time to do so at this point ;-(
>
>> If nobody found it important enough to file a problem report for over 3
>> years (4.5 was released early 2009) I'd guess this is simply not an
>> important issue?
>
> It's not a bug per se but a missing optimization, so it's entirely
> possible nobody's cared enough to check.  I'm a) doing some extremely
> time-critical ISR routines, and b) trying to actually make use of the
> struct convenience.


You should file a "missed optimisation" bug on the issue - then it will not be forgotten or repeated. Missed optimisation bugs don't get as high priority as wrong code bugs, or adding support for new devices, but they /do/ get worked on whenever someone has the time (and can figure out a way to improve the issue). In particular, when there are other changes in the code in related areas, developers look through the list of outstanding bugs to see if there are other issues that could be fixed at the same time.

In this case, the structure Z+offset access is the most space-efficient if you are making three or more accesses to the same structure, and for two accesses it takes the same space as using direct access. However, using Z+offset is not faster than direct access, because Z has to be loaded first.
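For context, the kind of source under discussion looks roughly like the sketch below: several accesses to members of one register structure whose address is known at link time. The type `demo_periph_t`, the object `demo_periph`, and the values written are all made up for illustration; on a real XMega the structure would be one of the peripheral structs from the device header, mapped at a fixed address.

```c
#include <stdint.h>

/* Hypothetical register block, for illustration only. */
typedef struct {
    volatile uint8_t data;    /* offset 0 */
    volatile uint8_t status;  /* offset 1 */
    volatile uint8_t ctrla;   /* offset 2 */
    volatile uint8_t ctrlb;   /* offset 3 */
} demo_periph_t;

/* Stand-in for a memory-mapped peripheral at a fixed address. */
demo_periph_t demo_periph;

/* Three accesses to the same structure. A compiler can either emit
   three direct stores to absolute addresses, or load a pointer
   register once and use three pointer+offset stores. */
void demo_periph_init(void)
{
    demo_periph.ctrla  = 0x10;
    demo_periph.ctrlb  = 0x18;
    demo_periph.status = 0x40;
}
```

With three stores, the pointer+offset form is the one that should come out smaller, and the direct form the one that should come out faster.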

The exception here is if it is possible to use Z or Z+ modes, which are single-cycle on the XMega.

So ideally, the compiler should use direct access when there are two or fewer accesses to the same structure, and Z+q for three or more accesses (in -Os) or stick with direct access (in -O2). If it does use Z+q, then Z should be loaded with the lowest accessed address to allow Z mode for a slight speed gain. And multiple adjacent accesses can be done using Z+ for faster access.
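The trade-off can be put into numbers with a small cost model. This is only a sketch: the per-instruction sizes and cycle counts are my assumptions (LDS/STS at 2 words and 2 cycles, two LDI instructions to load Z, LDD/STD Z+q at 1 word and 2 cycles, and LD/ST with Z or Z+ at 1 word and 1 cycle on the XMega), so check them against the instruction set manual for your device. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed costs, not taken from the thread; verify for your device. */
enum {
    DIRECT_WORDS = 2, DIRECT_CYCLES = 2,  /* LDS / STS                 */
    LOAD_Z_WORDS = 2, LOAD_Z_CYCLES = 2,  /* two LDI to set up ZL, ZH  */
    ZQ_WORDS     = 1, ZQ_CYCLES     = 2,  /* LDD / STD with Z+q        */
    Z_WORDS      = 1, Z_CYCLES      = 1   /* LD / ST with Z or Z+      */
};

/* n direct accesses. */
unsigned direct_words(unsigned n)  { return n * DIRECT_WORDS; }
unsigned direct_cycles(unsigned n) { return n * DIRECT_CYCLES; }

/* Load Z once with the lowest address, use plain Z for one access
   and Z+q for the remaining n-1. */
unsigned zq_words(unsigned n)
{
    assert(n >= 1);
    return LOAD_Z_WORDS + Z_WORDS + (n - 1) * ZQ_WORDS;
}

unsigned zq_cycles(unsigned n)
{
    assert(n >= 1);
    return LOAD_Z_CYCLES + Z_CYCLES + (n - 1) * ZQ_CYCLES;
}

/* Load Z once, then n adjacent accesses using Z+ (post-increment). */
unsigned zinc_words(unsigned n)  { return LOAD_Z_WORDS + n * Z_WORDS; }
unsigned zinc_cycles(unsigned n) { return LOAD_Z_CYCLES + n * Z_CYCLES; }

/* Print words/cycles for 1..max_n accesses in each mode. */
void print_costs(unsigned max_n)
{
    printf(" n  direct(w/c)  Z+q(w/c)  Z+(w/c)\n");
    for (unsigned n = 1; n <= max_n; n++) {
        printf("%2u  %5u/%-5u  %3u/%-4u  %3u/%-3u\n", n,
               direct_words(n), direct_cycles(n),
               zq_words(n), zq_cycles(n),
               zinc_words(n), zinc_cycles(n));
    }
}
```

Under these assumptions the model matches the reasoning above: Z+q breaks even on size at two accesses and wins from three onwards, but it always costs more cycles than direct access, while a run of adjacent Z+ accesses is the one case that can beat direct access on speed. Calling `print_costs(4)` from a `main` prints one row per access count.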


That's the theory, anyway, as far as I understand it - but I don't know how it could be implemented in practice...



