From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v3 9/9] tcg: Lower indirect registers in a separate pass
Date: Thu, 4 Aug 2016 00:57:20 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1

On 07/26/2016 12:53 AM, Aurelien Jarno wrote:
> Now on the less technical side, I really like the idea of being able to
> transform the TCG instruction stream more or less in place. Your recent
> patches in that direction are great. That said, I am a bit worried that
> we loop many times over the various ops. We used to have one forward
> pass (optimizer) and one backward pass (liveness analysis). Your patch
> adds up to two additional passes (one forward and one backward), which
> clearly has a cost. Given that indirect registers bring a lot of
> performance, I think it is worth it. Now I wonder if there is any way
> to do the lowering of registers earlier, I mean before the liveness
> analysis. That would probably generate plenty of useless ops, but they
> would later be removed by the liveness analysis. Maybe you have already
> tried that?

No, I did not try that, simply because we don't do liveness analysis of memory, and plain memory accesses are exactly what lowering indirect registers earlier would leave us with. Indeed, it would put us right back where we were before introducing them.

We need liveness analysis on the tcg globals in order to know where to add the reads and writes. I see no way around that.

The one place where the code could be improved to remove a pass is to have the indirect lowering pass update liveness at the same time. We need accurate liveness in order to satisfy the asserts in the final code generation pass, so we have to do something. I simply thought it was easier to re-run the original liveness pass rather than complicating the indirect lowering pass.
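To make that concrete, here is a toy sketch of the idea (plain C, not the actual tcg.c code; ToyOp, the op encoding and the printed "ld/st ... env" forms are all invented for illustration). A backward pass over the globals records what is live after each op, and the lowering pass then uses that to load every use from memory while storing a definition back only when something later still needs it:

/* Toy model only: two ops that both write global g0.  Without liveness
 * information every definition of an indirect global would need a store
 * back to memory; with it, the first store can be elided. */
#include <stdbool.h>
#include <stdio.h>

#define NGLOBALS 4
#define NOPS 2

typedef struct {
    int def;        /* index of the global written by this op, or -1 */
    int use[2];     /* indices of the globals read by this op, or -1 */
} ToyOp;

int main(void)
{
    /* op0: g0 = g1 + g2 ; op1: g0 = f(g3) -- op0's result is never read */
    ToyOp ops[NOPS] = {
        { 0, { 1, 2 } },
        { 0, { 3, -1 } },
    };

    /* Globals must be back in memory at the end of the TB. */
    bool live[NGLOBALS];
    for (int g = 0; g < NGLOBALS; g++) {
        live[g] = true;
    }

    /* Backward liveness over the globals: a def kills, a use revives. */
    bool live_after[NOPS][NGLOBALS];
    for (int i = NOPS - 1; i >= 0; i--) {
        for (int g = 0; g < NGLOBALS; g++) {
            live_after[i][g] = live[g];
        }
        if (ops[i].def >= 0) {
            live[ops[i].def] = false;
        }
        for (int j = 0; j < 2; j++) {
            if (ops[i].use[j] >= 0) {
                live[ops[i].use[j]] = true;
            }
        }
    }

    /* Forward lowering: load every use from memory, store a def back
     * only if a later op (or the end of the TB) still needs it. */
    for (int i = 0; i < NOPS; i++) {
        for (int j = 0; j < 2; j++) {
            if (ops[i].use[j] >= 0) {
                printf("  ld  g%d <- env\n", ops[i].use[j]);
            }
        }
        printf("  op%d\n", i);
        if (ops[i].def >= 0) {
            if (live_after[i][ops[i].def]) {
                printf("  st  g%d -> env\n", ops[i].def);
            } else {
                printf("  (st g%d elided: overwritten before being read)\n",
                       ops[i].def);
            }
        }
    }
    return 0;
}

Running it drops the store after op0 because liveness shows g0 is overwritten before being read, which is exactly the information we would not have if the lowering happened before the liveness pass.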

> I think it also depends on which direction we want to go with TCG:
> either plenty of small independent optimization passes, or a limited
> number of passes, which means more complex code. Unlike a regular
> compiler, we have to make a much more difficult trade-off between
> optimization time and the level of optimization.

Indeed. Fewer passes over large amounts of data is better, but I'm not sure we have "large" amounts of data for the average TB. On the other hand, smaller passes can reduce the code size of any one loop so that each fits in icache when one unified pass might not.


r~


