qemu-devel

Re: pixman_blt on aarch64


From: BALATON Zoltan
Subject: Re: pixman_blt on aarch64
Date: Tue, 7 Feb 2023 13:46:03 +0100 (CET)

Maybe we should include the pixman list in this. In case you're not subscribed, I'm forwarding it to that list now.

On Tue, 7 Feb 2023, Akihiko Odaki wrote:
On 2023/02/06 4:16, Richard Henderson wrote:
On 2/5/23 08:44, BALATON Zoltan wrote:
On Sun, 5 Feb 2023, Richard Henderson wrote:
On 2/4/23 06:57, BALATON Zoltan wrote:
This has just bounced. I had hoped to still be able to post after moderation, but I'm now resending it after subscribing to the pixman list. Meanwhile I've found this ticket as well: https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/71 See the rest of the message below. It looks like this is being worked on, but I'm not sure how far it is from being resolved. Any info on that?

Please try this:

https://gitlab.freedesktop.org/rth7680/pixman/-/tree/general

It provides a pure C version for ultimate fallback.
Unfortunately, there are no test cases for this, nor documentation.

It can share the implementation with fast_composite_src_memcpy(), which should be well tested by the tests for pixman_image_composite(). arm-neon does something similar, so we can trust that fast_composite_src_memcpy() works as a blt.
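
To make the idea concrete, here is a minimal sketch of what a memcpy-based blt could look like in plain C. It is not the code from the branch above or from pixman itself. The name sketch_blt and its parameter list are made up for illustration (loosely following pixman_blt), and it assumes strides counted in 32-bit words, matching source and destination depths, and non-overlapping buffers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fallback blt, for illustration only (not pixman's code).
 * Assumptions: strides are counted in 32-bit words, both images have the
 * same depth, the depth is a whole number of bytes, and the source and
 * destination rows do not overlap (memcpy requires that). */
bool sketch_blt(const uint32_t *src_bits, uint32_t *dst_bits,
                int src_stride, int dst_stride,
                int src_bpp, int dst_bpp,
                int src_x, int src_y, int dst_x, int dst_y,
                int width, int height)
{
    if (src_bpp != dst_bpp || src_bpp <= 0 || src_bpp % 8 != 0)
        return false;               /* caller has to fall back elsewhere */
    if (src_stride < 0 || dst_stride < 0 || width < 0 || height < 0 ||
        src_x < 0 || src_y < 0 || dst_x < 0 || dst_y < 0)
        return false;

    size_t bytes_per_pixel = (size_t)src_bpp / 8;
    size_t row_bytes = (size_t)width * bytes_per_pixel;
    size_t src_pitch = (size_t)src_stride * sizeof(uint32_t);
    size_t dst_pitch = (size_t)dst_stride * sizeof(uint32_t);

    /* Byte pointers to the first pixel of each rectangle. */
    const uint8_t *src = (const uint8_t *)src_bits
        + (size_t)src_y * src_pitch + (size_t)src_x * bytes_per_pixel;
    uint8_t *dst = (uint8_t *)dst_bits
        + (size_t)dst_y * dst_pitch + (size_t)dst_x * bytes_per_pixel;

    /* Copy one row at a time; the rows are contiguous, the image is not. */
    for (int row = 0; row < height; row++) {
        memcpy(dst, src, row_bytes);
        src += src_pitch;
        dst += dst_pitch;
    }
    return true;
}
```

Returning false for unsupported cases instead of guessing leaves room for a caller to try another implementation.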


Thanks, I don't have hardware to test this but maybe Akihiko or somebody else here can try. Do you think pixman_fill won't have the same problem? It seems to have at least a fast_path implementation but I'm not sure how pixman selects these.

For fill, I think the fast_path implementation should work, so long as it isn't disabled via an environment variable. I'm not sure why that is, and why _fast_path isn't part of _general.

The implementation of fill should be moved to pixman-general.c, but the rest of pixman-fast-path.c shouldn't be.

By isolating the non-essential fast-path code in pixman-fast-path.c, you can disable it with the environment variable when you are not confident in the implementation, which may help debugging. However, if pixman-fast-path.c contains essential code such as the implementation of fill, the environment variable becomes less useful, because setting it may break things.
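
The concern can be shown with a toy model of an implementation chain. This is only a sketch and not how pixman is actually structured: struct impl, flat_fill, chain_fill and choose_chain are hypothetical names, and the disable string stands in for whatever the real environment variable contains. In this model fill exists only at the "fast" level, so disabling that level leaves nothing to handle the request.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef bool (*fill_fn)(uint32_t *bits, size_t count, uint32_t filler);

/* One level of a hypothetical implementation chain. */
struct impl {
    const char        *name;
    fill_fn            fill;      /* NULL if this level has no fill */
    const struct impl *fallback;  /* next, more generic level, or NULL */
};

/* Plain C fill over a flat buffer of 32-bit words. */
bool flat_fill(uint32_t *bits, size_t count, uint32_t filler)
{
    for (size_t i = 0; i < count; i++)
        bits[i] = filler;
    return true;
}

/* Walk the chain until some level handles the request. */
bool chain_fill(const struct impl *imp, uint32_t *bits, size_t count,
                uint32_t filler)
{
    for (; imp != NULL; imp = imp->fallback) {
        if (imp->fill != NULL && imp->fill(bits, count, filler))
            return true;
    }
    return false;   /* nobody in the chain implements fill */
}

/* The layout being criticized: fill lives only in the "fast" level.
 * Putting flat_fill into general_impl instead would make it always
 * reachable. */
static const struct impl general_impl = { "general", NULL, NULL };
static const struct impl fast_impl = { "fast", flat_fill, &general_impl };

/* 'disable' models the value of a made-up environment variable. */
const struct impl *choose_chain(const char *disable)
{
    if (disable != NULL && strstr(disable, "fast") != NULL)
        return &general_impl;   /* skip the fast level entirely */
    return &fast_impl;
}
```

A caller could pass something like getenv("SKETCH_DISABLE") (a made-up variable name) to choose_chain(). With "fast" disabled, chain_fill() returns false in this model, which is the breakage described above.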


Indeed, the fast_path implementation of fill should be easily vectorized by the compiler. I would expect it to be competitive with an assembly implementation.  I would expect the implementation chain design to only be useful when multiple vector implementations are supported and selected at runtime -- e.g. the x86 SSE2 vs SSSE3 stuff.
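
For reference, this is the kind of loop meant here, written as a sketch rather than copied from pixman. sketch_fill32 is a hypothetical name, it only handles 32 bits per pixel, and it assumes a stride counted in 32-bit words. Whether a given compiler vectorizes the inner loop depends on the compiler and its optimization flags, so treat that as something to check in the generated code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32 bpp rectangle fill in plain C (not pixman's code).
 * Assumption: stride is counted in 32-bit words. */
bool sketch_fill32(uint32_t *bits, int stride,
                   int x, int y, int width, int height, uint32_t filler)
{
    if (stride < 0 || x < 0 || y < 0 || width < 0 || height < 0)
        return false;

    for (int row = 0; row < height; row++) {
        /* Start of the rectangle on this scanline. */
        uint32_t *line = bits + (size_t)(y + row) * (size_t)stride + (size_t)x;

        /* Simple store loop with a known trip count and no dependencies
         * between iterations, which is the shape auto-vectorizers look for. */
        for (int col = 0; col < width; col++)
            line[col] = filler;
    }
    return true;
}
```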


r~

