[Rapp-dev] RC_ALIGNMENT definition tweak heads-up


From: Hans-Peter Nilsson
Subject: [Rapp-dev] RC_ALIGNMENT definition tweak heads-up
Date: Sat, 3 Dec 2011 04:04:07 +0100

The following commit might warrant a heads-up: if you intend to
implement a back-end for a platform with the required alignment
lower than the vector size, RC_ALIGNMENT needs to be the larger
of them.  You'll notice that when testing...
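
To make the rule concrete, here is a minimal sketch for an imagined
platform whose vectors are 16 bytes wide but only need 8-byte-aligned
access. All EX_* and ex_* names are hypothetical and not part of RAPP;
the values are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical platform values, not taken from RAPP. */
#define EX_VECTOR_SIZE     16  /* bytes touched by one vector access */
#define EX_REQUIRED_ALIGN   8  /* alignment the hardware demands, in bytes */

/* The larger of the two, as the commit describes for RC_ALIGNMENT. */
#define EX_ALIGNMENT \
    (EX_VECTOR_SIZE > EX_REQUIRED_ALIGN ? EX_VECTOR_SIZE : EX_REQUIRED_ALIGN)

/* Round a byte count up to the next multiple of EX_ALIGNMENT. */
static size_t ex_align_up(size_t nbytes)
{
    /* The mask trick below only works for powers of two. */
    assert((EX_ALIGNMENT & (EX_ALIGNMENT - 1)) == 0);
    return (nbytes + EX_ALIGNMENT - 1) & ~(size_t)(EX_ALIGNMENT - 1);
}

/* Nonzero if ptr is aligned to EX_ALIGNMENT. */
static int ex_is_aligned(const void *ptr)
{
    return ((uintptr_t)ptr & (EX_ALIGNMENT - 1)) == 0;
}
```

With these values, EX_ALIGNMENT comes out as the vector size, so a
17-byte row would be padded to 32 bytes rather than 24.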

An alternative would be to have separate macros for the max access
size (the vector size when using a vector back-end) and the max
required alignment, but IMO the performance benefit from the
lowered memory footprint would be consumed by more complicated
code and the likelihood of problems from confusion creeping in.
So, as long as there's no widespread SIMD with that property
(and no measurements of improvements with a more complicated
back-end API), let's stick to the simpler solution.

The RC_ALIGNMENT macro is not documented elsewhere,
unfortunately.  One of these days I'm going to move it to the
configure.ac --enable-backend cases, so we'll have one less
place to modify for new back-ends, as well as fixing a FIXME in
configure.ac (after moving things around) for the unlikely
systems lacking both memalign and posix_memalign but having a
malloc and SIMD back-end matching (or smaller than)
RC_COMPUTED_NATIVE_SIZE.
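
For readers unfamiliar with the allocation functions mentioned above,
the following is a rough sketch of obtaining a buffer aligned to such a
value with posix_memalign. It is not RAPP's allocator: EX_BUF_ALIGNMENT
and ex_alloc are hypothetical names, and 16 is simply the value from the
SSE2/AltiVec case in the diff below.

```c
#define _POSIX_C_SOURCE 200112L  /* exposes posix_memalign in <stdlib.h> */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for RC_ALIGNMENT. */
#define EX_BUF_ALIGNMENT 16

/* Allocate nbytes aligned to EX_BUF_ALIGNMENT.
 * Returns NULL on failure; release the buffer with free(). */
static void *ex_alloc(size_t nbytes)
{
    void *ptr = NULL;

    /* posix_memalign wants a power-of-two multiple of sizeof(void *). */
    assert(EX_BUF_ALIGNMENT % sizeof(void *) == 0);

    if (posix_memalign(&ptr, EX_BUF_ALIGNMENT, nbytes) != 0)
        return NULL;
    return ptr;
}
```

A system lacking both posix_memalign and memalign would need another
route, which is the case the configure.ac FIXME refers to.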

commit 9cc07cba439ab414b5ec76d99e364ba215a06ed5
Author: Hans-Peter Nilsson <address@hidden>
Date:   Sat Dec 3 03:43:19 2011 +0100

    Tweak comment re RC_ALIGNMENT: max of vector size and required memory alignment

diff --git a/compute/include/rc_platform.h b/compute/include/rc_platform.h
index 9195a7e..16bf4e3 100644
--- a/compute/include/rc_platform.h
+++ b/compute/include/rc_platform.h
@@ -77,7 +77,9 @@
 #endif
 
 /**
- *  The buffer alignment value in bytes.
+ *  The maximum of the required buffer alignment value in bytes for
+ *  vector memory access and the vector size: the latter can be bigger
+ *  than the former.
  */
 #if defined __SSE2__ || defined __VEC__ || defined __ALTIVEC__
 #define RC_ALIGNMENT 16

brgds, H-P


