Re: [lmi] overview of C++ expression template libraries


From: Vadim Zeitlin
Subject: Re: [lmi] overview of C++ expression template libraries
Date: Sun, 21 Mar 2021 01:55:15 +0100

On Thu, 18 Feb 2021 01:27:25 +0000 Greg Chicares <gchicares@sbcglobal.net> 
wrote:

GC> On 2/17/21 11:36 AM, Vadim Zeitlin wrote:
GC> [...]
GC> > On Thu, 11 Sep 2008 20:16:08 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:
GC> > 
GC> > GC> On 2007-01-08 14:23Z, Vadim Zeitlin wrote:
GC> > GC> > 
GC> > GC> >  I don't see anything against PETE except that it doesn't seem clear
GC> > GC> > whether it's actively maintained.
GC> > GC> 
GC> > GC> There's no recent maintenance activity AFAICT.
GC> > 
GC> >  This is still the case today and, I think, it's safe to say there will
GC> > never be any. Yet, lmi still uses PETE and you've been extending its use
GC> > recently -- which is the motivation for this post, of course, as I wonder
GC> > if it could make sense to have another look at the ET libraries using
GC> > modern C++ in case we can find something "better" than PETE.
GC> 
GC> Maybe, though I doubt it.

 Seeing all the recent work you've been doing on PETE, I guess this is even
more doubtful now than it was a month ago, isn't it?

GC> My ambitions are tiny by comparison, but mostly orthogonal to theirs.
GC> I don't want tensors and all that. I just want to be able to write
GC> simple operations on scalars and arrays in a simple, natural, and
GC> (most of all) terse way, e.g.:
GC> 
GC>   scalar A, B;
GC>   vector U, V, W;
GC>   W = max(0.0, 1.0 + A * U / (B * V - 3.5));

 FWIW I've found several libraries that allow doing this (and not that much
more). The problem is that none of them works with std::vector: they all
define their own bespoke data structures, which makes them unusable for our
needs as is.
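
 Just to illustrate what working directly on std::vector could look like,
here is a minimal sketch. It is not how PETE or any of the libraries
mentioned here does it: every name in it (et_sketch, wrap, make, assign,
example and the node types) is made up for this example, and it does no
size checking beyond a single assert. Scalars are stored by value, vectors
by reference, and nothing is computed until assign() runs its single loop:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace et_sketch
{
using vec = std::vector<double>;

// Leaf nodes: a scalar behaves like a vector whose elements are all equal.
struct scalar_node {double x; double operator[](std::size_t) const {return x;}};
struct vector_node {vec const& v; double operator[](std::size_t i) const {return v[i];}};

// Interior node: applies Op lazily, one element at a time.
template<typename L, typename R, typename Op>
struct binary_node
{
    L l;
    R r;
    double operator[](std::size_t i) const {return Op::apply(l[i], r[i]);}
};

struct add_op {static double apply(double a, double b) {return a + b;}};
struct sub_op {static double apply(double a, double b) {return a - b;}};
struct mul_op {static double apply(double a, double b) {return a * b;}};
struct div_op {static double apply(double a, double b) {return a / b;}};
struct max_op {static double apply(double a, double b) {return std::max(a, b);}};

// Turn an operand (scalar, std::vector or subexpression) into a node.
inline scalar_node wrap(double x) {return {x};}
inline vector_node wrap(vec const& v) {return {v};}
template<typename L, typename R, typename Op>
binary_node<L, R, Op> wrap(binary_node<L, R, Op> const& n) {return n;}

template<typename Op, typename L, typename R>
auto make(L const& l, R const& r)
{
    return binary_node<decltype(wrap(l)), decltype(wrap(r)), Op>{wrap(l), wrap(r)};
}

// Enable the operators only if at least one operand is a vector or a node,
// so that plain "double + double" is left alone.
template<typename T> struct is_node : std::false_type {};
template<> struct is_node<vec> : std::true_type {};
template<typename L, typename R, typename Op>
struct is_node<binary_node<L, R, Op>> : std::true_type {};

template<typename L, typename R>
using enable_if_expr = std::enable_if_t<is_node<L>::value || is_node<R>::value>;

template<typename L, typename R, typename = enable_if_expr<L, R>>
auto operator+(L const& l, R const& r) {return make<add_op>(l, r);}
template<typename L, typename R, typename = enable_if_expr<L, R>>
auto operator-(L const& l, R const& r) {return make<sub_op>(l, r);}
template<typename L, typename R, typename = enable_if_expr<L, R>>
auto operator*(L const& l, R const& r) {return make<mul_op>(l, r);}
template<typename L, typename R, typename = enable_if_expr<L, R>>
auto operator/(L const& l, R const& r) {return make<div_op>(l, r);}
template<typename L, typename R, typename = enable_if_expr<L, R>>
auto max(L const& l, R const& r) {return make<max_op>(l, r);}

// The only loop: evaluates the whole expression tree element by element.
template<typename E>
void assign(vec& out, E const& e)
{
    for(std::size_t i = 0; i < out.size(); ++i) {out[i] = e[i];}
}

// The expression quoted above, written inside the namespace so that the
// operators are found even when both operands are double and std::vector.
inline vec example(double a, double b, vec const& u, vec const& v)
{
    assert(u.size() == v.size());
    vec w(u.size());
    assign(w, max(0.0, 1.0 + a * u / (b * v - 3.5)));
    return w;
}
} // namespace et_sketch
```

 Because operator=() can't be overloaded for std::vector from the outside,
the sketch has to spell the assignment as assign(w, ...), and code outside
the namespace would need a using-directive for the operators to be found.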

GC> I can glance at a terse expression (in my professional domain) and
GC> see what it does, and (probably) say whether it's right. But endless
GC> lines of std::transform would puzzle anyone's will, and lambdas with
GC> ranged-for aren't much more transparent.

 Unfortunately the new wave in C++(20) is "ranges" and I'm afraid I'm
almost as unimpressed by them as by std::transform(). Maybe it's an
acquired taste that will come with time, but so far they don't seem
appealing at all, even though you can implement just about everything with
them, including expression templates.

GC> I come to expression templates seeking expressiveness, though wanting
GC> to avoid any serious speed penalty--yet not primarily seeking speed.
GC> But better speed is nice to have, provided expressiveness is maximized.

 Ideally, an expression template library would completely decouple the
syntax from the implementation, so that the latter could be changed without
affecting the former. This is important because it would make it easy to
use AVX instructions, for example, which can result in massive performance
gains.
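
 As a very simplified illustration of this decoupling, consider the sketch
below, in which a lambda stands in for the expression tree and the
"backend" is just a policy class chosen at compile time. All names
(scalar_backend, blocked_backend, scaled_sum) are hypothetical, and the
second backend contains no real AVX code: it only unrolls the loop by four
to show where vectorized code could be plugged in without touching the
formula itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Backend 1: plain scalar loop.
struct scalar_backend
{
    template<typename F>
    static void run(std::vector<double>& out, F const& f)
    {
        for(std::size_t i = 0; i < out.size(); ++i)
            {
            out[i] = f(i);
            }
    }
};

// Backend 2: handles four elements per iteration, then the remainder.
// It uses no SIMD intrinsics; it only marks where such code would go.
struct blocked_backend
{
    template<typename F>
    static void run(std::vector<double>& out, F const& f)
    {
        std::size_t const n = out.size();
        std::size_t i = 0;
        for(; i + 4 <= n; i += 4)
            {
            out[i    ] = f(i    );
            out[i + 1] = f(i + 1);
            out[i + 2] = f(i + 2);
            out[i + 3] = f(i + 3);
            }
        for(; i < n; ++i)
            {
            out[i] = f(i);
            }
    }
};

// "User code": the formula a * u + v is written once, whatever the backend.
template<typename Backend>
std::vector<double> scaled_sum
    (double                     a
    ,std::vector<double> const& u
    ,std::vector<double> const& v
    )
{
    assert(u.size() == v.size());
    std::vector<double> w(u.size());
    Backend::run(w, [&](std::size_t i) {return a * u[i] + v[i];});
    return w;
}
```

 Both scaled_sum<scalar_backend>(...) and scaled_sum<blocked_backend>(...)
are expected to return the same values; only the way the loop is executed
differs.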

 From this point of view, the Boost.YAP[1] library looks promising, as it
seems to allow exactly this. I was going to look at it "soon" but just
haven't had time to do it yet. But, again, I don't know if it's still worth
doing if you've already committed to using PETE for the foreseeable future.

[1]: https://www.boost.org/doc/libs/master/doc/html/yap.html

GC> I'd be quickly convinced by a library that would let me replace this:
GC> 
GC>     std::reverse(chg_sa.begin(), chg_sa.end());
GC>     std::partial_sum(chg_sa.begin(), chg_sa.end(), chg_sa.begin());
GC>     std::reverse(chg_sa.begin(), chg_sa.end());
GC> 
GC> with the pellucid equivalent in Iverson notation:
GC> 
GC>     chg_sa←⌽+\⌽chg_sa
GC> 
GC> but I know that's a lot to ask.

 I did see a library called ra-ra[2], which claims to be inspired by APL
and, naturally, I've added it to the list of candidates to look at just
because I thought you would be interested. OTOH I will probably be rather
relieved never to find out what APL implemented in C++ looks like...

[2]: https://github.com/lloda/ra-ra
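
 As an aside, and without any library at all, the three-line snippet quoted
above can be shortened by running std::partial_sum over reverse iterators.
This is only a sketch: the helper name backward_partial_sum is made up, and
it takes and returns the vector by value instead of modifying chg_sa in
place.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: element i of the result is the sum of v[i], v[i+1],
// and so on up to the last element, i.e. the same values that the
// reverse / partial_sum / reverse sequence produces.
inline std::vector<double> backward_partial_sum(std::vector<double> v)
{
    double const last = v.empty() ? 0.0 : v.back();
    // Writing the result over the input range is allowed for partial_sum.
    std::partial_sum(v.rbegin(), v.rend(), v.rbegin());
    // The last element starts the backward accumulation, so it is unchanged.
    assert(v.empty() || v.back() == last);
    return v;
}
```

 For example, {1, 2, 3} becomes {6, 5, 3}. It's still nowhere near as
terse as the Iverson notation, of course.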


 To summarize, after looking at some more recent libraries I am even more
convinced that PETE is very far from being the ideal implementation of the
ET idea in modern C++, and it seems clear that things could be much improved
simply by using C++17 features such as if constexpr and fold expressions.
But so far I haven't been able to find a drop-in replacement for it, and I
don't think one exists, so migrating from PETE would clearly require some
work, and it looks like you'd prefer to work on improving lmi's version of
PETE itself instead.
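
 To give an idea of what these two C++17 features buy, here is a small
sketch, unrelated to PETE's actual code, in which all names (element,
conforms, sum_all) are hypothetical. A single function template handles
both scalar and vector operands thanks to if constexpr, and fold
expressions replace the recursive templates that would otherwise be needed
to handle an arbitrary number of operands:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Element access: a scalar is treated as a vector of identical elements.
template<typename T>
double element(T const& operand, std::size_t i)
{
    if constexpr(std::is_arithmetic_v<T>)
        {
        return operand;
        }
    else
        {
        return operand[i];
        }
}

// A scalar conforms to any length; a vector must have exactly n elements.
template<typename T>
bool conforms(T const& operand, std::size_t n)
{
    if constexpr(std::is_arithmetic_v<T>)
        {
        return true;
        }
    else
        {
        return operand.size() == n;
        }
}

// Elementwise sum of any number of scalars and vectors of length n.
template<typename... Ts>
std::vector<double> sum_all(std::size_t n, Ts const&... operands)
{
    // Fold over &&: every operand must conform.
    assert((... && conforms(operands, n)));
    std::vector<double> result(n);
    for(std::size_t i = 0; i < n; ++i)
        {
        // Fold over +, starting from 0.0 so that an empty pack is valid.
        result[i] = (0.0 + ... + element(operands, i));
        }
    return result;
}
```

 With this, sum_all(2, 1.0, u, v) adds 1.0 to the elementwise sum of two
vectors u and v of length 2.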

 This is a pity because it seems unlikely that lmi will ever be able to use
AVX instructions if we don't use an external library, and I also remain
certain that this is the real key to qualitative performance gains (of
course, I've been saying this for 15 years about SSE, then about SSE 2,
then about AVX, AVX 256 and by now we're at AVX 512 and I still keep saying
the same thing, so at the very least you will have to grant that I'm
consistent in my convictions).

 Please let me know if you're interested in pursuing this discussion
further or if we should make another 10+ year break and see how things
evolve by 2035...

 Thanks,
VZ
