Re: [Chicken-users] Was crunch really discontinued? Is there an alternative?
Mon, 12 Oct 2009 01:49:44 -0300
On Mon, Oct 12, 2009 at 01:04:40PM +0900, Alex Shinn wrote:
> Jeronimo Pellegrini <address@hidden> writes:
> > http://aleph0.info/scheme/
> > The times I listed before are for 100000 repetitions on small matrices
> > (3x4, 4x6), so as to also include function call overhead in the
> > benchmark.
> > I have uploaded two 100x100 random matrices also, and the results for
> > 20 repetitions on them (results.txt).
> > I understand that micro-benchmarks like this are usually not
> > meaningful, but in this case they make some sense, since it's the
> > kind of thing my programs will do most of the time.
> OK, there are lots of things going on here :)
> The first is that you're using a naive multiplication
> algorithm - there are faster algorithms, and algorithms that
> take the L1 cache into account for very large
> matrices, and BLAS does all of this in addition to being
> written in highly tuned Fortran.
Yes, I know -- I was just trying to compare the same numerical
algorithm on different Scheme implementations.
> If matrix operations are
> really what you want to do, then as Ivan says just use BLAS
Not exactly. I do use lots of floating-point operations, but
not necessarily linear algebra-style.
> If you were more curious about the speed of Scheme compilers
> for their own sake, and not about actually getting work
> done, then there are several reasons for the slowness. The
> first is that SRFI-25 is inherently slow - the design makes
> it difficult to implement efficiently, and so that's slowing
> down all of the Scheme implementations. It's easy enough to
> just implement your own matrices on top of vectors for a
> huge speed boost.
That was the problem.
The test with 100x100 floating-point matrices ran in:
with SRFI-25: 23.2s
without it: 1.3s
(Although this is suspicious -- it's >2x faster than Bigloo and
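
For illustration, here is a minimal sketch of what "matrices on top of vectors" could look like. It is not the code used in the benchmark above: the row-major layout and all the names (make-matrix, matrix-ref, matrix-set!, matrix-multiply) are assumptions chosen for the example. It uses the same naive triple-loop multiplication discussed earlier in the thread.

```scheme
;; Hypothetical helpers, not taken from the original benchmark.
;; A matrix is a 3-element vector: rows, cols, and a flat
;; row-major data vector of length rows*cols.
(define (make-matrix rows cols init)
  (vector rows cols (make-vector (* rows cols) init)))

(define (matrix-rows m) (vector-ref m 0))
(define (matrix-cols m) (vector-ref m 1))
(define (matrix-data m) (vector-ref m 2))

;; Element (i, j) lives at index i*cols + j of the data vector.
(define (matrix-ref m i j)
  (vector-ref (matrix-data m) (+ (* i (matrix-cols m)) j)))

(define (matrix-set! m i j x)
  (vector-set! (matrix-data m) (+ (* i (matrix-cols m)) j) x))

;; Naive O(n^3) multiplication; assumes (matrix-cols a) equals
;; (matrix-rows b) and does no error checking.
(define (matrix-multiply a b)
  (let* ((n (matrix-rows a))
         (k (matrix-cols a))
         (p (matrix-cols b))
         (c (make-matrix n p 0)))
    (do ((i 0 (+ i 1))) ((= i n) c)
      (do ((j 0 (+ j 1))) ((= j p))
        ;; Accumulate the dot product of row i of a and column j of b.
        (do ((l 0 (+ l 1))
             (sum 0 (+ sum (* (matrix-ref a i l) (matrix-ref b l j)))))
            ((= l k) (matrix-set! c i j sum)))))))
```

The point of the flat layout is that each element access is a single vector-ref plus a little index arithmetic, with none of the generality of SRFI-25's array shapes.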
> The next problem is that presumably you want to test
> floating point, although the test case you use involves only
Well, the small examples use fixnums, but the 100x100 example
> Floating point involves heap allocation for every
> operation in, I think, all Scheme implementations except
> Stalin. Stalin can unbox floating point numbers if it can
> prove all of the types involved are inexact (it wouldn't
> work on this example because of the general READ, you'd have
> to tweak it so that Stalin's type inference would kick in).
> This makes Scheme in general unsuited to floating point
> intensive computations.
> Given that you're only testing fixnums here, the -fixnum
> optimization gives a big boost.
Yes, I have tested it, and then Chicken runs just like Bigloo
and Gambit -- very fast! But I actually will need floating point --
and the problem is solved now (I just won't use SRFI 25).
I don't really need the same speed as C or Fortran -- it just
shouldn't be more than 15 times slower. :-)
Thanks a lot!
> The attached variation of the code, with the -fixnum flag,
> will get this specific example within the ballpark of BLAS
> (0.1 seconds on my machine), but only for the small example
> you're using. As the matrices get larger BLAS will become
> increasingly faster, and Scheme will be painfully slow with
OK -- I understand that I need to test larger examples also.
I'll do that.
Thanks Alex (and thanks Ivan also)!