Re: [Axiom-developer] exact quotient


From: Waldek Hebisch
Subject: Re: [Axiom-developer] exact quotient
Date: Thu, 21 Jun 2007 01:26:57 +0200 (CEST)

Martin Rubey wrote:
> Waldek Hebisch <address@hidden> writes:
> 
> > I am interested in Axiom performance.  But do you have reasons to suspect
> > exquo?  I mean, did you look at a profile of some problematic run?  While I
> > agree that performing a division and checking that the remainder is zero is
> > not very smart, it should not matter much for polynomials (at least for
> > polynomials in a single variable; for multivariate polynomials division is
> > not defined, so you must do something smarter anyway).
> 
> Well, I'm not so sure - I just don't know.  To fix things: I am really only
> (for the moment) interested in the case where we are dealing with multivariate
> polynomials.
> 
> As far as I can see, the poor performance is due to poor multiplication and
> exquo, but I cannot be sure.  I find it a bit hard to measure.
> 
> [one day later...]
> 
> OK, the situation is worse than I would have thought.  I did the following
> rather simple test, which is, however, very realistic for my application:
> (Note that in my application, for certain reasons, I also have to use the
> POLY constructor.)
> 
> -------------------------------------------------------------------------------
> -- define a bivariate polynomial of "size" (= number of coefficients) n+1
> mon(n,k) == (random k * x + random k * y)^n
> 
> l(kappa, m) == [mon(kappa, 10000) for i in 1..m]
> 
> -- test performance of multiplication
> test(kappa, m, n) ==
>     ls1 := l(kappa, m)
>     ls2 := l(kappa, m)
>     res: List List Integer := []
>     for trial in 1..n repeat
>         GBC(t)$Lisp
>         tim1 := integer GET_-UNIVERSAL_-TIME()$Lisp
>         for i in ls1 for j in ls2 repeat i*j;
>         tim2 := integer GET_-UNIVERSAL_-TIME()$Lisp
>         res := cons([kappa, tim2-tim1], res)
>     output res
>     res
> 
> reduce(append, [test(kappa, 500, 5) for kappa in 10..100 by 10])
> -------------------------------------------------------------------------------
> 
> On my machine, I get the following output
> 
>    [[10,1],[10,0],[10,1],[10,1],[10,1]]
>    [[20,2],[20,2],[20,3],[20,2],[20,2]]
>    [[30,6],[30,6],[30,5],[30,6],[30,6]]
>    [[40,11],[40,12],[40,12],[40,12],[40,12]]
>    [[50,19],[50,20],[50,19],[50,20],[50,20]]
>    [[60,32],[60,32],[60,35],[60,35],[60,35]]
>    [[70,52],[70,52],[70,52],[70,55],[70,51]]
>    [[80,77],[80,77],[80,76],[80,76],[80,76]]
>    [[90,102],[90,102],[90,108],[90,103],[90,103]]
> 
> This really looks like complexity between O(k^2.5) and O(k^3.0) to me,
> where k is the number of monomials of the input.  I expected something
> like O(k^2), though:
> 
> (a1*x1 + a2*x2+...+ak*xk)*(b1*x1 + b2*x2+...+bk*xk)
> = (a1*x1 + a2*x2+...+ak*xk)*B
> = a1*B*x1 + a2*B*x2...+ak*B*xk
> 
> which makes k^2 coefficient multiplications.  The cost of multiplying
> variables should really be negligible, I believe.
> 

Well, the naive complexity is k^4: you have k^2 multiplications of 4k-digit
numbers, and the naive multiplication algorithm needs k^2 operations
to multiply two k-digit numbers...
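The k^2-operation cost of schoolbook bignum multiplication can be made
concrete with a short sketch (Python here for illustration; to_limbs and
mul_count are hypothetical names, not gmp's actual API):

```python
BASE = 2 ** 64  # one machine word on amd64

def to_limbs(n):
    """Split a nonnegative integer into base-2^64 'digits' (limbs)."""
    limbs = []
    while n:
        limbs.append(n % BASE)
        n //= BASE
    return limbs or [0]

def mul_count(a, b):
    """Schoolbook limb-by-limb product; returns (product, word mults used)."""
    la, lb = to_limbs(a), to_limbs(b)
    acc = [0] * (len(la) + len(lb))
    count = 0
    for i, x in enumerate(la):
        for j, y in enumerate(lb):
            acc[i + j] += x * y
            count += 1
    result = 0
    for limb in reversed(acc):  # carries are absorbed by Python's bignums
        result = result * BASE + limb
    return result, count

n = 12345 ** 100                      # a number of about 410 digits
prod, ops = mul_count(n, n)
assert prod == n * n
assert ops == len(to_limbs(n)) ** 2   # k limbs cost k^2 word multiplications
```

So two numbers of k machine words cost k^2 word multiplications, which is
where the extra factor over the k^2 coefficient multiplications comes from.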

A little estimate: for k=100 we have about 400 digits.  400 digits
require something like 21 machine words on amd64.  So we get about
100^2*21^2*500 = 2205000000 ~= 2*10^9 word multiplications.  The calculation
takes 37 seconds on my amd64, while in theory it could be a few times faster.
However, there are various overheads, so the result is reasonable.
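The estimate can be checked in a few lines (a sketch; the fixed coefficient
5000 merely stands in for the random-up-to-10000 values in the quoted script):

```python
import math

k = 100      # kappa: degree (and roughly coefficient count) of each factor
m = 500      # products computed per trial
a = b = 5000 # representative coefficients, in place of "random 10000"

# The middle coefficient of (a*x + b*y)^k is C(k, k/2) * a^(k/2) * b^(k/2).
c = math.comb(k, k // 2) * a ** (k // 2) * b ** (k // 2)
digits = len(str(c))               # "about 400 digits"
words = c.bit_length() // 64 + 1   # "something like 21 machine words"
ops = k ** 2 * words ** 2 * m      # naive word multiplications per trial
print(digits, words, ops)          # -> 399 21 2205000000
```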

Note that both Axiom and gmp contain faster algorithms (Axiom
uses the Karatsuba method and gmp is smarter); however, for numbers
of about 21 machine words the smart methods are likely to work just
as fast as the naive one.

There is another factor: your polynomials are homogeneous, so they
essentially behave like one-variable ones; if your real data
is inhomogeneous you have much more work to do.  For one-variable
polynomials of degree 100 Karatsuba gives a substantial
win and the FFT method may be applicable.  If you have two-variable
polynomials of degree 100 with 100 nonzero terms, smart methods
are unlikely to win.
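The point that homogeneous bivariate polynomials behave like one-variable
ones can be illustrated with a sketch (Python, not Axiom; a degree-n
homogeneous p(x, y) = sum c_i x^i y^(n-i) is determined by its coefficient
list, and multiplying two of them is exactly a univariate convolution):

```python
def convolve(p, q):
    """Schoolbook product of coefficient lists: len(p)*len(q) coefficient
    multiplications, matching the k^2 count discussed above."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

# x + 2y as a coefficient list indexed by degree in x: [2, 1]
p = [2, 1]
# (x + 2y)^2 = 4y^2 + 4xy + x^2
print(convolve(p, p))   # -> [4, 4, 1]
```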

If your intermediate polynomials have large coefficients, but
the final answer has moderate size, it may be worth looking at modular
methods.  In Axiom modular arithmetic is unreasonably expensive,
but it may still give some win.  For example, replacing the first three
lines in your script by:

MP := Polynomial PrimeField(87654337)
-- define a bivariate polynomial of "size" (= number of coefficients) n+1
mon(n,k) == ((random k * x + random k * y)::MP)^n

for k = 100 I need 26 seconds (compared to 37 in the integer case).
This is way too much, because the modular case needs about 100 times fewer
arithmetic operations at machine level (note that the modulus is
chosen in such a way that the intermediate numbers stay in the
range of machine integers), but it still shows the possibility of a gain.
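Why a modulus of this size keeps intermediate numbers in machine range can be
sketched as follows (Python for illustration; mulmod_poly is a hypothetical
helper, not Axiom's implementation):

```python
p = 87654337
assert p < 2 ** 27   # residues fit in 27 bits, so a product of two
                     # residues fits in 54 bits, leaving room to accumulate
                     # several products inside a signed 64-bit word

def mulmod_poly(u, v, p):
    """Schoolbook product of coefficient lists over Z/pZ: every coefficient
    multiplication is single-word, unlike the ~21-word integers above."""
    r = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            r[i + j] = (r[i + j] + a * b) % p
    return r

# small example: (x + 2)(x + 3) = x^2 + 5x + 6 over Z/pZ
print(mulmod_poly([2, 1], [3, 1], p))   # -> [6, 5, 1]
```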

BTW: sbcl needs 12 seconds for k = 100 in the modular case and 25
seconds in the integer case, so the gain is bigger.

-- 
                              Waldek Hebisch
address@hidden 



