axiom-developer
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Axiom-developer] hyperdoc on the web


From: Tim Daly
Subject: Re: [Axiom-developer] hyperdoc on the web
Date: Tue, 08 Dec 2009 16:51:57 -0500
User-agent: Thunderbird 2.0.0.21 (Windows/20090302)

Well, since my name came up in discussion I feel I have to respond
with a couple points.

1) My original "translation" from spad files to "literate" files involved
adding a latex header and a footer to every file. Fricas removed this from
almost every other file except the algebra files. Waldek's complaint is that
this makes algebra files hard to use. (Note that Axiom no longer has
any spad files or separate algebra pamphlet files. What you have was
a trivial first step in the process and worthless as it stands.)

I might suggest that since you don't plan to really do literate programming
it would be perfectly sane to remove the header and footer from the
algebra files. This has 2 benefits. First, you can use the spad files directly
Second, you no longer need noweb which simplifies your build process
considerably (removing steps and removing the need to fetch noweb)
In fact, you hardly use noweb because it appears you build from the lisp
translations most of the time.

You can't really do literate programming "half way". What you have
now is just a "dust jacket" over an empty novel. What is the point?
Why be irrational?

2) The debate seems to depend on whether Fricas will attract more
users if more functionality is added or if more documentation is added.
I think you will only judge this in the long term by whether Fricas or
Axiom ends up with more users. In fact, though, I don't consider
this an interesting metric; but we disagree on that also.

Given that I'm a complete Knuth fanboi about literate programming,
I have SO much more to say but since Fricas forked the Axiom project
over the literate issue, this is the wrong audience. There is nothing I
can say that would convince you otherwise.

Tim


=================================================

Martuin Rubey asked about stmult routine which I recently
commited to 'sttaylor.spad'.  I think the best explanation
is to give short reconstruction how the routine was
written.

1) Let us look first at general pattern of stream fortines.  Good
example is the simpified version of 'map':

   map(f,x)== delay
     empty? x => empty()
     concat(f frst x, map(f,rst x))

Note that the result is produced by the 'delay' function.  'delay(f)'
takes function 'f' as argument and produces from is a stream, such
that evaluation of the strem causes f to be called.  The name 'delay'
is releated to the fact that 'delay' effectively delayes evaluation
of its argument to the moment when first value from the stream
is requested.  When strean is evaluated: evaluation triggered
by call to 'empty?'.  In other words, before trying to access
first element of the stream we need to call 'empty?' to cause
evaluaton, after the call we know if the element is there (and
if it is present we can use it).  So above the first line calls
'empty?' which either give us true, in which case we simply return
empty stream or we may use the first element of input stream to
produce first element of the result result.  Now, once we have
first element of the result in hand we can produce the result by
concatenating first element of the result with the rest of the
result.  In case of 'map' the result is obtained by recursive
application of 'map' to the rest of input stream.

Important remark: stream are produced and should be consumed
in lazy way.  'delay' makes sure that no computation takes
place before first element of the result is requested.  However
for single request map should produce only one element and to
produce it should consume only one elemet form input stream.
Using 'concat' makes sure that caller will get the computed
element and that request for further elements will go to map
applied to the rest of stream.

Let me also say that the code above looks like unbounded
recursion, but in fact there is _no_ recursion at runtime:
what looks like recursive call to 'map' is done _after_
call to outer map has finished.

2) Having pattern for general operation, how we apply it in a
concrete situation.  For example, we would like to multiply
each element 'n' of the stream by 'n+1'.  To do this we need
to know what 'n' is, so we need to pass 'n' between invocations
of function.  So we need an auxilary function and a wrapper:

  stmuln_aux(n0, x) == delay
      empty? x => empty()
      n := n0 + 1
      firstval := n*frst x
      concat(firstval, stmuln_aux(n, rst x))

  stmuln(x) == stmuln_aux(0, x)

Note: due to stupidity of Spad compiler I can not simply increment
n0, but must introduce extra variable n.

3) Back to multiplication.  For multiplication formula is:

  (f*g)(n) = \sum_{i=0}^{n} f(i)*g(n-i)

This is more complicated then previous case because I need terms
f(0),...,f(n) and g(0),...,g(n) to produce the result.
Naive implementation (assuming infinte streams) may look like:

  stmult(n, x, y) == delay
      valn := sum(coefficient(x, i)*coefficient(y, n - i), i=0..n)
      concat(valn, stmult(n + 1, x, y))

  mult(x, y) == stmult(0, x, y)

However, coefficient for streams has step trough stream from
the start up to requested element, so it is slow if n gets large.
It is easy to see that we can step trough x as we sum saving
half ot the operations:

  stmult(n, x, y) == delay
      valn : A := 0
      xp := x
      for i in 0..n repeat
          empty? xp => break
          valn := frst xp * coefficient(y, n - i)
          xp := rst xp
      concat(valn, stmult(n + 1, x, y))

We still use coefficient on y.  Since we need to go trough y
backwards, it is a bit harder to get rid of.  Easy way it to
have a list which stores the first n elements of y in backward
order:

  stmult(n, x, y0, ll0) == delay
      yval : A
      y := y0
      if empty? y then
          yval := 0
      else
          yval := frst y
          y := rst y
      ll := cons(yval, ll0)
      xp := x
      llp := ll
      valn : A := 0
      for i in 0..n repeat
          empty? xp => break
          valn := valn + frst xp * first(llp)
          llp := rest(llp)
          xp := rst xp
      concat(valn, stmult(n + 1, x, y, ll))

This may be (I did not test it) correct routine to multiply infinite
power series, but it does not conform to expectation of rest of FriCAS.
Namely, if we are given to finite series it will produce infinite
series -- after finite number of nonzero terms we will get infinite
stream of zeros.  There is also unneeded eagerness: if first element
if y is zero we do not need element n of x to produce first element
of the result.  Also, if y is finite there is no need to append
zeros.  First, let us handle leading zeros of 'y'.  I first
element of y is zero we get zero as fist element of the result and
rest of resulst is x times rest of y:

    ...
    yval := frst y
    n = 0 and yval = 0 => concat(0, stmult(n, x, rst y, ll))

Note: we need to test if n is 0 to know that we really have first
elememt and not some zero in the middle.

Next, if we get to the end of y instead of extending ll by zero
we may drop first element of x.  We need to be a little
careful if x is also empty ( possibly after dropping several
elements).  And we need to properly adjust n.  Adjusting n is
easier if we start from n = -1 instead of n = 0.  Also, if at start
y is empty then the result is 0 (empty):

      if empty? y then
           n < 0 => return empty()
           empty? x => return empty()
           x := rst x

If we in the loop we discover that x is empty we can also
return 0.  FinallCollection the changes above we get:

  stmult(n0, x0, y0, ll0) == delay
      x := x0
      y := y0
      ll := ll0
      n = n0
      if empty? y then
          n < 0 => return empty()
          empty? x => return empty()
          x := rst x
      else
          yval := frst y
          n < 0 and yval = 0 => concat(0, stmult(n, x, rst y, ll))
          y := rst y
          ll := cons(yval, ll)
          n := n + 1
      xp := x
      llp := ll
      valn : A := 0
      for i in 0..n repeat
          empty? xp =>
              i = 0 => return empty()
              break
          valn := valn + frst xp * first(llp)
          llp := rest(llp)
          xp := rst xp
      concat(valn, stmult(n + 1, x, y, ll))

This is almost the routine I want to use.  However, there is one
extra problem: if series has unevaluated tail printing routines
print O(..) term even if after further evaluation it turns out
that in fact the tail is empty.  To avoid such artifacts printing
routines check if tail is explicitly empty.  Note that testing
if tail is empty causes evaluation and printing routines are not
allow to evaluate extra terms.  But 'explicitlyEmpty?' does not
cause evaluation, even if 'explicitlyEmpty?' return false
'empty?' may return 'true'.  So, to make printing routines
happy we need to make sure that the tail is explicitely empty.
This is done adding after loop test:

        ....
     explicitlyEmpty? rst x and
       explicitlyEmpty? y => concat(res, empty())

Note that we if x is empty we call return inside loop, so after
loop x is always nonempy and calling rst is safe.  Note that
if x has only one element and y is empty then in the next iteration
we will throw out the only element of x and get empty result.
So the test allows us to produce earlier the empty result.
OTOH if that test fails in principle we can get more elements,
so must use general form.

Martin also asked why the new version uses less space than the
old one and why the old one used so much space.

First, note that stream data is stored in nested anonymous
functions.  The new version creates single function for
each term of the series, so when we want n terms n such
functions are created.  Each function takes fixed amount
of space for variables (the space is actually taken by "frozen"
values of variables from outside which the function uses).
The variables may reference large data structure (for
example x and y may be big), but fortunatly such large
structures may be shared.  In particular parts of x and y are
shared so that at each step we add (compute) only one extra element
of x and of y.  The list ll has in general n + 1 element, but
we share n with previous iteration.  So at each step we
allocate fixed aount of memory.  More precisely, we allocate
whatever memory is needed to store element n of x and element
n of y plus fixed storage (of order 10-15 machine words).
In other words, memory usage is linear in n.

The old version was:

  (x:ST A) * (y:ST A) == delay
    empty? y => zro()
    empty? x => zro()
    concat(frst x * frst y,frst x * rst y + rst x * y)

It is easy to see that this version allocates stream node for
each term in the product, that is of order n^2 nodes.  Theoretically
most of those nodes may be garbage collected.  I did not check
if the FriCAS code for some reason (for example due to keeping
unnecessery extra reference to the data) did not allow collection,
or if this was problem with SBCL garbage collector, but the fact
was that the space usage grow quadratically...

The new version allocates much less memory, so there is no
chance for garbage collector (or FriCAS programmers) to scrow up.

--
                             Waldek Hebisch
address@hidden

From: Ralf Hemmecke <address@hidden>
Date: Sat, 05 Dec 2009 12:23:10 +0100
Local: Sat, Dec 5 2009 6:23 am
Subject: Re: [fricas-devel] Multiplication of power series

Waldek,

Since we have literate programming, I think your explanation would be
best put just next to your implementation. Can you commit that?

In the mailing list archive it will be just forgotten or relatively hard
to find if the code does not (at least) have a link to your message.
Anyway, better put it directly into the code.

Since I've also worked on powerseries in aldor-combinat... does your
code allow to define powerseries by recursion? I think that should be
somehow possible by using "delay".

Further, I still would very much like a native implementation of
"Generator" in SPAD. Do you have any plans for that?

Ralf

On 12/05/2009 04:17 AM, Waldek Hebisch wrote:


> Martuin Rubey asked about stmult routine which I recently
> commited to 'sttaylor.spad'.  I think the best explanation
> is to give short reconstruction how the routine was
> written.


From: Waldek Hebisch <address@hidden>
Date: Sat, 5 Dec 2009 16:54:52 +0100 (CET)
Local: Sat, Dec 5 2009 10:54 am
Subject: Re: [fricas-devel] Multiplication of power series


Ralf Hemmecke wrote:
> Waldek,

> Since we have literate programming, I think your explanation would be
> best put just next to your implementation. Can you commit that?

No!!!  I _really_ do not want to see such texts inside source
files.  Such for given person such text may be useful once,
maybe few times, but having to read it or even only look at
it every time when looking at code is simply too distracting.

In other words, persons looking at/modifying algebra code are
supposed to have appropriate background knowledge.  Consider
research article on some fancy rings.  If you put there
elementary introduction to rings every referee will tell
you to delete such material.  Background specific to topic
at hand is welcome, but is normally put in separate
section.  The main body of article simply uses notation/facts
from background section without repeating them.  This is
due to separation of concerns principle: systems with many
interactiong parts are hard to understand.  Splitting them
into pieces and using hierachical description makes things
easier.  Also, people are pretty good at analogy, so explaing
one thing allows them to understand similar thins.  Again,
separating material into background and specific parts
means that background is easier to reuse.

For background we should have documentation (separate
from source code!).  Unfortunatly, past "literate"
effort removed places where normally documentation
should live.  But is relatively easy to cure.

There are practical aspects releated to tools that we
use.  Let me explain my workflow when developing algebra
code:

1) copy XXX.spad file from build directory to work directory
2) start up FriCAS in work directory
3) in separate window start an editor on XXX.spad file (I use vi,
  but any reasonable editor would do).
4) modify the XXX.spad file and save it
5) Do

)compile XXX.spad

(typically the command is in command history so this is fast)
6) Test (frequently test are in history too, so this also is
  fast)
7) if not done goto 4
8) make diff of original XXX.spad and modified XXX.spad
9) Edit the diff replacing XXX.spad by xxx.spad.pamphlet
10) Applay the diff to algebra
11) Do full bootrap to final test.

Typically there are several interations of inner edit-compile-test
loop.  On small files mechanical parts take fraction of second, on
larger few seconds, so it does not intrude too much into natural
flow of developement.  Note that step 10 depends on code inside
pamphlet beeing just wrapped version of code in XXX.spad, any noweb
tricks means to I need to do merging manually -- this looses time
and may introduce extra errors.

In principle good "literate" tools could work as well or better.
But external tools are a dependency while tools specific to
the project take effort ot develop.  So it is better to use
popular tools unless there is some really big gain from nonstandard
tool.  Looking at many project I see to use of exotic tools
frequently limits adoption of otherwise quite good ones.

> In the mailing list archive it will be just forgotten or relatively hard
> to find if the code does not (at least) have a link to your message.
> Anyway, better put it directly into the code.

My fist feeling was that this text was too rough to elevate it
to status of documentation.  But if you think that it is good
enough we can create documentation subdirectory and put it there.

> Since I've also worked on powerseries in aldor-combinat... does your
> code allow to define powerseries by recursion? I think that should be
> somehow possible by using "delay".

Well "my code" is routine for mutiplying power series.  The framework
for computing power series is due to original Axiom developer.  AFAICS
you can use quite general recursive construct when defining power
series.  Note: while conceptualy there is recursion at runtime
typically there are no recursive calls, so no need to worry about
overflowing stack due to deep recursion (but as the power series
example illustrates whithout some care one may exhaust available
memory (heap)).

> Further, I still would very much like a native implementation of
> "Generator" in SPAD. Do you have any plans for that?

They are in queue of "nice things that we would do if we had
enough resources", but IMHO other things are more important,
so no specific plans ATM.

--
                             Waldek Hebisch
address@hidden


From: Martin Rubey <address@hidden>
Date: Sat, 05 Dec 2009 21:44:22 +0100
Local: Sat, Dec 5 2009 3:44 pm
Subject: Re: [fricas-devel] Multiplication of power series


Waldek Hebisch <address@hidden> writes:
> Ralf Hemmecke wrote:
>> Waldek,

>> Since we have literate programming, I think your explanation would be
>> best put just next to your implementation. Can you commit that?

> No!!!  I _really_ do not want to see such texts inside source
> files.  Such for given person such text may be useful once,
> maybe few times, but having to read it or even only look at
> it every time when looking at code is simply too distracting.

How about putting it into a separate subsection but at least in the same
file?  I'm afraid that there will never be concensus, but maybe we cann
have a compromise?

When working on GUESS, I updated the doc (for myself!!!) at least from
time to time.  I admit it is far from perfect, outdated at places and
often inaccurate.  But it happened already a few times that reading my
old doc helped myself remember what I was trying to do.

> In other words, persons looking at/modifying algebra code are
> supposed to have appropriate background knowledge.  Consider
> research article on some fancy rings.  If you put there
> elementary introduction to rings every referee will tell
> you to delete such material.
So far this didn't happen to me - actually rather the opposite -
although I consider myself as doing "elementary" stuff.  I am frequently
surprised what other mathematicians find "elementary", by the way.

> Let me explain my workflow when developing algebra code:

> 1) copy XXX.spad file from build directory to work directory
> 2) start up FriCAS in work directory
> 3) in separate window start an editor on XXX.spad file (I use vi,
>    but any reasonable editor would do).
> 4) modify the XXX.spad file and save it

I think you might save time in 9, 10 below by editing the pamhlet
directly, and binding some keystroke to notangle or document.

I admit however, that it is probably better to have only one chunk per
function.  Having it in a completely separate section looks "far away"
from me, I probably won't ever update documentation in a different file.

> 5) Do

> )compile XXX.spad

> (typically the command is in command history so this is fast)
> 6) Test (frequently test are in history too, so this also is
>    fast)
> 7) if not done goto 4
> 8) make diff of original XXX.spad and modified XXX.spad
> 9) Edit the diff replacing XXX.spad by xxx.spad.pamphlet
> 10) Applay the diff to algebra
> 11) Do full bootrap to final test.

The true question probably is: how to allow different workflows in the
same project...

Martin
From: address@hidden
Date: Sat, 5 Dec 2009 22:02:01 +0100
Local: Sat, Dec 5 2009 4:02 pm
Subject: Re: [fricas-devel] Multiplication of power series

Dear Waldek

thank you for the explanations on streams.
Incidentally I am working with streams too these days
(moments, Jacobi parameters and continued fractions)
and found this extremely helpful.
Also the recent comments on the spad compiler.

On Sat, Dec 05, 2009 at 04:54:52PM +0100, Waldek Hebisch wrote:
> No!!!  I _really_ do not want to see such texts inside source
> files.

well after notangle they disappear ... this way everyone is more or less
happy.

> Such for given person such text may be useful once,

Given many persons it will be useful many times :)

> In other words, persons looking at/modifying algebra code are
> supposed to have appropriate background knowledge.

At the moment the source code is an essential part of the documentation.
Most prospective library programmers and readers will be mathematicians
with low programming background (I have almost none),
concentrating on mathematics.
Given that the compiler and the library is mostly undocumented,
where should this background knowledge come from?

> Consider
> research article on some fancy rings.  If you put there
> elementary introduction to rings every referee will tell
> you to delete such material.

Well, research papers without proofs are also rejected.
Having to reconstruct every proof is even more tedious than
reading too many trivialities. Similarly here some hints
on why and what the code is supposed to do can be helpful.

> easier.  Also, people are pretty good at analogy, so explaing
> one thing allows them to understand similar thins.

Indeed. The information should be there and cross referenced
when needed.

> For background we should have documentation (separate
> from source code!).  Unfortunatly, past "literate"
> effort removed places where normally documentation
> should live.  But is relatively easy to cure.

So where should I put documentation for my domains?
I mostly write it for myself in order to understand my own code later
and where it came from.
Being a forgetful person I very much like having everything in one file,
but understand that this is personal taste.
Another advantage is that I can have multiple alternative versions of
a chunk in _one_ file and being able to quickly switch by just
changing the name of subchunks.

> )compile XXX.spad

I write a Makefile extracting the *.spad and *.input code from  *.nw and do
)sys make
)compile XXX.spad
)read test
this is almost as fast. And only one source file for everything.
And if this is too much typing,
put the three lines into yet another .input file.

> 7) if not done goto 4
> 8) make diff of original XXX.spad and modified XXX.spad
> 9) Edit the diff replacing XXX.spad by xxx.spad.pamphlet

not needed.

> My fist feeling was that this text was too rough to elevate it
> to status of documentation.  But if you think that it is good
> enough we can create documentation subdirectory and put it there.

Yes, please this is exactly the kind of documentation newcomers like to see.
In any case documentation not written immediately will never be written.
Given that everyone's time on this planet is finite,
things which are not written down will be gone sooner or later.

just my 2c
Franz


From: Ralf Hemmecke <address@hidden>
Date: Sat, 05 Dec 2009 23:30:10 +0100
Local: Sat, Dec 5 2009 5:30 pm
Subject: Re: [fricas-devel] Multiplication of power series

On 12/05/2009 04:54 PM, Waldek Hebisch wrote:

> Ralf Hemmecke wrote:
>> Waldek,

>> Since we have literate programming, I think your explanation would be
>> best put just next to your implementation. Can you commit that?

> No!!!  I _really_ do not want to see such texts inside source
> files.  Such for given person such text may be useful once,
> maybe few times, but having to read it or even only look at
> it every time when looking at code is simply too distracting.

Hmmm. OK, I cannot and will not force anybody to find literate
programming important.

There are still proper tools missing to support LP. Thank you, Martin
and Franz, that now I know you are more or less on the LP side and find
this a good thing.

Anyway, I guess, what Waldek says should not be overlooked. Looking at
what I did during my work at aldor-combinat

http://www.risc.uni-linz.ac.at/people/hemmecke/aldor/combinat/

I must say that it was a time when I tried to figure out whether LP is
good or bad. Look at the text there, it's not perfect. Not perfect in
terms of structure of the text and not perfect to fit one's workflow.
And, although there is lot's of information, most people will probably
find an overview section missing. Yes, aldor-combinat is unfinished in
terms of documentation.

An important point that Waldek makes is that as the original programmer,
one knows all the background and just wants to code and refactor the
code. Documentation is just getting in the way.

Unfortunately, documentation is more often read than written and the
most important thing is that the documentation is for other people. In
particular for other programmers that want to understand the code and
improve it.

I think, we as the FriCAS team have to make a compromise to make
everyone happy. If Waldek doesn't (yet) like LP, we must accept it. As I
said in my last mail, however, there should at least be a pointer in the
.spad file (write it as -- comment) which points to a file or section
that contains the corresponding information.

Waldek, I'm so happy that you wrote that section in
http://groups.google.com/group/fricas-devel/browse_thread/thread/7bf2...
but please put it into the source code repository. For me it doesn't
matter where. I find that in the same .pamphlet file is the best option,
but if you like, open up another place, but put a link to that place in
the .spad file.

Whenever we become more than 100 developers (smiling a little), we can
be more strict with LP rules. But who knows when this will happen.
Waldek is right in the sense that FriCAS should be made interesting in
term of functionality in order to attract more people.

Ralf

From: Bill Page <address@hidden>
Date: Tue, 8 Dec 2009 13:46:55 -0500
Local: Tues, Dec 8 2009 1:46 pm
Subject: Re: [fricas-devel] Multiplication of power series


- Hide quoted text -
- Show quoted text -
On Sat, Dec 5, 2009 at 5:30 PM, Ralf Hemmecke wrote:
> On 12/05/2009 04:54 PM, Waldek Hebisch wrote:
>> Ralf Hemmecke wrote:
>>> Waldek,

>>> Since we have literate programming, I think your explanation would be
>>> best put just next to your implementation. Can you commit that?

>> No!!!  I _really_ do not want to see such texts inside source
>> files.  Such for given person such text may be useful once,
>> maybe few times, but having to read it or even only look at
>> it every time when looking at code is simply too distracting.

> Hmmm. OK, I cannot and will not force anybody to find literate
> programming important.

> There are still proper tools missing to support LP. Thank you, Martin
> and Franz, that now I know you are more or less on the LP side and
> find this a good thing.
> ...

Perhaps there are also people like me who have "lost faith" in the
concept of literate programming. When I first started with the Axiom
project several years ago I originally thought it was a good idea and
a reasonable response to the serious problem of lack of documentation
of the Axiom system. But now I agree with Waldek that almost always
the "documentation" part of the pamphlet files gets in the way of my
understanding of the code and work almost exclusively with the .spad
files until forced to re-assemble them into the pamphlet form.

I do not think that this is only a result of the lack of proper tools
to support LP although I have long argued that there are better
approaches than the one originally advocated by Donald Knuth. There
are multiple problems but I think the most serious problem is that the
methodology just does not fit the psychology of many (most?) of the
talented people who are able to do this kind of work - particularly
when they do it voluntarily. Advocated "re-education" and greater
self-discipline is simply impractical. And hoping that other motivated
users/developers will appear who are willing to put an effort into
documenting other people's code is simply unrealistic.

Having worked with Sage for awhile it seems to me that the "doc
string" methodology much better suits the current state-of-the-art and
state-of-mind in software development. Ironically this is very similar
to the ++ style comments in Spad and how they interact with hyperdoc.
I am very happy that Waldek has stated elsewhere that he considers the
hyperdoc documentation to be the "definitive source" for documentation
of FriCAS and that he spent considerable effort to resurrect the tools
from the original Axiom source code to help keep Hyperdoc up to date.
It seems that even more can be done in this direction to advance the
original approach developed before Axiom became an open source
project. For example as I understand it the Axiom book was originally
extracted from the same set of enhanced tex files combined with
documentation in the Axiom source code and derived by running Axiom
itself.

I think that it was a great experiment but in the end it is rather
unfortunate that Tim Daly choose to adopt a new "technology" based on
the Knuth literate programming approach when initiating the first open
source Axiom project rather than focusing on extending and expanding
the use of hyperdoc. I cannot see any evidence at all that the use of
literate programming as such in the original Axiom project has
contributed anything to the acceptance of Axiom in the potential
user/developer community.

Needless to repeat I suppose that this is of course only my personal
opinion. :-)

Regards,
Bill Page.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]