axiom-developer
[Top][All Lists]

## Re: [Axiom-developer] gui/axiom research question

 From: C Y Subject: Re: [Axiom-developer] gui/axiom research question Date: Fri, 20 Jan 2006 19:24:53 -0800 (PST)

```--- root <address@hidden> wrote:

> perhaps this is a good "summer of code" problem
>
> Reading Soiffer's PhD thesis (The Design of a User Interface
> for Computer Algebra Systems) raises an interesting research
> question.
>
> assume you're given a standard page size (8 1/2 x 11, A4,
> display size) and you have an equation to format in some
> understandable way  how can you format the equation to fit
> on one page.

That's probably the major remaining question for formatting mathematics
:-).

> the underlying "assumption" is that equations longer than one\
> page are unreadable and useless.

Not necessarily.  I think we can treat the "vertical" direction as a
series of lines of varying heights, and come up with some reasonable
algorithms to handle most cases.  There are obviously some corner cases
where nothing will make it look good, but in most cases I think
multi-page is probably not as hard as the "line breaking" problem.

> we allow a variety of techniques:
>
> * line breaks when the equation hits the edge

Right - that's the first trigger that any sort of line breaking is
needed.

> * common subexpression elimination when it can occur
>
>     R = (x+y) * (z+(x*y)) ==>
>
>     let M = (x+y)
>         R = M * (z+M)
>
>     so that you can "name" common subexpressions and lift them

Does anyone use this?

> * function reification
>
>     "name" a subexpression that requires a parameter,
>     lift it out, and substitute a parameterized term.
>     in a tree or DAG representation this is node-naming
>
> * term reification
>
>     "name" a term, lift it out, and substitute
>     in a tree or DAG representation this is node-naming

I think these are less common in practice but I like them as an option
- it can sometimes help reveal "high level structure" in a busy
expression.

> * term summarization
>
>     replacing a (reordered) subset of the terms by a summation
>
> * eliding leading, trailing, or middle terms
>
>      term + term + ... + term

I'll have to think about these...

> * linearizing terms
>
>     terms, such as fractions, can be rewritten in linear
>     form to save lines

This is of course the ultimate fallback - linearize the equation and
use normal text linebreaking equations.

> * constant naming
>
>     long constants replaced by short names:
>
>
>      R = 3.77612876767 * foo ==>
>
>      let M = 3.77612876767
>          R = M * foo

Isn't that a special case of term naming?

> * pattern naming
>
>     turn 2D template structures (powers/ratios/matricies/polys) into
>     "named" template structures and substitute
>
>     +-   -+   +-   -+
>     | 1 2 |   | 5 6 |
>     |     | * |     |
>     | 3 4 |   | 7 8 |
>     +-   -+   +-   -+
>
>     turns into
>
>     A = Matrix((1 2) (3 4))
>     B = Matrix((5 6) (7 8))
>     MatrixProduct(A,B)

Another case of term naming, although with the twist that * needs to be
identified as MatrixProduct if the terms are no longer "visually"
matricies.

> * operator names for all template structures and linear
> versions

Sort of a "temporary function" definition?

> * "outer structure" recognition
>
>    is it fundamentally a matrix? a polynomial? an integral?
>    the layout techniques could differ based on the outer
>    structure

Do you mean "identify the top level mathematical operators?"  An
expression could be a sum and yet have one of the terms be determinent
of a large matrix (large meaning larger than page width, in this case).
I think reasonable rulesets can be devised, and I hope as I work my
way through this thesis it will help align my thinking :-).

> * "inner structure" eliding
>
>    clip out "heavy" element in a matrix so it does not cause
>    wide columns

Hmm.  Not sure how that could be handled.

> * *depth*, *width*, *height* maximums as parameters

Those are ultimately constrained by the page, with page width being an
absolute limit and page height being infinite but requiring breaking
over pages.  Not sure what depth would refer to.

> ultimately i think this boils down to a question of embedding a
> tree or a DAG into a planar graph. or perhaps this is an
> extension of the TeX layout algorithm with each weights
> assigned to the boxes?

I tend to regard it as somewhere inbetween - boxes are the virtually
universal metaphor for algorithm description in this situation, but
things like strict super and subscript rules tend to make it a little
more complex than a normal tree or DAG (unless I'm thinking about those
terms incorrectly).

> compute the weight (area?) of a node in the tree or DAG
> and do some sort of weight-reduction?

I think it depends a lot on the situation and the mathematics involved.
Take, for example, a large nxn matrix larger than both page width and
height (let's say for the sake of argument it contains only integers
from 0-9, or more correctly that each element is already atomic in
size. It could only be displayed in the confines of a page/book
environment by either hiding some part of the matrix, or something like
the following:

Page 1

1   2   3   4   5
-------------------
1 | 1   2   3   4   5  .
2 | 1   2   3   4   5  .
3 | 1   2   3   4   5  .
4 | 1   2   3   4   5  .
5 | 1   2   3   4   5  .
6 | 1   2   3   4   5  .
7 | 1   2   3   4   5  .
- - - - - - - - -

Page 2

6   7   8   9   10
-------------------
1 | 1   2   3   4   5  .
2 | 1   2   3   4   5  .
3 | 1   2   3   4   5  .
4 | 1   2   3   4   5  .
5 | 1   2   3   4   5  .
6 | 1   2   3   4   5  .
7 | 1   2   3   4   5  .
- - - - - - - - - - -

Page 3

1   2   3   4   5
- - - - - - - - - -
8 | 1   2   3   4   5  .
9 | 1   2   3   4   5  .
10 | 1   2   3   4   5  .
11 | 1   2   3   4   5  .
12 | 1   2   3   4   5  .
13 | 1   2   3   4   5  .
14 | 1   2   3   4   5  .
--------------------

Page 4

6   7   8   9   10
- - - - - - - - - -
8 . 1   2   3   4   5  |
9 . 1   2   3   4   5  |
10 . 1   2   3   4   5  |
11 . 1   2   3   4   5  |
12 . 1   2   3   4   5  |
13 . 1   2   3   4   5  |
14 . 1   2   3   4   5  |
--------------------

It's arguable whether or not that would be useful, but it would
preserve both information concerning the location of all elements in
the matrix and the complete set of matrix data.  Of course, there are
even more interesting limit cases, such as what happens if one or more
columns or rows is excessively large, there is a situation where a
single column would occupy a page, the column numbers get so large they
widen the column width unreasonably, the row numbers get so large they
require a large fraction of the page area, etc.  I think, in those
situations, the response must be to collapse back to the 1-D ASCII
display, perhaps with some helpful formatting for each element (would
need to think about that one.)

That suggests a general mechanism that had already occurred to me - try
every graphical trick defined to fit a formatted expression onto the
page(s), and if all else fails use the 1-D expression which will always
work.  As more techniques and styles are included the need to fall back
to 1-D will hopefully become rare, but that way the system never hits a
case where it can't produce at least some kind of document.

I don't know if that fits into the DAG/Tree scheme or not - I suppose
it might and I'm just not seeing it.  Fascinating stuff - it's too bad
there's no longer any way to get funding for this kind of work :-).
Still, even without it a literate pamphlet which systematically
discusses and implements all of this would be a lot of fun.  AFAICT
research on this sort of dried up in the early to mid 1990s, with the
commercial GUIs making polishing and incremental improvements.  (Which
are nice, don't get me wrong...)  Subsequent efforts seem to be focused
on web based technologies and techniques.  I'm afraid I don't quite see
the point of that, since to my mind a web interface is just another
instance of a mathematical interface that happens to be rather
constrained, and its merely a question of what techniques are available
(or maybe technologies in web browsers) to implement general GUI ideas.
I think that's probably an injustice, since I've never really liked
the idea of web applications in the first place.  Those papers deserve
study, but I think a good LOCAL GUI should come first, with the
interface logic defined in such a way that it is easy to plate on any
given toolkit or graphical environment.  Call it "server side
intelligent formatting", I guess.

Anyway, more reading to do.  I won't plunge into this in a big way for
a while (I've still got a units package to do, after all) and I think
we will find that the first step will be to implement the ideas of TeX
and maybe lout in Lisp (cl-typesetting may be a good platform to use if
we can convince the author to release it under a garden variety
Modified BSD license) and build off of that logic to implement
automated display algorithms for confined 2D formatting.  Knuth knew
leave it to the author to decide how to handle a particular situation.
We don't have that luxury unfortunately, and although an override
mechanism is a good idea to provide as a "last resort" sort of thing
for the most part we will have to implement the best available
solutions, whatever they may be.

My current thought is to take TeX - the Program and the lout design
documentation, make a coherent design from those and turn
cl-typesetting (if the license is cleared up) into a high powered Lisp
Typesetter.  Possibly we would want to also implement the ability to
render TeX documents, in order to have a UI which uses our Pamphlet
format as it's native file format for notebooks :-).  Then implement
the 2D layout algorithms.  After that, proceed to define other
interface logic and the connections to Axiom.

A lot of work, but as someone said once "nothing worth doing is easy."

Cheers,
CY

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

```