[Axiom-developer] Crystal, Pamphlets and LaTex


From: William Sit
Subject: [Axiom-developer] Crystal, Pamphlets and LaTex
Date: Sat, 21 Jul 2007 02:20:42 -0400

Epilog: I must be writing this after smoking ... . I have
put this away in my draft folder for a few days because I
know little about databases.  Encouraged by Stephen's
prompt, I have finally decided to submit it for whatever it is
not worth. I suppose I could have rephrased things in question
form, given that I do not know the internals of how Axiom's
databases are created and used (which might for all I know
have been already in some form of structure that is
supportive of the Crystal vision). So if this makes any
sense, we can continue to discuss it (but don't expect me to
do any implementation). Otherwise, it's ok to tell me that's
nonsense -- I learnt that from Stephen :-). If my uncooked
ideas help raise questions on what infrastructure Crystal
would need, that is good enough for the effort.

(no, I don't smoke, but I am following Stephen's pipe dream
perhaps)


Stephen wrote:

> How do you define the database? How is it going to be constructed,
> defined? You need to write it down somewhere, and in the case of
> axiom, it will be written down in a pamphlet. May as well let the
> pamphlets communicate the structure.

I will argue that according to my understanding of the
Crystal vision, pamphlets are a very poor way to support the
vision.

If I understand Tim's vision correctly, the Crystal concept
is nothing new. In fact, we are experiencing it almost every
time we open a new web page (read the news,  shop on the
internet, do a google search, etc.)  If I go to any news
site, I can see the world view, the US view, or any other
geographical region in the world; I can also see the
business view, the technology view, the health view, and so
on. With each click, we are provided with a new facet,
tailored to provide not only the information asked, but also
additional and peripheral info (like ads, etc.). The web page
does not preexist in a single (html) file. Rather, the
single html file that is displayed is created on the fly. It
is not self-contained (because it loads other files), but it
is the "master" composer for a particular facet.

For the Crystal project, a pamphlet file should be equated
to such a modern web page (and even exceed it in
imagination) that masters a facet, because, as the ultimate
document seen by the user, that facet document should be
literate and Tim has argued that pamphlet files are for
literate programming. However, such documents should never be
pre-composed, but should be generated on the fly from a
database, where the individual items may have different
formats (pictures, gifs, movies, etc. in the case of an html
file; lisp code, boot code, spad code, documentation,
graphs, research articles, etc., some perhaps as fragments
of a pamphlet file, in the case of a facet). A facet can
lead to other facets or subfacets. On display, all subfacets
should be live and clickable.

If I want to work with elliptic curves, I want examples, I
want links to its definition, its theory, its code, I want
to see a graph, I want to see how it is used in
cryptography, I want to see how it is an algebraic group, or
an abelian variety. I want to be able to compute addition on
its points. Each of these can be displayed in a subfacet
document. How can a single elliptic.spad.pamphlet be able to
satisfy these diverse demands? Its graph probably would be
better handled by a general program that plots any graph, or
picked from a library of stock examples. Its algebraic and
algebraic-geometric structure should be taken from some
articles about abelian varieties in general. Its use in
cryptography would be better supplied from a subdatabase on
cryptography. The information should be very modular to
allow efficient storage and multiple use, and be able to be
mixed at will.

If I want to build a business like Amazon.com, I don't
create all the documents about an item in an item.pamphlet
file where we include all information on the item (best
price, manufacturers, specs, consumer reviews, related
items, companies selling them, local shops, shipping
information, etc.). I would use a database (for example, a
live relational one), from which I can piece this information
together and present it in a single web page. I would
also collect user information and tailor the ads to the user's
recent interests. The Crystal project should be no
different! Each facet should be tailored to the user's
interest.

A relational database is efficient on multidimensional
searches and storage (each piece of information is stored
once). The database stores what might be called atomic
information, little pieces of indivisible information.
Exactly what is indivisible depends on the applications the
relational database is to support. Data may be categorized
(into domain types) by attributes, and organized and stored in
hierarchical file structures. Relations among them are
captured in tables or schemas.  Management programs allow
input of the data and relations (a multi-window environment
fancier than what Stephen hypothesized). A query language
allows searches that provide customized views of the
relevant information (which may be called facets).
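
To make the paragraph above a little more concrete, here is a
minimal sketch using Python's standard sqlite3 module. It only
illustrates the idea of atomic items stored once and pulled
together by a query. The table names, column names and sample
rows are all made up for illustration and say nothing about how
Axiom's databases are actually organized.

```python
import sqlite3

# Hypothetical schema: every table, column and row below is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,   -- e.g. 'documentation', 'spad code'
    body TEXT NOT NULL    -- the atomic piece itself, stored exactly once
);
CREATE TABLE topic (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE item_topic (  -- the relation: which item belongs to which topic
    item_id  INTEGER NOT NULL REFERENCES item(id),
    topic_id INTEGER NOT NULL REFERENCES topic(id),
    PRIMARY KEY (item_id, topic_id)
);
""")

conn.executemany("INSERT INTO item VALUES (?, ?, ?)", [
    (1, "documentation", "Definition of an elliptic curve"),
    (2, "spad code", "placeholder for point-addition code"),
    (3, "documentation", "Use of curves in key exchange"),
])
conn.executemany("INSERT INTO topic VALUES (?, ?)", [
    (1, "elliptic curves"),
    (2, "cryptography"),
])
# Item 3 is stored once but appears under two topics.
conn.executemany("INSERT INTO item_topic VALUES (?, ?)", [
    (1, 1), (2, 1), (3, 1), (3, 2),
])


def facet(connection, topic_name):
    """Return the (kind, body) pairs that make up the facet for one topic."""
    rows = connection.execute(
        """
        SELECT item.kind, item.body
        FROM item
        JOIN item_topic ON item.id = item_topic.item_id
        JOIN topic ON topic.id = item_topic.topic_id
        WHERE topic.name = ?
        ORDER BY item.id
        """,
        (topic_name,),
    )
    return rows.fetchall()
```

Calling facet(conn, "cryptography") and facet(conn, "elliptic
curves") returns different views built from the same stored
items, which is the point of the relational approach.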

For the Axiom project, what are the atomic objects? On the
algebra level, and even at the boot and lisp level, they are
functions (and associated documentation, which will be
discussed later). Yes, just functions: these include
operating system utilities, category constructors, domain
constructors, packages, and exported functions. So in the
Crystal database, the organization should be around
functions. With each function, we need to define what its
attributes are, what info should go with it, what
mathematical concepts go with it, what computer science
concepts go with it, what theorems go with it, what
algorithms go with it, what the relevant references are
(perhaps sorted with a weighting system), what other
functions it depends on, what its usage is (all the things
you can ask at hyperdoc), what an example of its call is,
and what other functions use it (these include domains and
packages that include it as an export). All these make up
the content of a literate document for the function.
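
As a rough sketch of what such a per-function record might look
like, the following Python dataclass lists some of the
attributes named above. The field names, the helper used_by and
the function names in the sample data are all hypothetical. They
are not taken from Axiom.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FunctionEntry:
    """One atomic object in the hypothetical Crystal database."""
    name: str
    documentation: str = ""
    math_concepts: List[str] = field(default_factory=list)
    cs_concepts: List[str] = field(default_factory=list)
    # (weight, citation) pairs, so references can be sorted by weight.
    references: List[Tuple[float, str]] = field(default_factory=list)
    depends_on: List[str] = field(default_factory=list)
    example_call: str = ""


def used_by(entries, name):
    """Answer 'what other functions use this?' from the stored dependencies."""
    return sorted(e.name for e in entries if name in e.depends_on)


# Made-up sample data, for illustration only.
entries = [
    FunctionEntry(name="addPoints", math_concepts=["group law"]),
    FunctionEntry(name="scalarMultiply", depends_on=["addPoints"]),
    FunctionEntry(name="plotCurve"),
]
```

Here used_by(entries, "addPoints") reports scalarMultiply, so
the "used by" relation is derived from the stored "depends on"
data rather than being written down a second time.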

Notations used in defining a function should be
"relocatable", that is, they are dummies, so that when
different functions are pieced into a coherent facet
document, the dummy names do not conflict with each other.
An analogy is the link and load stages that turn object code
into an executable. Another is the references using \label,
\ref, and \cite in LaTeX, which are "relocatable" (we can easily
physically relocate the referenced items without affecting
the references). We would do this on a document level where
"memory location" or "offsets" are now identifiers and
mathematical notations. This may not be too important for
code, as the boundaries of code are usually clear. However,
for documentation and the notations used in documentation,
there is potential for conflicts because the boundaries are
no longer clear.
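
One simple way to picture "relocation" at the document level is
to give each fragment its own namespace when fragments are
combined. The Python sketch below does this for \label and \ref
only. The function names and the prefixing scheme are
assumptions made for illustration, and relocating mathematical
notation in general would be a much harder problem than this.

```python
import re

# Matches \label{key} and \ref{key}. \cite is deliberately left alone,
# on the assumption that bibliography keys are shared across fragments.
_LABEL_OR_REF = re.compile(r"\\(label|ref)\{([^}]*)\}")


def relocate(fragment, prefix):
    """Prefix every \\label and \\ref key in one fragment with its namespace."""
    def rename(match):
        command, key = match.group(1), match.group(2)
        return "\\%s{%s:%s}" % (command, prefix, key)
    return _LABEL_OR_REF.sub(rename, fragment)


def combine(fragments):
    """Join named fragments into one text whose label keys cannot collide."""
    return "\n".join(relocate(text, name) for name, text in fragments.items())


# Two fragments that both use the dummy key 'eq1'.
fragments = {
    "f1": r"Define \label{eq1} here and use \ref{eq1} later.",
    "f2": r"A different \label{eq1} with its own \ref{eq1}.",
}
combined = combine(fragments)
```

After combining, the first fragment refers to f1:eq1 and the
second to f2:eq1, so the two dummy names no longer conflict.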

Similarly, the (literate) documentation parts that explain
the code in a function should be as specific to the function
as possible, but liberally using cross-referencing
techniques (which are not yet designed at this time) and
document "glue" (again not yet designed) to be combined with
other documentation parts of other functions. By being
specific, such literate documentation can remain small and
easier to write. Some "glue" may have to be supplied
interactively by a knowledgeable human at first, but these
"glue" documents will be stored and reused. LaTeX is, in
some ways, a good example of a document system that allows
piecing together other "bits" like a table of contents, index,
bibliography, graphs and figures, tables, etc. In the case of
LaTeX, the "glue" is minimal and is more or less technical
rather than literate.

A flat model, such as a linear pamphlet file, will not be
suitable, because the interrelationships among functions
(especially across pamphlet files) are not explicitly
stored. For now, a linear flat structure may satisfy our
current needs (yes, we can create new latex commands and
latex packages to handle all the embedded constructs,
including new code chunks, running some code outside of the
LaTeX environment and inserting the console transcript back
into the document). We already do this all the time, say
when embedding an eps figure, or making a url in a pdf file
live, or creating a TOC or an index. But all this
is linear processing (even if we need multiple passes). What we
need is searching, selection, and most importantly for
literate programming, documentation "glue", to present a
facet, not as a discrete set of related objects, but as a
literary piece. Even setting literate programming aside,
just getting the data would be difficult and
inefficient if the info is locked inside pamphlet (or other
linear flat) files with no inter-referencing structure to
connect them (current labelled LaTeX references and bib-items
are local to one document, unlike url items; even html files
do better in this respect). Such pamphlet files would only be
good for what
each pamphlet file already contains. It would be difficult
to use them to support the Crystal vision.

Even with a relational database infrastructure, it would
only support one aspect of the facet process, namely,
fetching the data to be incorporated into the facet page.
Still, a relational database supports this fetching or
searching much better than flat files. A flat file needs to
be parsed to obtain the individual atomic information.
There should be a compile (or select) phase, a link phase, a
load phase and a run phase on the current type of pamphlets,
repeated (as a second pass) if further resolution is needed.
The end result is a fragment of a latex file with the
documentation, code, test run and results, ready to be used
in a facet pamphlet.
Question: what information does our current compile stage
collect and is the way it is stored supportive of the
Crystal vision?
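
To show what "parsing a flat file to obtain the individual
atomic information" involves, here is a minimal Python sketch of
the select step. It assumes the pamphlet uses noweb-style chunk
markers, where a line of the form <<name>>= opens a code chunk
and a line beginning with @ closes it. The function name and the
sample text are made up, and real pamphlets have features (such
as chunk references inside chunks) that this sketch ignores.

```python
import re

# A chunk definition line looks like:  <<chunk name>>=
_CHUNK_START = re.compile(r"^<<(.+)>>=\s*$")


def extract_chunks(text):
    """Return {chunk name: code} for every code chunk in a pamphlet text.

    Repeated definitions of the same name are concatenated in order.
    """
    chunks = {}
    current = None  # name of the chunk being read, or None while in prose
    for line in text.splitlines():
        start = _CHUNK_START.match(line)
        if start:
            current = start.group(1)
            chunks.setdefault(current, [])
        elif current is not None and (line == "@" or line.startswith("@ ")):
            current = None  # back to documentation
        elif current is not None:
            chunks[current].append(line)
    return {name: "\n".join(lines) for name, lines in chunks.items()}


# Made-up pamphlet text, for illustration only.
sample = "\n".join([
    r"\section{Demo}",
    "Some prose about the code.",
    "<<demo code>>=",
    "line one",
    "line two",
    "@",
    "More prose.",
])
```

Everything outside the chunks is simply skipped here, which is
exactly the weakness discussed above: the prose that surrounds a
chunk is not attached to it in any structured way.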

The difficulty in the facet process is not the mechanical
processes, but the non-mechanical process of supplying the
glue that makes the resulting output a "literate" piece of
prose. A pamphlet is always written with one purpose,
documentation, but its glue is very particular to
the structure of THAT document. When the pieces need to be
taken out of context and combined with other similar pieces
from another pamphlet to form a facet
document, there is either no glue or only out-of-context,
irrelevant glue residue.
A relational database would avoid the need for parsers of
embedded chunks from a flat linear pamphlet.

So perhaps my conclusion may be summarized thus: pamphlet
files are not a good input device, but can be a good output
device. This brings us back to the fancy GUI. I think it
should be like any developer's interface for languages.
Those of you who deal with developing code day-in day-out
can imagine a better interface. In my inexperienced opinion,
it should consist of multiple windows (of course they don't
have to be open simultaneously):

     assembly language window
     lisp code window
     boot window
     spad code window
     interpreter (console) window
     aldor window
     graphics window
     latex window (editor for documentation or theory)
     keyword and facet window (or relations window)
     help window

each window with its content wrapped in a minipamphlet
fragment for formatting, as well as a plain file without
wrappers.

Even though we do not yet have this GUI for input, we can
discuss the file structure, search strategies, and
experiment with a small part of Axiom where we build the
supportive database by a combination of automated tools and
manual inputs (such as documentation). The Crystal project,
if it is ever to be realized, must start with a small
prototype, and we must have a design for the infrastructure.

William












