
Re: [Texmacs-dev] Table conversion library


From: david
Subject: Re: [Texmacs-dev] Table conversion library
Date: Thu, 6 Feb 2003 12:53:07 +0100
User-agent: Mutt/1.4i

On Thu, Feb 06, 2003 at 10:16:05AM +0100, Joris van der Hoeven wrote:
> 
> > Patches 1048 and 1071 are just regression tests for tmtex. So they are
> > independent of the table manipulation library.
> > 
> > However I see a use for a library of tools for regression testing. In
> > that case, the functions used to produce tm-tables for testing the
> > converter are usable by any converter. It is important that the
> > regression testing tools do not overlap with tools which do actual
> > work, because that reduces the likelihood of common failure modes.
> 
> That is OK, but I would like to concentrate on the precise
> functionality first.

I meant that I see no reason to hold patches 1048 and 1071.

The rest was a side note about my latest ideas on the evolution of the
test suite tools.

[...]
> If I understand you well, the idea of the parser is mainly to
> increase performance and simplicity of some of the algorithms. From
> the user's point of view, we are of course mainly interested in
> routines for extracting information from tables (or constructing
> tables for * -> TeXmacs converters).

There are two parts in this discussion:

  table parsing -- I just ported the existing functionality into a more
  maintainable and extensible form. My focus is not on performance
  but rather on *abstraction*.

  table production -- that discussion is still completely open. I
  think it would be good practice to start by repackaging existing
  code (so we remain pragmatic) whenever it turns out we need it
  somewhere else.


> > A table-default list is a list of 'twith' and 'cwith' elements. It is
> > meant to hold the default table settings which are set by the macro
> > expansion enclosing the table definition. For example 'block*' sets a
> > global border and centered horizontal alignment for all cells.
> 
> Notice that the default table list should be provided for free
> in most cases by evaluating the to-convert expression with
> the exception of all keywords which can be converted
> (cf. our previous discussion).

Yes, but typesetter evaluation from Scheme is not yet available. So
the 'default' parameter is currently required.

  Note: typesetter evaluation will probably require the converter to
  apply to a buffer (or a view) loaded with the document to export. It
  may be useful in the short term to use that approach so the init-env
  can be easily accessed by converters (e.g. to handle page size and
  margin information).
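
  To make the 'default' parameter concrete, here is a minimal sketch in
  Python (not the actual Scheme code). The tuple encoding, the helper
  names and the property names are all assumptions made for
  illustration; only the 'twith'/'cwith' distinction and the idea that
  'block*' supplies defaults come from the discussion above.

```python
# Hypothetical encoding of TeXmacs 'twith'/'cwith' elements as tuples:
#   ("twith", prop, value)                  -> whole-table setting
#   ("cwith", r1, r2, c1, c2, prop, value)  -> setting for a cell range
# The property names and values below are illustrative assumptions.
BLOCK_STAR_DEFAULTS = [
    ("cwith", 1, -1, 1, -1, "cell-halign", "c"),
    ("cwith", 1, -1, 1, -1, "cell-bborder", "1ln"),
]


def with_defaults(defaults, local_formats):
    """Return the format list a converter would parse: defaults first,
    then the table's own settings, so that later entries can override."""
    return list(defaults) + list(local_formats)


def table_property(formats, prop, fallback=None):
    """Look up a whole-table ('twith') property; the last entry wins."""
    value = fallback
    for entry in formats:
        if entry[0] == "twith" and entry[1] == prop:
            value = entry[2]
    return value


formats = with_defaults(
    BLOCK_STAR_DEFAULTS,
    [("twith", "table-width", "1par"),
     ("cwith", 1, 1, 1, -1, "cell-halign", "l")],
)
```

  Putting the defaults at the head of the list means the parser needs
  no special case for them: they are just the earliest, weakest
  settings.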


[..]
> > The table information is accessed by evaluating the table-parser
> > closure with different arguments. The first argument is a symbol and
> > denotes the scope of the property being accessed. Different scopes
> > require different additional arguments. Currently existing scopes are:
> > 
> > Iterator scopes:
[...]
> > Synthetic scope:
[...]
> > Physical scopes:
[...]
> 
> Yes, this distinction is OK.

Fine.
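
For readers who have not seen the closure style, here is a minimal
sketch in Python rather than Scheme. The scope names 'row', 'col' and
'cell' and their argument conventions are assumptions for illustration;
they are not the elided scope list above.

```python
def make_table_parser(rows):
    """Return a closure giving access to table data by scope.

    'rows' is the table content as a list of lists. The scope names
    and their arguments are hypothetical, chosen only to show the
    dispatch pattern.
    """
    def parser(scope, *args):
        if scope == "row":       # one extra argument: row index
            (i,) = args
            return list(rows[i])
        if scope == "col":       # one extra argument: column index
            (j,) = args
            return [row[j] for row in rows]
        if scope == "cell":      # two extra arguments: row and column
            i, j = args
            return rows[i][j]
        raise ValueError("unknown scope: %r" % (scope,))
    return parser


table = make_table_parser([["a", "b"], ["c", "d"]])
second_column = table("col", 1)
```

Client code only ever sees the closure, so the internal representation
can change without touching the converters.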

Also we could optionally make scopes behave smartly. For example, the
properties of a 'col' scope need not match the internal 'cwith'
definitions if all cells of the column override the 'col' settings.

That would make the exported structure more WYSIWYG. But maybe that
kind of transformation should rather reside in a tmtm simplification
module.


> > It must be noted that smaller scopes may provide information which is
> > not available to wider scopes. For example a column may define a
> > column-wide bottom border of "1ln", and one cell of this column may
> > define a bottom border of "0ln".
> 
> Absolutely. It is very important to be able to extract properties
> from wider scopes. For instance, when converting to LaTeX, you should
> first extract the properties for each column (the last cwith which
> covers the whole column) and put this information in the argument
> of \begin{array}{...}. Next, this information may be overridden locally.

That is also required if one wants to use the advanced table layout
features in HTML or CSS.
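
The wide-scope/narrow-scope lookup can be sketched as follows, in
Python for brevity. This assumes 'cwith' ranges are 1-based with
negative indices counted from the end, and that the last matching
'cwith' wins; the tuple encoding and the function names are
hypothetical.

```python
def resolve(i, n):
    """Map a 1-based index, possibly negative (from the end), into 1..n."""
    return i if i > 0 else n + 1 + i


def column_property(cwiths, nrows, ncols, col, prop):
    """Value set by the last cwith that covers the whole column."""
    value = None
    for r1, r2, c1, c2, p, v in cwiths:
        if p != prop:
            continue
        whole_column = resolve(r1, nrows) <= 1 and resolve(r2, nrows) >= nrows
        if whole_column and resolve(c1, ncols) <= col <= resolve(c2, ncols):
            value = v
    return value


def cell_property(cwiths, nrows, ncols, row, col, prop):
    """Value set by the last cwith that covers the given cell."""
    value = None
    for r1, r2, c1, c2, p, v in cwiths:
        if p != prop:
            continue
        in_rows = resolve(r1, nrows) <= row <= resolve(r2, nrows)
        in_cols = resolve(c1, ncols) <= col <= resolve(c2, ncols)
        if in_rows and in_cols:
            value = v
    return value


# A 3x2 table: column 2 has a bottom border, except for cell (2, 2).
cwiths = [
    (1, -1, 1, -1, "cell-halign", "c"),
    (1, -1, 2, 2, "cell-halign", "r"),
    (1, -1, 2, 2, "cell-bborder", "1ln"),
    (2, 2, 2, 2, "cell-bborder", "0ln"),
]
# Column-wide alignment gives the argument of \begin{array}{...}.
array_spec = "".join(
    column_property(cwiths, 3, 2, col, "cell-halign") or "l"
    for col in (1, 2)
)
```

A LaTeX converter would emit 'array_spec' once and then compare each
cell's own value against the column value to decide where a local
override is needed.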


[...]
> > Properties are named by symbols. They match table properties used by
> > the typesetter. Rows, columns and cells also define the extra property
> > 'content' which gives the table data. Rows content is the table data
> > as a list of lists in natural order, columns content is the table data
> > in transposed order, cell content is the actual content of the
> > designated TeXmacs cell element.
> 
> Yes. Notice that you might also want to extract content and properties
> simultaneously for certain purposes. In fact, the table paradigm is
> very rich and we will need to play with it...

What are you thinking of?

To me, "extracting content and properties simultaneously" rather
looks like table slicing. But I think the applicable algorithms can
generally be implemented using Scheme mapping and iteration constructs
with iterator scopes.
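
As a rough illustration (Python instead of Scheme, hypothetical helper
names, 0-based indices), transposed content and a simple slice that
pairs content with one property need nothing more than ordinary mapping
constructs:

```python
rows = [["a", "b", "c"],
        ["d", "e", "f"]]


def columns_of(rows):
    """Table data in transposed order: one list per column."""
    return [list(col) for col in zip(*rows)]


def slice_table(rows, cell_prop, row_range, col_range):
    """Pair each cell of a sub-range with one property value.

    'cell_prop' is any function (row, col) -> value; here it stands in
    for a property lookup on a cell scope.
    """
    return [[(rows[i][j], cell_prop(i, j)) for j in col_range]
            for i in row_range]


# Stand-in property lookup: last column right-aligned, others centered.
def halign(i, j):
    return "r" if j == 2 else "c"


columns = columns_of(rows)
piece = slice_table(rows, halign, range(0, 2), range(1, 3))
```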

I think there should be clearly separated submodules in the table
library: a table-parsing submodule and a table-construction submodule
which provide primitive functionality. More elaborate tools (like
table slicing) could be built on those primitives, so the parsing
would not get intermixed with higher-level features.

Two rules of thumb:

  Keep simple things simple (internally as well as for client code).

  Manage complexity with modularity.


> > Handling of subtables and decorations is not yet defined, but I
> > believe the current design to be generic enough.
> 
> Please don't bother about that yet. We first have to understand
> the semantics of the tformat/cwith construct better.

Ok.


> This really is a new, and interesting, level of abstraction which
> is not really present in XML.

I do not see how XML is relevant here. Markup languages are about
describing and transmitting data across modules. Here we are talking
about a processing model...

That makes me think that maybe some inspiration could be drawn from
W3C DOM specifications. Anyone here with expertise in that field?


> > There could also be some accessors for grouping scopes, which might be
> > useful to carry the meaning of cwith on ranges which are not simple
> > rows or columns. But maybe that would only be really meaningful when
> > TeXmacs has real support for row groups and column groups.
> 
> Notice that we also have subtable-scopes.

What do you mean? Is that about typesetter subtables?

 
> I also agree that we need better support for row groups and column groups.
> A good example of a situation where this is needed is numbered equation
> arrays. One would like to be able to have properties or even macros
> for saying that a row should be numbered or not.

This kind of feature could be implemented by a primitive which gives
read-access to the table-context of the current cell (which would
indeed be very useful). That is not what I mean when I mention row and
column groups.

The HTML4 spec has an interesting discussion about row and column
groups. From memory there are two kinds of groups:

  title groups -- <h:thead> <h:tfoot> <h:th>
    Groups which relate specially to the rest of the table. For
    example, when a table spans several pages vertically, its top and
    bottom title rows are repeated on each page. If the editor allows
    scrolling of the table contents, the title group may be
    decorations around the scrolling area.

    That is arguably very relevant to TeXmacs.

  body groups -- <h:tbody> <h:colgroup>
    Groups whose contents are semantically related. For example, in an
    accounting application, a transfer column and a running balance
    for the same account may belong to the same column group. They
    also probably need to have precise nesting semantics.

    I am unsure that is relevant to TeXmacs. It seems that one reason
    why that was included in HTML was to allow good aural rendering of
    tables.

Groups are significantly different from subtables because their
geometry is constrained by the table they belong to.

And while we are talking of new typesetter features for tables, I want
to point out that we may want more elaborate table property logic. For
example one may want to have rows with alternating background colors
(hey, Alvaro!).

This feature could be implemented by an ad-hoc cell property. Are
there other situations where more table logic would be needed? Knowing
them would let us look for an appropriately general solution.
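
To show why some logic would help, here is what alternating
backgrounds look like without it, as a Python sketch. The tuple
encoding, the helper name, and the property name and color are
assumptions for illustration.

```python
def striped_rows(nrows, color="pastel blue"):
    """One explicit cwith per even row (1-based), spanning all columns.

    The property name 'cell-background' and the color are illustrative.
    """
    return [("cwith", r, r, 1, -1, "cell-background", color)
            for r in range(2, nrows + 1, 2)]


stripes = striped_rows(5)
```

The list has to be regenerated whenever a row is inserted or removed,
which is exactly the kind of bookkeeping a more general property logic
would avoid.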


> > Maybe that is all crap. If you think so, please propose another design
> > which provides the same level of encapsulation and extensibility.
> 
> No, I think that you are rather close to my views on the subject.
> I would like you to regroup the code which is independent from
> tmtex in a new file though, so that it will be easy to study it.
> Also, it would be good to clearly separate the different parts of
> the library (parsing, extraction, etc.).

All right. You just need to coin a name and a function name prefix and
I will do it as part of my continuing cleanups.

The list iterator from tmhtml is also generally useful; it could also
be used to convert LaTeX sections to recursive sections.
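
For the section case, the conversion amounts to turning a flat list of
headings into a nested one. A generic sketch in Python follows; it is
not the tmhtml iterator itself, and the (level, title) encoding is an
assumption.

```python
def nest_sections(flat):
    """Turn [(level, title), ...] with levels >= 1 into a nested tree.

    Each node is {"title": ..., "children": [...]}. A stack holds the
    chain of currently open sections, outermost first.
    """
    root = {"title": None, "children": []}
    stack = [(0, root)]
    for level, title in flat:
        node = {"title": title, "children": []}
        # Close every open section at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]


tree = nest_sections([(1, "Intro"), (2, "Scope"), (2, "Terms"), (1, "Design")])
```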

-- 
David Allouche         | GNU TeXmacs -- Writing is a pleasure
Free software engineer |    http://www.texmacs.org
   http://ddaa.net     |    http://alqua.com/tmresources
   address@hidden  |    address@hidden
TeXmacs is NOT a LaTeX front-end and is unrelated to emacs.



