Re: lynx-dev Tables: getting started

From: Klaus Weide
Subject: Re: lynx-dev Tables: getting started
Date: Fri, 19 Nov 1999 03:30:17 -0600 (CST)

On Wed, 17 Nov 1999, Philip Webb wrote:

> assuming the basic 1-pass parse by Lynx, how could it handle tables
> to create something tabular & corresponding to authors' intentions?

There are already some exceptions to the "1-pass"-ness of lynx's HTML
handling (you are right to call it basic[ally] 1-pass).  Storing away
a copy of a TABLE and its contents, in a memory buffer or possibly a file,
when it is first encountered, would be possible.  Probably awkward,
and maybe not a good idea, but possible.  You don't *need* SOURCE_CACHE
for this.  This isn't the big problem - the problem is what to do
with the stuff stored away, and how to make the result of that doing
fit into lynx's HText & anchor structures & lists.

> i am assuming (following KW) that TRST is usually inadequate for this.

Well, it does create "something tabular & corresponding to authors'
intentions" when it *does* apply.  I wouldn't call that "usually
inadequate", rather "inadequate in the general case".  Whether it's
"usually" inadequate for you depends on what kinds of tables you are
"usually" viewing.  *I* find it adequate for many tables where lack
of table support used to be annoying.

> the first step is to focus on normal tables with good HTML:
> problems arising from less well-composed HTML can be dealt with later.

Always a good idea to limit the scope of the problem. :)

(Which is also justification for TRST's existence.)

> the user could read the table, save it or return to the previous page,
> esp if the re-rendered table were still a mess due to bad HTML;
> there would be nothing to force users to have a table re-rendered
> or to make TS active or even to compile it in the first place.
> somewhere in the page created by TS for the table
> there would be another link eg `Insert in document',
> following which would cause Lynx to insert the re-rendered table
> back into the larger document in place of the section
> between `Start of table' & `End of table' (inclusive).

You have already suggested something similar in
but without the added touch of 'Insert in document'.

Apart from that after-the-fact re-insertion, you envision (full) table
rendering as a separate process.  All the more reason to experiment
with external scripts!  In your approach, formatting of a table is
separate from rendering the rest of the document.  So I see little
reason why it *has to* be all done in the lynx code.  You might as
well dump the table markup into some file and let an external
script deal with that (ultimately feeding it back to lynx or not -
but that's a separable problem).  
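
To give a rough idea of what such an external script could start with
(a sketch only, nothing like this exists in the lynx sources): assume
the TABLE markup has already been dumped to a file and read into a
string.  The names TableRows and extract_rows are made up for this
example, it relies on Python's standard html.parser module, and nested
tables are not handled.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <th>/<td> cell, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        # Finish the current cell, collapsing runs of whitespace.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()          # tolerate a missing </tr>
            self._row = []
        elif tag in ("th", "td"):
            self._close_cell()         # tolerate a missing </th> or </td>
            if self._row is None:
                self._row = []
            self._cell = []
        elif tag == "br" and self._cell is not None:
            self._cell.append(" ")     # a line break becomes a plain space
        # <p> and all other tags are ignored; only their text is kept

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def extract_rows(markup):
    """Return the table in `markup` as a list of rows of cell text."""
    parser = TableRows()
    parser.feed(markup)
    parser.close()
    parser._close_row()                # flush a row left open at end of input
    return parser.rows
```

Whatever formats the rows afterwards only ever sees lists of strings,
which is about as far from HText and the anchor lists as one can get.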

> once upon a time, all this would have been scarcely thinkable,
> as Lynx would have had to go get the source all over again,
> but now source-cacheing is available & would be recommended with TS:
> all Lynx has to do is find the right section of the source to process.

There is no simple way to do this, if you think in terms of byte count
pointers into the raw 'source' form.  Several layers of buffering separate
the parsing from the original byte form.

> to process a table -- we're assuming the HTML is not badly invalid --
> Lynx would concentrate on the tags <tr> <th> <td> + closings,
> ignoring <p> <br> or anything else which might upset formatting;
> it would use < ... width=n% > & -width to calculate how many columns
> to allow for each <th> <td> , wrapping text/numbers when necessary;
> instructions to centre/justify would be followed, insofar as possible.

Trying to fit all this into HTML.c (or some other place within the
usual stages of processing) is a big problem.  You'd want to do
it in a separate stage, giving you a clean slate in a way.  And
then that separate stage might just as well be implemented in an
external process.
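
For illustration, here is a minimal sketch of what such a separate
stage might do once it has the cell text as plain rows of strings.
It is an assumption-laden toy, not a description of lynx code: the
function names column_widths and render are invented, the percentages
stand in for width=n% attributes, `total` stands in for the screen
width, and alignment attributes are ignored.

```python
import textwrap


def column_widths(ncols, total, percents=None, gap=1):
    """Split `total` screen columns among `ncols` table columns.

    `percents` plays the role of width=n% hints; without hints the
    columns share the space equally.
    """
    usable = total - gap * (ncols - 1)
    if percents is None:
        percents = [100.0 / ncols] * ncols
    scale = sum(percents)
    return [max(1, int(usable * p / scale)) for p in percents]


def render(rows, total=80, percents=None, gap=1):
    """Format rows of cell text as aligned, wrapped plain-text columns."""
    ncols = max(len(r) for r in rows)
    widths = column_widths(ncols, total, percents, gap)
    out = []
    for row in rows:
        # Pad short rows so every row has the same number of cells.
        cells = list(row) + [""] * (ncols - len(row))
        # Wrap each cell to its column width; an empty cell is one blank line.
        wrapped = [textwrap.wrap(c, w) or [""] for c, w in zip(cells, widths)]
        height = max(len(col) for col in wrapped)
        for i in range(height):
            parts = [
                (col[i] if i < len(col) else "").ljust(w)
                for col, w in zip(wrapped, widths)
            ]
            out.append((" " * gap).join(parts).rstrip())
    return "\n".join(out)
```

Nothing here needs to know about the rest of the document, which is
the point: the same logic could sit in a script fed by lynx just as
well as in a new stage inside lynx.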

