
Re: lynx-dev Lynx 2.7.1 and 2.8 refuse to render certain HTML documents

From: Bela Lubkin
Subject: Re: lynx-dev Lynx 2.7.1 and 2.8 refuse to render certain HTML documents
Date: Fri, 8 May 1998 04:14:00 -0700

Foteos Macrides wrote:

> Bela Lubkin <address@hidden> wrote:
> >[...]
> >The implementation I'm hoping for is simple.  Lynx currently caches
> >rendered documents.  I want it to _optionally_ cache source documents as
> >well.  The existing structure becomes two.  The source cache is very
> >simple, because it should be a pure bytestream as received from the
> >server, suitable for re-parsing by any newly requested strategy.
>       At least that much also would be important in conjunction
> with implementing valid CSS2 style sheet support, and more of HTTP/1.1,
> without which the long-term viability of Lynx as an Internet browser
> is questionable, IMHO.  But I don't see why you're ruling out an
> additional option for disk caching, particularly for personal PCs
> which may have Gbyte disks but not as much spare memory, and on which
> security/privacy concerns about caches may not be as great as on
> multiuser systems.

I didn't intend to rule it out.  I only meant that the way *I* would
configure this hypothetical feature, for my own use, would cache
everything in-core.  Caching on disk is another perfectly reasonable
alternative which would also be worthwhile to include.

For my purposes, running on a working Unix system, caching in core
*becomes* caching on disk if I run low on memory: it gets pushed out
to swap.  I trust the kernel to handle this about as efficiently as Lynx
would with its own on-disk cache.  And by letting the kernel do it, I
get to take advantage of the usually-ample RAM, so that in practice it
rarely goes out to disk.

So each cached document has:

  - a source bytestream, exactly as received via HTTP, FTP, file, etc.
  - a parsed representation, in Lynx's internal parse tree format
    (tagged appropriately so that Lynx knows under what conditions it
     was built -- "source" vs. HTML parsed, "tagsoup" vs. "SortaSGML",
     "images on" vs. "images off", source charset, etc.   I would
     expect display charset to be handled at display time and not
     require a re-parse when changed, but I see it does currently
     re-download the document...)

Each of which may be:

  - not present
    (for source bytestream, re-download it; for parsed representation,
     re-parse from source bytestream)
  - in core
  - in a file on disk
    (in either of these cases, for parsed representation, re-parse from
     source bytestream if any of the parsing conditions have changed)
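
A rough sketch of the lookup logic described above, in Python rather than
Lynx's own C, purely for illustration.  Every name here (Location,
ParseConditions, CachedDocument, get_parsed) is hypothetical and none of it
exists in Lynx; the on-disk state is only a tag, with no real file I/O, and
the condition values are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Location(Enum):
    ABSENT = "not present"
    IN_CORE = "in core"
    ON_DISK = "in a file on disk"  # only a tag here; no file I/O is done


@dataclass(frozen=True)
class ParseConditions:
    """Hypothetical tags recording how a parsed representation was built."""
    view_source: bool = False
    parser: str = "tagsoup"          # placeholder value
    images: bool = False
    source_charset: str = "iso-8859-1"


@dataclass
class CachedDocument:
    source: Optional[bytes] = None
    source_location: Location = Location.ABSENT
    parsed: Optional[object] = None
    parsed_location: Location = Location.ABSENT
    parsed_conditions: Optional[ParseConditions] = None


def get_parsed(
    doc: CachedDocument,
    wanted: ParseConditions,
    download: Callable[[], bytes],
    parse: Callable[[bytes, ParseConditions], object],
) -> object:
    """Return a parsed representation built under the wanted conditions."""
    # A parsed representation is reusable only if it is present and was
    # built under exactly the conditions now requested.
    if (doc.parsed_location is not Location.ABSENT
            and doc.parsed_conditions == wanted):
        return doc.parsed

    # Otherwise we need the source bytestream; re-download only if absent.
    if doc.source_location is Location.ABSENT:
        doc.source = download()
        doc.source_location = Location.IN_CORE

    # Re-parse from the cached source under the new conditions.
    doc.parsed = parse(doc.source, wanted)
    doc.parsed_conditions = wanted
    doc.parsed_location = Location.IN_CORE
    return doc.parsed
```

The point of the sketch is that changing a parsing condition costs one
re-parse, never a second trip to the server, as long as the source
bytestream is still cached.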

Optionally, forget the parsed representation when moving to a different
document, keeping only the source bytestream.  This saves memory (or
disk) at the expense of CPU time when revisiting the page.
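
The trade-off could look something like the following, again as an
illustrative Python sketch and not anything Lynx actually does.  The class
SourceOnlyCache, its keep_parsed flag, and the download/parse callables are
all made-up names for the purpose of the example.

```python
from typing import Callable, Dict, Optional


class SourceOnlyCache:
    """Hypothetical cache that can drop parsed forms on navigation."""

    def __init__(self, keep_parsed: bool = False) -> None:
        self.keep_parsed = keep_parsed
        self.source: Dict[str, bytes] = {}
        self.parsed: Dict[str, object] = {}
        self.current: Optional[str] = None

    def visit(
        self,
        url: str,
        download: Callable[[str], bytes],
        parse: Callable[[bytes], object],
    ) -> object:
        # On leaving a document, optionally forget its parsed form and
        # keep only the source bytestream.
        leaving = self.current is not None and self.current != url
        if leaving and not self.keep_parsed:
            self.parsed.pop(self.current, None)

        # The source bytestream is fetched at most once per URL.
        if url not in self.source:
            self.source[url] = download(url)

        # Revisiting costs a re-parse (CPU), not a re-download (network).
        if url not in self.parsed:
            self.parsed[url] = parse(self.source[url])

        self.current = url
        return self.parsed[url]
```

With keep_parsed off, going A, B, A downloads each page once but parses A
twice; with it on, A is parsed once and both parsed forms stay resident.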

