Re: lynx-dev Why doesn't lynx cache HTML source?
From: Bela Lubkin
Subject: Re: lynx-dev Why doesn't lynx cache HTML source?
Date: Mon, 9 Nov 1998 23:49:13 -0800
Chuck Martin wrote:
> Sometimes I want to view the HTML source of a document using the "\"
> key, and then switch back to the rendered document when I'm through,
> or I want to switch all images to links and back again, using the "*"
> key, or maybe I want to turn pseudo-ALTs on or off, but every time I
> make a switch in how I view a document, the whole document has to be
> downloaded again. Is there a reason why it was done this way? It
> seems to me that caching the HTML source instead of the rendered
> document would make more sense, and would save time when making these
> changes on the fly. Of course, moving between cached documents might
> be a little slower unless they were cached both ways (source and
> rendered), but not by much, and rendering a cached document would be
> much faster than downloading it repeatedly. Could someone enlighten
> me?
This issue has been discussed extensively in the past; search the
Lynx-Dev archives for the details. I used www.altavista.com's "advanced
search" to search for "host:www.flora.org AND source AND cach* AND
text:subject AND text:html", finding 59 matches.
To summarize briefly: Lynx uses a single-pass recursive descent parser,
consuming the HTML source and producing rendered output on the fly.
Only that rendered output is cached. Since Lynx has many operations
which require it to re-parse the HTML source, many users have suggested
that Lynx cache the source as well. This usually leads to a discussion
of pros and cons, some of which are:
Con:
- would add to the complexity of Lynx
- caching rules are very difficult to get right
- would add code, increasing the size of Lynx source and binary
- would increase the in-core and/or on-disk storage consumed by Lynx
during operation
- duplicates functionality which is already provided by other
programs, i.e. web caches such as Squid -- programs which are
dedicated to caching functionality and thus can be expected to do
it better than Lynx could hope to
Pro:
- would add to the utility of Lynx
- would greatly speed up operations which require a re-parse, including '\'
view-source, '^V' other-DTD, '*' image-URLs, '"' soft-dquotes, '`'
and "'" comment-parsing, '[' pseudo-alts, '@' raw-mode, and
changes in assumed document character set.
- easier for a regular user to install than a full web proxy
- persistence of cache can be better tuned to the user (e.g., cached
objects can persist only for the duration of a session, and thus not
consume disk space while Lynx is not running)
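To make the "Pro" side concrete, here is a rough sketch (in Python rather than Lynx's C, and not based on any actual Lynx code) of a session-lifetime source cache: the raw HTML is fetched once, and each change of view re-renders from the stored copy instead of downloading again. Everything named here is an assumption made for illustration: `RenderOptions`, `render`, `SessionSourceCache`, and the injected `fetch` callable are hypothetical, and the toy renderer is a regex stand-in, not Lynx's parser.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class RenderOptions:
    """Hypothetical stand-ins for two of Lynx's view toggles."""
    show_source: bool = False      # roughly what the '\' key does
    images_as_links: bool = False  # roughly what the '*' key does


def render(source: str, opts: RenderOptions) -> str:
    """Toy renderer: just enough to show that output depends on options."""
    if opts.show_source:
        return source

    def image(match: "re.Match[str]") -> str:
        # Show the image URL when images are treated as links.
        return f"[{match.group(1)}]" if opts.images_as_links else "[IMAGE]"

    text = re.sub(r'<img\s+src="([^"]*)"\s*/?>', image, source)
    return re.sub(r"<[^>]+>", "", text)  # drop any remaining tags


@dataclass
class SessionSourceCache:
    """Keeps raw source in memory only; it vanishes when the session ends."""
    fetch: Callable[[str], str]  # assumed: downloads a URL, returns its HTML
    _sources: Dict[str, str] = field(default_factory=dict, init=False)

    def view(self, url: str, opts: RenderOptions) -> str:
        # Download only on the first request for this URL.
        if url not in self._sources:
            self._sources[url] = self.fetch(url)
        # Every later toggle is a re-parse of the cached source.
        return render(self._sources[url], opts)
```

Calling `view()` three times on the same URL with different options would invoke `fetch` only once. Note that this sketch sidesteps everything on the "Con" list: it has no expiry or validation rules at all, and it holds every document's source in memory for the whole session.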
Possible techniques have been discussed. But the bottom line is that
Lynx is a cooperative volunteer effort, and big projects like a revamped
caching system do not happen unless someone contributes the code.
Please feel free to jump in and write it! ;-}
>Bela<