lynx-dev anchor->FileCache (was: patch 8 - reusing temp files)


From: Klaus Weide
Subject: lynx-dev anchor->FileCache (was: patch 8 - reusing temp files)
Date: Tue, 6 Jul 1999 02:20:29 -0500 (CDT)

On Tue, 29 Jun 1999, Henry Nelson wrote:

> > page, after being created once, would always be the same file
> > (/tmp/L7013-4TMP.html or whatever).  This changed in connection with
> 
> This message is a little off topic and deals only with temp files
> created in the course of handling files matching "if (skip_loadfile) {",
> around line 273 in src/HTFWriter.c.  I've already posted my interim
> solution, which is simply to remove the temp file when the viewer exits:
>                             FREE(me->remove_command);
>                         }
> +                       remove(me->anchor->FileCache);
> Mostly I'm asking for advice/opinions; no need to do anything if there
> is no interest.  The simple removal works fine, but it could be "cleaner."
> 
> The problem, from my point of view (rather limited disk space, even more
> limited memory, yet wanting any number of users to be able to view large,
> >1MB files), is twofold: 1) Lynx accumulates temporary files until it
> exits, and 2) despite having a file decompressed on disk, Lynx goes ahead
> and decompresses it again.  My focus in this message is '2)'.
>
> For a concrete example, let's say I hit a link which points to a file
> with two levels of extensions, each mapped in .mime.types and .mailcap.
> The outside one, "gz", is for passing to gunzip, and the inside one,
> "arc", is for passing to the viewer most.  On disk I get a temp file:
> -rw-------   1 user     1333410 Jun 29 12:44 L3394-1TMP.arc
> I quit the viewer, and the temp file remains as is:
> -rw-------   1 user     1333410 Jun 29 12:44 L3394-1TMP.arc
> I hit the link again, and Lynx passes the gzipped stream to gunzip, and:
> -rw-------   1 user     1333410 Jun 29 12:45 L3394-2TMP.arc
> "L3394-1TMP.arc" is gone, so Lynx has either removed it, or has renamed
> it to "L3394-2TMP.arc" and written the unzipped stream over it.

(It has removed it, not renamed.)

> If I open another file, I get a unique temp file being created:
> -rw-------   1 user     1333410 Jun 29 12:45 L3394-2TMP.arc
> -rw-------   1 user     1845464 Jun 29 12:46 L3394-3TMP.arc
> 
> This means that Lynx knows that it has already unzipped a certain
> file and has a temporary copy of it available (otherwise it wouldn't
> know when it can delete a file only to make another exact copy with a
> different name).  

Yes, that's true.  Note that they don't always survive until the end of
the session (at least they shouldn't - I haven't tested it for this
specific situation, a compressed file of a type that is mapped to
an external viewer).  They should be cleaned up when the corresponding
HTParentAnchor structure gets removed from memory (which is a kind
of garbage collection that's hard to predict, but it should generally
happen when you move 'far enough away' in your browsing history from
the document(s) with links to the URL in question).

> A "smart" Lynx would skip the decompression step
> and pass the copy it already has directly to the viewer again, or not
> even bother to remember what it has since it is going to junk it
> immediately after it has used it.  The latter is what I was thinking
> of doing, but I wonder if it needs to remember what the file was
> for some kind of recovery mechanism.

I think there are two reasons why it's done the way it is (the second
being the better one):

1. The mechanism (involving anchor->FileCache) is more or less shared
   between code in HTFWriter.c for three functions:
   - handling files passed to external viewers
   - handling 'D'ownload
   - handling compressed documents / files
   For one of them, 'D'ownload, the temp file needs to be kept around
   for as long as possible, so that the file is available for (possibly
   several) actions from the Download Options menu.

2. The files that are passed to external viewers are kept around after
   the viewer command has been invoked and returned, because the viewer
   command may invoke the real viewer in the background.  This makes
   sense especially for image viewers under X, or for audio players.
   Lynx cannot know when the viewer is completely done with the file,
   so it keeps it around.

But there is a better solution than changing the lynx code IMO: just
make the 'rm' part of the viewer you invoke.  A wrapper script around
'most', that first calls 'most' and then 'rm', should do fine.  Lynx
should have no problem with the unexpected disappearance of the
file.
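
A minimal sketch of such a wrapper, as an illustration only.  The
function name view_and_remove and the VIEWER override are made up for
this example; the only assumption about lynx is that it passes the temp
file name as the first argument, as it would for a viewer command using
%s.

```shell
#!/bin/sh
# Hypothetical wrapper: run the viewer on a file, then remove the file.
# VIEWER is an illustrative override; it defaults to 'most'.
view_and_remove() {
    viewer=${VIEWER:-most}
    file=$1

    # Run the viewer in the foreground and remember its exit status.
    "$viewer" "$file"
    status=$?

    # Remove the temp file once the viewer has returned.
    rm -f -- "$file"
    return "$status"
}

# Only act when called with a file name (e.g. from a viewer command).
if [ "$#" -gt 0 ]; then
    view_and_remove "$1"
fi
```

Saved under some name of your choosing and made executable, this would
replace 'most' in the viewer command for the type in question.  It is
only suitable for viewers that stay in the foreground until they are
done; for a viewer that puts itself in the background, the 'rm' could
run while the file is still in use.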

There is the different case of compressed (remote) HTML and text/plain
documents that lynx displays internally.  For those, temporary copies
are also kept around, although there doesn't seem to be a good reason
for it (unless lynx were changed to re-use those files upon re-loading).
But you don't seem to be talking about that case.

Reuse of the content of those temp files (for both cases) would mean
there is no check against stale versions, unless either some real cache
freshness checking logic is implemented or the files are removed more
aggressively.

(anchor->FileCache is completely separate, as far as I have seen, from the
SOURCE_CACHE:FILE temporary cache file.  The two mechanisms don't know
about each other.  It also seems that SOURCE_CACHE doesn't apply to
compressed HTML files from http servers (but I haven't tested that;
I just expect the p->name that is tested in CacheThru_new would say
"file" in that case instead of "http")).

  Klaus


