Re: LYNX-DEV Download recovery for lynx?


From: David Woolley
Subject: Re: LYNX-DEV Download recovery for lynx?
Date: Tue, 12 Aug 1997 22:15:36 +0100 (BST)

> 
> My suggestion is to use what Zmodem has done over the years:  If the
> filename matches the existing ones, request the foreign host to send a
> packet, starting from byte (size_of_local_file - packet_size), and compare
> the packet to the local file.  If the content matches the end of the local
> file, append the existing file at the client side.  Otherwise send entire
> file from the beginning. 
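
For reference, the tail comparison proposed in the quoted text can be
sketched roughly as below.  This is only an illustration in Python, not
Zmodem or Lynx code; the names tail_offset and tail_matches are made up
for the example, and the "packet" is assumed to have been fetched
already by some other means.

```python
def tail_offset(local_size, packet_size):
    """Offset at which the remote host would be asked to start sending."""
    return max(local_size - packet_size, 0)


def tail_matches(local_file, remote_packet):
    """Compare the end of a partial local file with a packet from the sender.

    local_file is a seekable binary file object holding the partial download;
    remote_packet is the block the sender returned for the same byte offsets.
    """
    local_file.seek(0, 2)                      # seek to the end to learn the size
    size = local_file.tell()
    local_file.seek(tail_offset(size, len(remote_packet)))
    return local_file.read() == remote_packet
```

Note that only the last packet is inspected: two files that differ
earlier on but share the same final block would pass this test.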

This is the wrong way of doing it with HTTP.  For a start, you need byte
range support, which means that you must have a clean HTTP 1.1 path to
the server (or at least have all proxies and servers byte range aware).
I believe the final CERN server destroys byte range headers, and earlier
ones pass them, with the result that they appear to offer this service
without actually handling it properly (a byte range request fulfilled
by the cache is likely to return the whole document).  (Actually, I suspect
that it is unsafe for a server to send Accept-Ranges without also sending
HTTP 1.1, even though the Microsoft ones appear to do this.)
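
Because a proxy or cache may answer a range request with the whole
document, a client should not assume the range was honoured.  A minimal
sketch of the check, in Python, assuming the response headers are
available as a plain dict keyed by "Content-Range" (the function name
range_was_honoured is hypothetical, not part of Lynx):

```python
import re

# Content-Range as defined by HTTP/1.1, e.g. "bytes 1000-1999/2000"
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def range_was_honoured(status, headers, offset):
    """True only for a genuine partial reply that starts at `offset`."""
    if status != 206:                 # 200 means the whole document came back
        return False
    match = _CONTENT_RANGE.match(headers.get("Content-Range", "").strip())
    if match is None:                 # missing or malformed Content-Range
        return False
    first, last = int(match.group(1)), int(match.group(2))
    return first == offset and last >= first
```

Anything other than a 206 with a matching Content-Range is best treated
as a fresh copy of the entire document.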

Then, to actually restart, you should do an If-Unmodified-Since request,
also HTTP 1.1.  This is a much stronger test than comparing the last block,
and likely to be stronger than the average checksum test.
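
A rough sketch of such a conditional restart, in Python, using the
HTTP/1.1 If-Unmodified-Since header together with Range.  It assumes the
client recorded the document's Last-Modified time (as seconds since the
epoch) when the first transfer began; resume_headers and next_action are
hypothetical names used only for this illustration.

```python
from email.utils import formatdate


def resume_headers(local_size, last_modified_epoch):
    """Headers asking for the missing bytes only if the document is unchanged."""
    return {
        # open-ended range: everything from the first byte we lack
        "Range": "bytes=%d-" % local_size,
        # RFC 1123 date in GMT, the format HTTP expects
        "If-Unmodified-Since": formatdate(last_modified_epoch, usegmt=True),
    }


def next_action(status):
    """Decide what to do with the partial file given the response status."""
    if status == 206:      # partial content: unchanged, range honoured
        return "append"
    if status == 200:      # whole document sent: range ignored somewhere
        return "overwrite"
    if status == 412:      # precondition failed: document has changed
        return "refetch"
    return "give up"
```

For example, resume_headers(1000, 0) yields a Range of "bytes=1000-" and
an If-Unmodified-Since of "Thu, 01 Jan 1970 00:00:00 GMT".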

Incidentally, I was under the impression that Zmodem checksummed the partial
file, rather than comparing the last block, though I could be wrong.  FTP
doesn't do any checks - you must look at the last modified date in the listing.

This is all described at a black box level; however, Lynx is layered
software and each layer assumes services from the next.  Zmodem is
much more monolithic, so the low level protocol layers are mixed in with
the layers that write to the file system - as others have pointed out,
Lynx doesn't even know the file name at the point where it is handling the
HTTP protocol.

It could be done, but it is not going to be a clean change, and, given that
people have no problem downloading 9M files over modems in our office, you
should be looking at why the connection is dropping, rather than recovery
strategies.

You might want to note that most caching proxies will complete a transfer,
even if they lose the reader.  It can even be worth starting a transfer,
deliberately dropping the connection (preferably aborting rather than hanging
up) and then coming back later, as you are likely to get a full speed download.
(It's actually quite spectacular if the cache is ethernet connected, but good
even over modems.)  It is possible that some proxies are byte range aware, but
simply bypass their cache for range requests, so a reload from the cache may
well be faster than an end to end restart.

Does anyone know whether byte ranges were introduced for this reason?  I
suspect they are more to do with serving compound documents, e.g. to allow
something like Acrobat to retrieve a page at a time.