Fwd: Wget -c option possible errors ...
From: address@hidden
Subject: Fwd: Wget -c option possible errors ...
Date: Fri, 21 Jan 2022 15:51:28 +0200 (SAST)
----- Forwarded Message -----
From: gerdd@mweb.co.za
To: "Tim Ruehsen" <tim.ruehsen@gmx.de>
Sent: Friday, January 21, 2022 1:59:41 PM
Subject: Re: Wget
I happen to have an old XP machine with wget 1.9.1 installed (and used a lot in
its day), which I presume predates 1.11.4 (9 < 11, if the version parts compare
numerically). That machine will be tasked with running my garden's irrigation,
once I can convince it to stay up long enough.
In the meantime I could try to run a few tests. But let me share a few
experiences with the -c feature, which may (or may not) help explain what
you see:
I have used wget extensively over the years, and a lot of the work was
downloading large-ish media files and/or packed archives. Both types are very
unforgiving of corruption.
As far as I can see, the -c function just tells the server that you want the
download to start at byte x (which is the next byte after the last byte
received in your previous download).
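As a rough illustration of that request (a sketch, not wget's actual code):
over HTTP, resuming means sending a Range header whose start offset is the size
of the local partial file. The helper name resume_headers is made up for this
example.

```python
import os


def resume_headers(path):
    """Build the HTTP headers for resuming a download into `path`.

    Hypothetical helper: the offset is taken purely from the size of the
    local file, which is all a size-based resume has to go on.
    """
    size = os.path.getsize(path) if os.path.exists(path) else 0
    if size == 0:
        # Nothing on disk yet: ask for the whole file.
        return {}
    # "bytes=N-" means "from byte N (zero-based) to the end of the file".
    return {"Range": f"bytes={size}-"}
```

Note that nothing in this request describes the content already on disk; only
its length is communicated.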
These are the possible outcomes that I have seen:
1) the server does as instructed and all fits together and all is good.
2) the server looks at the file to download and returns a verdict that what you
have is as long or longer than the file it has. Nothing more happens. Either
your file is already complete or the server copy was replaced by a shorter one,
in which case you might want to download the whole new version - or you will be
stuck with an incomplete copy of the previous version.
3) the server is stupid, doesn't know the "from" function and gives you the
whole file instead without comment, and wget stitches it onto the existing
file, resulting in a corrupt file.
4) a new version of the file has been put on the server, which is longer, but
different. wget will receive the tail end of this new file and stitch it on to
the incomplete file it already has, quite likely resulting in a corrupt file.
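The first three outcomes can be told apart from the HTTP status code alone; the
fourth cannot, because a changed file still produces a perfectly valid partial
response. A minimal sketch, assuming the standard status codes 206 (Partial
Content), 416 (Range Not Satisfiable) and 200 (OK). The function name and the
return labels are invented for illustration and are not taken from wget:

```python
import re


def classify_resume_response(status, content_range, local_size):
    """Decide what to do with the reply to a 'Range: bytes=<local_size>-' request.

    Returns one of 'append', 'complete_or_shrunk', 'restart', 'error'.
    """
    if status == 206:
        # Expected form: "bytes <start>-<end>/<total>" (total may be "*").
        m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", content_range or "")
        if m and int(m.group(1)) == local_size:
            return "append"  # outcome 1 (and, undetectably, outcome 4)
        return "error"  # the server sent a different range than requested
    if status == 416:
        # Outcome 2: the local file is as long as or longer than the server's.
        return "complete_or_shrunk"
    if status == 200:
        # Outcome 3: the Range header was ignored and the body is the whole
        # file, so it must replace the fragment rather than be appended.
        return "restart"
    return "error"
```

For example, classify_resume_response(200, None, 100) yields "restart", which is
the case where appending would corrupt the file.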
To prevent these error conditions you may need to compare file timestamps (not
necessarily 100% reliable, but a good test in any event).
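A sketch of such a timestamp test, under the assumption that the Last-Modified
header value seen when the download first started was saved somewhere (wget is
not claimed to do this; the function name is hypothetical):

```python
from email.utils import parsedate_to_datetime


def server_copy_unchanged(saved_last_modified, current_last_modified):
    """Compare two HTTP Last-Modified header values.

    Returns True only if both parse and denote the same instant. A missing
    or unparsable value counts as "changed", i.e. the safe side.
    """
    if not saved_last_modified or not current_last_modified:
        return False
    try:
        saved = parsedate_to_datetime(saved_last_modified)
        current = parsedate_to_datetime(current_last_modified)
    except (TypeError, ValueError):
        return False
    return saved == current
```

An equal timestamp does not prove the content is identical, so this only
narrows down the risk of outcome 4.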
Alternatively, wget could get a new feature that re-downloads a chunk of
configurable size from the already downloaded part of the file, compares it,
and continues the download only if the chunk matches its counterpart in the
existing fragment. Otherwise a configurable option could be used to download
the whole file or give an error message; the fresh download could either
overwrite the existing fragment or start with a new name (as in the
--no-clobber function, for instance).
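One way the proposed check could look, purely as a sketch of the idea and not
an existing wget option: re-fetch the last chunk of the fragment and compare it
byte for byte before appending anything. Here fetch_range is a hypothetical
callable supplied by the caller that returns the server's bytes for an
inclusive byte range, and chunk_size stands in for the configurable size.

```python
import os


def tail_matches(path, fetch_range, chunk_size=4096):
    """Check that the end of the local fragment matches the server's copy.

    fetch_range(start, end) is a hypothetical callable returning the bytes
    start..end (inclusive) of the remote file, e.g. via an HTTP Range request.
    """
    local_size = os.path.getsize(path)
    if local_size == 0:
        return True  # nothing downloaded yet, so nothing to verify
    start = max(0, local_size - chunk_size)
    with open(path, "rb") as f:
        f.seek(start)
        local_tail = f.read()
    remote_tail = fetch_range(start, local_size - 1)
    return remote_tail == local_tail
```

If the function returns False, the caller would fall back to a full download
or report an error, as described above. A matching tail still says nothing
about earlier parts of the fragment.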
You might consider some of these changes for wget2 only. (Incidentally, I'll
have to scour around one of these days for an executable of wget2 for
Windows ...)
One thing I have never done is reboot during a download, but I have often had
the power drop on me in the middle of one. Static files were regularly
completed correctly when the download was resumed (provided the server in
question was up to it, which most of them seem to be nowadays ...)
In hopes that this is useful ...
Gerd Diederichs
----- Original Message -----
From: "Tim Ruehsen" <tim.ruehsen@gmx.de>
To: "Дмитрий Дмитрий" <kmb697@yandex.ua>, "bug-wget" <bug-wget@gnu.org>
Sent: Friday, January 21, 2022 11:56:00 AM
Subject: Re: Wget
Hi,
I guess nobody even tries to reproduce the issue as nobody uses XP or
the old wget 1.11.4. For example, I don't even have a Windows license
and thus no Windows installed.
To get better feedback from other users, I would suggest
- update to the latest wget (hundreds of bugs have been fixed
meanwhile). Static binaries for 32/64 bit Windows can be found at
https://eternallybored.org/misc/wget/.
- try to reproduce the problem with a minimal set of command line
options (else others have to do that, and that will cost other people's
time)
- provide exact steps to reproduce
Without the above, others and I can only guess what is happening.
E.g. pressing reboot may result in unwanted bytes in a file and an
inconsistent file system. Download continuation is based on the file
size, not the contents. Wget has no possibility to see if the existing
file contents are correct or not - it can only see if bytes are missing
and download+append the missing bytes. Wget also doesn't see if the file
on the server has been changed or not.
In short: continuation is not reliable.
If you need a byte-exact download, make sure the provider (server) also
provides a checksum so that you can verify your downloaded file. Without
it, it is better not to use -c.
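For illustration, a minimal way to do that verification once the download has
finished. The function name is made up, and the algorithm (SHA-256 by default
here) is an assumption; use whichever one the provider actually publishes.

```python
import hashlib


def checksum_matches(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest equals the published checksum."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in blocks so large media files need not fit in memory.
        for block in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Passing algorithm="md5" covers providers that only publish MD5 sums.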
Also think of possible MITM attacks: try to avoid plain text HTTP - use
HTTPS instead.
Regards, Tim
On 21.01.22 04:03, Дмитрий Дмитрий wrote:
> I am Russian.
> Please excuse my English.
>
> I used the old version wget-1.11.4 several years ago.
> I noticed that download errors sometimes happen.
> Wget incorrectly continues a partially downloaded file (option -c).
>
> I saw two variants of this error.
> In one case several bytes were incorrect.
> In the other case the size of the file changed: the file ended up larger or
> smaller than the original.
> This happened when wget was interrupted by a reboot of the computer (reboot
> button).
> Wget could not continue the download correctly.
> The errors did not always happen,
> only sometimes.
>
> Because of this, if I did not have a checksum (md5, for example), I had to
> download files twice
> and compare them.
>
> I wrote about it here:
> https://lists.gnu.org/archive/html/bug-wget/2021-03/msg00025.html
> I think I was not understood.
> If you don't understand me, please ask me.