From: Emanuel Czirai
Subject: [bug #62869] if retry hits a 302 FOUND wget forgets to send the Range header thus appending the whole file to what's downloaded alrdy
Date: Sat, 6 Aug 2022 01:49:13 -0400 (EDT)
URL:
<https://savannah.gnu.org/bugs/?62869>
Summary: if retry hits a 302 FOUND wget forgets to send the
Range header thus appending the whole file to what's downloaded alrdy
Project: GNU Wget
Submitter: correabuscar
Submitted: Sat 06 Aug 2022 05:49:12 AM UTC
Category: Program Logic
Severity: 3 - Normal
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Release: trunk
Discussion Lock: Any
Operating System: GNU/Linux
Reproducibility: Every Time
Fixed Release: None
Planned Release: None
Regression: None
Work Required: None
Patch Included: Yes
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC By: Emanuel Czirai <correabuscar>
Hello.
I've encountered this append bug on Gentoo with wget-1.21.3-r1 while portage
was downloading the file android-studio-2022.1.1.9-linux.tar.gz for Android
Studio Canary (a 1G file, which ended up as 1.6G on disk and thus corrupt
because of this bug).
I have not yet attached the file *problem_on_real_url.log*, which contains the
wget output from the second time I reproduced the above; that run yielded a
file that was 24 MiB larger. I haven't redacted anything in it (such as my IP
address). It isn't attached because only 4 files can be attached; if you need
to see it, let me know and I will attach it in the next comment.
I couldn't reproduce it every time because those Google servers don't always
respond with a 302 FOUND after a timeout, and they don't always time out
either.
So I've come up with a test that always reproduces this issue (unfortunately,
I couldn't figure out how to turn it into a test case; the test suite doesn't
seem to have the needed functionality). A server pretends to time out in the
middle of the transfer; when wget retries, the server responds with a 302 FOUND
<https://www.rfc-editor.org/rfc/rfc7231.html#section-6.4.3> and redirects to
another server. This is when wget forgets to send the Range header, which
specifies where the server should continue sending the file from. The server
therefore sends the full file from the beginning, while wget still acts as if
the file is being sent from the continuation point, and so it appends the full
file to whatever it had already downloaded before the timeout (and the 302)
occurred.
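The failure mode described above can be guarded against on the client side. The following is a minimal Python sketch of that idea, not wget's actual code; the helper names `resume_headers` and `merge_resumed_body` are hypothetical. It builds the Range header that every request of a resumed transfer should carry (including the one that follows a redirect) and appends only when the server answers 206 Partial Content.

```python
def resume_headers(offset):
    """Headers for a request that resumes at byte `offset` (hypothetical helper)."""
    # An offset of 0 means nothing is on disk yet, so no Range is needed.
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


def merge_resumed_body(existing, status, body):
    """Combine already-downloaded bytes with a response body.

    206 means the server honoured the Range header, so the body is the
    remainder and is appended. 200 means the server sent the whole file,
    so the partial data must be replaced rather than appended to.
    """
    if status == 206:
        return existing + body
    if status == 200:
        return body
    raise ValueError(f"unexpected status {status}")
```

A more careful client would probably also compare the Content-Range header of a 206 response with the offset it asked for; that check is left out of this sketch.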
I've attached files:
a.py
go
tst
wget_no_append_on302_uponretry.patch
To run the test and check that the bug exists, first run *chmod a+x go tst*,
then run (as a normal user, always):
./go
or
to see wget --debug output:
./go --debug
or
./go bug --debug
The last line should be in red: "Bug still present!"
To see how wget behaves when the server doesn't do a 302 redirect after a
timeout (i.e. it never hits this bug), run:
./go nobug --debug
This will always print as its last line: "Bug is fixed."
To test both:
./tst
With this test script, if the bug is not fixed, the last line is
yellow/brown:
"ok, bug test is fine ie. wget isn't fixed (but it should eventually be, hence
why this is yellow)"
but if the bug is fixed, you get:
"Failed to reveal the bug, was the wget bug fixed?! (assume this is green if
you know that wget got fixed)"
The test tries to wget a file whose contents are "Hello World.\r\n", but the
server induces a timeout after "Hello ", which causes wget to retry. The
server then responds with a 302, which wget follows, but wget no longer sends
a Range header, so the server replies with 200 OK instead of 206 Partial
Content. When the bug is present, the final file contents are therefore
"Hello Hello World.\r\n", which shows that the whole file (which is "Hello
World.\r\n") was appended to what had already been downloaded (the first
"Hello ").
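The byte-level outcome of this scenario can be modelled in a few lines of Python. This is only an illustrative sketch and is not taken from the attached a.py; the names `serve` and `retry_after_redirect` are hypothetical, and the toy server handles only the open-ended `bytes=N-` form of the Range header.

```python
CONTENT = b"Hello World.\r\n"


def serve(headers):
    """Toy server: 206 with the remainder if a Range header is sent, else 200."""
    range_value = headers.get("Range")
    if range_value is None:
        return 200, CONTENT
    # Only the open-ended form "bytes=N-" is handled in this sketch.
    start = int(range_value[len("bytes="):-1])
    return 206, CONTENT[start:]


def retry_after_redirect(partial, keep_range):
    """Model the retry that follows the 302.

    keep_range=False models the reported bug: the Range header is dropped,
    yet the response body is still appended to the partial data.
    """
    headers = {"Range": f"bytes={len(partial)}-"} if keep_range else {}
    _status, body = serve(headers)
    return partial + body


buggy = retry_after_redirect(b"Hello ", keep_range=False)
fixed = retry_after_redirect(b"Hello ", keep_range=True)
print(buggy)
print(fixed)
```

With the Range header dropped, the model yields the duplicated prefix reported above; with it kept, the file comes out intact.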
Apply the attached patch to wget to see a hacky proof-of-concept fix that
makes wget send a Range header after the 302 by pretending that wget was run
with the --start-pos=X argument, where X is the file offset it should have
continued from. It's a hack, not the actual fix.
_______________________________________________________
File Attachments:
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC Name:
wget_no_append_on302_uponretry.patch Size: 1KiB By: correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53534>
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC Name: go Size: 993B By: correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53533>
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC Name: tst Size: 2KiB By:
correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53532>
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC Name: a.py Size: 8KiB By:
correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53531>
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?62869>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/