bug-wget

Unexpected Versioning


From: Roger Brooks
Subject: Unexpected Versioning
Date: Fri, 23 Jul 2021 08:45:28 +0200

With the following wget script I am getting unexpected versioning of the
resulting files:
>>
wget -EkKrNpH \
     --output-file=wget.log \
     --domains=imcz.club,sf.wildapricot.org \
     --exclude-domains=webmail.imcz.club \
     --exclude-directories=calendar,Club-Events,External-Events,Fonts,fonts,Sys \
     --ignore-case \
     --level=1 \
     --no-parent \
     --no-proxy \
     --random-wait \
     --regex-type=pcre \
     --reject=ashx,"overlay*" \
     --reject-regex="calendar[@\?].*|Club-Events[@\?].*|External-Events[@\?].*|event-\d+[@\?].*|/[Ff]onts" \
     --rejected-log=wget-rejected.log \
     --restrict-file-names=windows \
     --wait=1 \
     https://imcz.club/
<<
Some of the downloaded pages have ".1" inserted into the filenames, for no
apparent reason.
Since I am using -r without --no-clobber, I would expect no versioning.
In the case of the above script, a versioned file, "FAQ-Forum.1.html", is
produced in the absence of any unversioned one:
>>
--2021-07-22 11:03:44--  https://imcz.club/FAQ-Forum
Connecting to imcz.club|34.226.77.200|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://imcz.club/Sys/Login?ReturnUrl=%2fFAQ-Forum [following]
--2021-07-22 11:03:46--  https://imcz.club/Sys/Login?ReturnUrl=%2fFAQ-Forum
Connecting to imcz.club|34.226.77.200|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 41667 (41K) [text/html]
Saving to: 'imcz.club/FAQ-Forum.1.html'

     0K .......... .......... .......... ..........           100% 225K=0.2s

Last-modified header missing -- time-stamps turned off.
2021-07-22 11:03:47 (225 KB/s) - 'imcz.club/FAQ-Forum.1.html' saved [41667/41667]
<<
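To find every such case in the log rather than spotting them by eye, something like the following might help. It is only a rough sketch: the function name versioned_saves is made up for illustration, and it assumes the log has the same layout as the excerpt above (a "--timestamp--  URL" line before each request, followed by a "Saving to:" line). It prints the last URL requested next to each "Saving to:" line whose filename ends in a numeric suffix, with or without ".html".

```shell
# Hypothetical helper: list "Saving to:" lines in a wget log whose target
# filename carries a numeric suffix (name.N or name.N.html), together with
# the URL that was being fetched at that point.
versioned_saves() {
    awk '
        # Request lines look like: --2021-07-22 11:03:44--  https://...
        /^--[0-9]/ { url = $NF }
        # Saved filename ends in .N or .N.html, followed only by a closing quote
        /^Saving to: / && /\.[0-9]+(\.html)?[^A-Za-z0-9]*$/ {
            print url " -> " $0
        }
    ' "${1:-wget.log}"
}
```

Called as `versioned_saves wget.log`. The pattern is deliberately loose, so a file whose real name ends in a number would also be listed.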
Replacing "--level=1" with "--level=2" results in many more versioned files,
a few of which have unversioned counterparts, but most of which do not.
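For anyone reproducing this, the split between versioned files with and without an unversioned counterpart can be checked on disk with a sketch along these lines. The function name list_versioned is hypothetical, and it assumes versioned files are named either "name.N" or "name.N.html" (the latter as in the log above, where -E appends ".html"). A file whose real name ends in ".N" would be reported as well.

```shell
# Hypothetical helper: walk a mirror directory and classify files that look
# versioned (name.N or name.N.html) as "paired" when the unversioned name
# also exists, or "orphan" when it does not.
list_versioned() {
    dir=${1:-.}
    find "$dir" -type f | while IFS= read -r f; do
        # Strip the numeric suffix but keep a trailing .html if present
        base=$(printf '%s\n' "$f" | sed -E 's/\.[0-9]+(\.html)?$/\1/')
        # Unchanged name means the file is not versioned; skip it
        [ "$base" = "$f" ] && continue
        if [ -e "$base" ]; then
            printf 'paired   %s\n' "$f"
        else
            printf 'orphan   %s\n' "$f"
        fi
    done
}
```

For the download above this would be run as `list_versioned imcz.club`.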
The full version of the script includes login parameters and "--level=4",
but I have posted a simplified version here so others can reproduce the
problem.
Similar problems have been reported in the past:
https://lists.gnu.org/archive/html/bug-wget/2015-01/msg00076.html
https://lists.gnu.org/archive/html/bug-wget/2014-11/msg00321.html
https://lists.gnu.org/archive/html/bug-wget/2014-06/msg00107.html
but the advice in those threads doesn't seem to apply to my case.
I am using the not-so-ancient v1.19.1 of wget.
Thanks for any help!
Regards, Roger


