[Bug-wget] [bug #56660] wget -r or mirror with robots-off should still d
From: anonymous
Subject: [Bug-wget] [bug #56660] wget -r or mirror with robots-off should still download robots.txt file
Date: Tue, 23 Jul 2019 11:45:34 -0400 (EDT)
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:68.0) Gecko/20100101 Firefox/68.0
URL:
<https://savannah.gnu.org/bugs/?56660>
Summary: wget -r or mirror with robots-off should still
download robots.txt file
Project: GNU Wget
Submitted by: None
Submitted on: Tue 23 Jul 2019 03:45:32 PM UTC
Category: None
Severity: 3 - Normal
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Release: 1.20
Operating System: None
Reproducibility: None
Fixed Release: None
Planned Release: None
Regression: None
Work Required: None
Patch Included: None
_______________________________________________________
Details:
GNU Wget 1.20.3 built on darwin18.6.0.
With robots=off, wget does not download the robots.txt file:
wget -r -e robots=off https://www.robotstxt.org/
robots.txt is not saved even though it is present on the site.
Expected:
Downloading the root of a site with recursion (-r) or --mirror should still
save the robots.txt file, even if its rules are being ignored.
The robots.txt file still contains useful information for site mirroring and
archival purposes, even if it isn't being respected.
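Possible workaround: robots.txt can be fetched in a separate step after the recursive run. The sketch below is illustrative only. The helper names robots_url and mirror_with_robots are hypothetical, and it assumes robots.txt lives at the root of the host and that the mirror is saved under a directory named after the host, which is wget's default for -r.

```shell
#!/bin/sh
# Hypothetical helper: derive the robots.txt URL from any URL on a site.
# Keeps only scheme://host and appends /robots.txt.
robots_url() {
    printf '%s\n' "$1" |
        sed -E 's#^([a-zA-Z][a-zA-Z0-9+.-]*://[^/]+).*#\1/robots.txt#'
}

# Hypothetical helper: mirror a site while ignoring robots.txt rules,
# then save robots.txt itself next to the mirrored files.
mirror_with_robots() {
    site=$1
    wget -r -e robots=off "$site"
    # -x (--force-directories) saves the file under the host directory,
    # matching the layout created by the recursive download.
    wget -x "$(robots_url "$site")"
}
```

For the case in this report, `mirror_with_robots https://www.robotstxt.org/` would run the original command and then request https://www.robotstxt.org/robots.txt as a second download.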
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?56660>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/