Hello guys!
I’m using wget to mirror https://releases.hashicorp.com, but I don’t want a
full mirror. I only want to mirror certain “subfolders” of the site (e.g.
terraform, consul, etc.). I do this with the following command:
wget -N -r -l inf --no-parent https://releases.hashicorp.com/consul/
The problem is that at first I get the following result:
******
$ wget -N -r -l inf --no-parent https://releases.hashicorp.com/consul/
--2022-05-16 16:28:18-- https://releases.hashicorp.com/consul/
Resolving releases.hashicorp.com (releases.hashicorp.com)... 151.101.193.183, 151.101.129.183, 151.101.65.183, ...
Connecting to releases.hashicorp.com (releases.hashicorp.com)|151.101.193.183|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Connection: keep-alive
Content-Type: text/html
ETag: TvHhjlva/+c=
X-Api-Version: 0.1.2
X-Request-Id: 8a74122b-c155-88ff-511e-8d0d93155b2e
X-Amz-Cf-Pop: AMS50-C1
X-Amz-Cf-Id: Pdzhym0uq3XXjsZ_PxS8xvkntM0IsSCQtakE2EvgwC0v0tYMPJwCzQ==
Age: 61398
Access-Control-Allow-Origin: *
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Accept-Ranges: bytes
Date: Mon, 16 May 2022 16:28:18 GMT
Vary: Origin, Accept-Encoding
transfer-encoding: chunked
Length: unspecified [text/html]
Saving to: ‘releases.hashicorp.com/consul/index.html’
releases.hashicorp.com/consul/index.html [ <=> ] 19.51K --.-KB/s in 0s
Last-modified header missing -- time-stamps turned off.
2022-05-16 16:28:18 (45.4 MB/s) - ‘releases.hashicorp.com/consul/index.html’ saved [19979]
******
We can see that whatever is served at https://releases.hashicorp.com/consul/
gets saved locally to releases.hashicorp.com/consul/index.html, which is fine
and exactly what I want. But when it comes to the first href in
releases.hashicorp.com/consul/index.html, I get the following:
******
--2022-05-16 16:30:21-- https://releases.hashicorp.com/consul/1.12.0
Reusing existing connection to releases.hashicorp.com:443.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Connection: keep-alive
Content-Type: text/html
X-Api-Version: 0.1.2
X-Request-Id: ca8c47f5-2e54-b09a-adde-6e8cf5e92d45
ETag: 8p+ndCqEoYc=
X-Amz-Cf-Pop: AMS50-C1
X-Amz-Cf-Id: qA5XZEv2hZReEYoZD29GRsD_M6u76VLv6g-usgKJAzTCQm_SyWVFRA==
Age: 27384
Access-Control-Allow-Origin: *
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Accept-Ranges: bytes
Date: Mon, 16 May 2022 16:30:21 GMT
Vary: Origin, Accept-Encoding
transfer-encoding: chunked
Length: unspecified [text/html]
releases.hashicorp.com/consul/1.12.0: Is a directory
Cannot write to ‘releases.hashicorp.com/consul/1.12.0’ (Success).
******
We can see that it tries to save whatever is there at
https://releases.hashicorp.com/consul/1.12.0 into
releases.hashicorp.com/consul/1.12.0, not
releases.hashicorp.com/consul/1.12.0/index.html as I would prefer.
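To illustrate the naming difference, here is a rough sketch of how I understand
wget to pick the local file name during a recursive run. The local_path helper
is hypothetical (it is not part of wget) and it ignores query strings and
options such as -nH or -E. The point is only that a URL ending in a slash maps
to index.html inside a directory, while a URL without the trailing slash maps
to a plain file, which collides with a directory of the same name if one
already exists (as it would in a mirror left over from an earlier run).

```shell
# Hypothetical helper: approximates the local path a recursive wget run
# would choose for a URL. Not wget's real code, just the naming rule.
local_path() {
    url=$1
    path=${url#*://}    # drop the scheme, keep host and path
    case $path in
        */) printf '%s\n' "${path}index.html" ;;  # trailing slash: directory index
        *)  printf '%s\n' "$path" ;;              # no trailing slash: treated as a file
    esac
}

local_path https://releases.hashicorp.com/consul/
local_path https://releases.hashicorp.com/consul/1.12.0
local_path https://releases.hashicorp.com/consul/1.12.0/
```

Only the second URL maps to releases.hashicorp.com/consul/1.12.0 itself, and
that is the form of the link wget followed in the log above.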
The mind-blowing fact is that this used to work for me as recently as a couple
of weeks ago with the same invocation: it would produce index.html not only at
the root but at the leaves as well. Something has definitely changed on the
server, but how can I address the issue? As it stands, I cannot maintain my
mirror properly, because without these index.html files I simply can’t offer
the mirror to my users.