Re: wget | wget should save directory listings as index.html (#11)
From: @rockdaboot
Subject: Re: wget | wget should save directory listings as index.html (#11)
Date: Wed, 25 May 2022 17:58:03 +0000
Tim Rühsen commented:
> Honestly, I don't think that to have different content for directory and
> directory/ is a good idea.
ACK :-) But I see this regularly with pages/sites served by MS IIS. So it is
not uncommon.
> And in this case directory.1 would just not work, because the simplest file
> server will return index.html for a directory, but not some directory.1
> (neither users, nor site links will know nothing about directory.1).
With --convert-links, the links in the mirrored site will point to
`directory.1`, so any user clicking on HTML links should be fine. This is not
true for JS scripts, but let's put that aside as we can't do anything about
it.
Users navigating directly to links should be fine too, because they copied and
pasted them from the mirrored site (!?).
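
To illustrate the link conversion point, here is a minimal Python sketch, not
wget's actual code. The mapping `LOCAL_NAMES`, the function `convert_links` and
the example.com URLs are hypothetical names made up for this example, and the
sketch ignores relative paths between pages.

```python
import re

# Hypothetical mapping built during a mirror run: remote URL -> local file name.
# "directory.1" stands for the renamed copy discussed above (an assumption).
LOCAL_NAMES = {
    "http://example.com/directory": "directory.1",
    "http://example.com/directory/": "directory/index.html",
}

HREF_RE = re.compile(r'href="([^"]*)"')


def convert_links(html, local_names):
    """Rewrite href targets that were downloaded to their local file names."""
    def repl(match):
        url = match.group(1)
        # URLs that were not downloaded are left untouched.
        return 'href="%s"' % local_names.get(url, url)
    return HREF_RE.sub(repl, html)
```

A link to `http://example.com/directory` in a mirrored page would then end up
pointing to `directory.1`, so a reader following links never has to know that
name.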
But even if we agree on using only a single file for the contents of
`directory`, `directory/` and `directory/index.html`, which one do you prefer?
Keep in mind that those can appear (be downloaded) in any order.
Should we define a priority / order?
Also, what happens to a file `directory` in case we see `directory/whatever`?
Should we rename it to `directory/index.html`? And when 'whatever' is
'index.html', what exactly do we do then?
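
To make these questions concrete, here is a small Python sketch of one possible
policy. It is only an assumption for discussion, not something agreed on in
this thread and not wget's behavior: `directory/` is stored as
`directory/index.html`, a file `directory` is moved to `directory/index.html`
once something below it shows up, and on a clash the first download wins. The
function `store` and the in-memory dict `files` are hypothetical.

```python
INDEX = "index.html"


def store(files, url_path, content):
    """Store content in `files` (dict: local path -> content).

    Returns the local path used, or None if the content was dropped
    because that path was already taken (assumed rule: first one wins).
    """
    path = url_path.lstrip("/")
    if path == "" or path.endswith("/"):
        path += INDEX  # "directory/" -> "directory/index.html"

    # An ancestor saved earlier as a plain file becomes that directory's index.
    parts = path.split("/")
    for i in range(1, len(parts)):
        parent = "/".join(parts[:i])
        if parent in files:
            files[parent + "/" + INDEX] = files.pop(parent)

    # If the name is already in use as a directory, store it as the index.
    if any(name.startswith(path + "/") for name in files):
        path = path + "/" + INDEX

    if path in files:
        return None  # clash: keep what was downloaded first
    files[path] = content
    return path
```

With this policy the result depends on download order: whichever of
`directory`, `directory/` and `directory/index.html` arrives first is kept.
That is exactly the priority question above.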
If we are able to come up with a precise algorithm that covers all the corner
cases, someone can put that into code. Additional command line options to tune
the behavior can come at a later point.
--
Reply to this email directly or view it on GitLab:
https://gitlab.com/gnuwget/wget/-/issues/11#note_960182379