Re: Wget: Adding a prefix to downloaded files?
From: Tim Rühsen
Subject: Re: Wget: Adding a prefix to downloaded files?
Date: Tue, 17 Dec 2019 13:11:51 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.3.0

Hi Michel,
Wget has no parallel processing. Wget2 does.
Regards, Tim
On 12/17/19 12:56 PM, address@hidden wrote:
> Hi Tim,
>
> It seems completely logical that Wget --- or any application for that matter
> --- works through an input list sequentially.
> But the resulting order might depend upon whether Wget only handles a single
> file at a time, or whether it is capable of processing several files in
> parallel.
> I suppose the answer is a single file only, as I cannot find anything about
> parallel processing in the Manual.
> But I wouldn't put money on it.
>
> On the other hand, I may have been tricked by the settings of Windows
> Explorer when wondering if the file size had an impact.
> Indeed, when I try to double-check this behavior, it turns out that the
> downloads simply finish too quickly to visually confirm anything of the
> kind!
>
> Also, you are right in pointing out that in fact the target directory is
> "ruled" by local settings (e.g. a folder in Windows Explorer), including the
> sort order, which can have a confusing effect.
> Some further testing taught me that in this particular case I also needed to
> change the time-field switch of the DIR command. Indeed,
>
> DIR /O:D /T:C
>
> sorts files by D(ate) and uses the C(reation) time to do so.
> (The defaults are W (= last written) for the time field, and
> "sort-of-alphabetical" order if no /O (ordering) switch is given. See:
> https://ss64.com/nt/dir.html
> https://devblogs.microsoft.com/oldnewthing/20140304-00/?p=1603 )
>
>
> Windows Explorer offers many more possibilities apart from its default values
> ("Date Created" and/or "Date Modified", I'm not really sure).
> See the following screen shot (if that's of any use; I'm not sure if this
> forum persists them):
>
>
> The problem is that Windows Explorer itself does not explain what they
> mean... So in a sense they are useless.
> That's not an idle remark: the default "Date created" in
> Windows Explorer does NOT give the same output as its (apparent) DIR
> equivalent!
>
> The same goes for the other date types offered by Windows Explorer: none of
> them matches the output of the above DIR command...
> ("Date acquired", "Date archived", "Date completed", "Date received", "Date
> released", and "Date sent" are even empty.)
> Typical MS clumsiness, I guess.
>
> If you want an example of the mess MS keeps making of Date/Time fields, have
> a look here:
> https://superuser.com/questions/147525/what-is-the-date-column-in-windows-7-explorer-it-matches-no-date-column-from
> Apparently, their meaning changes between versions (Win7 or Win10), and even
> among Win10 releases... Go figure!
>
> Nevertheless, thanks to your feedback I've been able to confirm that indeed,
> this is not a Wget issue.
> I suppose I can use this info to work around Wget's missing option for a
> prefix/counter (which remains the bottom line and triggered this question in
> the first place).
>
> PS:
> The workaround you suggest is of the same type as the other ones mentioned
> before.
> Yes, it could be done by calling Wget once per image to
> download and (externally) adding a prefix (counter) to every single
> download.
> But any such workaround would miss out on the efficiency of feeding Wget
> a plain input txt file.
> And I can only repeat that such a feature could add some power to Wget, as it
> would avoid cumbersome workarounds.
>
> Thanks again for all the feedback received,
>
> MK
>
>
>
> From: "Tim Rühsen" <address@hidden>
> To: "Michel Kempeneers" <address@hidden>, "bug-wget" <address@hidden>
> Sent: Friday, 13 December 2019 15:39:24
> Subject: Re: Wget: Adding a prefix to downloaded files?
>
> On 12/12/19 1:25 PM, address@hidden wrote:
>
>
> Hi,
>
> I run into a particular problem when I'm trying to download a bunch of URLs I
> grouped together in file "input.txt" like this:
>
> wget -nv -a log.txt -P .\Images\ -i input.txt
>
> Some of these files are huge, hence take a long time to download.
> As a consequence, they will not appear in the same sort order in the
> download folder as in the input file, and that's a problem, as this order
> matters.
>
>
> Since wget works sequentially, why do you think the order of downloads
> has something to do with the file size ?
>
> If 'Images' is a fresh and empty directory *and* all files download OK,
> the order in the directory is the same as the order in input.txt. At
> least a sane file system should keep the order (is NTFS sane ?).
>
> Then, what is irritating: 'dir' or 'ls' tools like to use a certain sort
> order by default. E.g. here on GNU/Linux 'ls' orders the output files
> alphabetically by name. 'ls -rc' prints in reverse order of ctime (status
> change time; oldest first, then newer files), which seems to be what you want.
>
> In short, wget likely is not your problem. Find out what it really is
> and you can find a mitigation.
>
> As a 'dumb' work-around, save your files into a temp directory, then
> move them to Images\ in the order of occurrence in input.txt.
>
> Regards, Tim
>
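
For illustration only, the per-URL workaround discussed above (one Wget call per entry, with an externally added counter prefix) could look roughly like the Python sketch below. It is not something posted in this thread. The helper names (read_urls, numbered_targets, wget_command), the four-digit counter and the "index.html" fallback name are assumptions made for the example. The only Wget options used are -nv and -a from the original command, plus -O to set the output file name. As noted above, this gives up the convenience of a single "wget -i input.txt" run.

```python
import os
import subprocess
import sys
from urllib.parse import urlsplit


def read_urls(path):
    """Return the non-empty lines of a plain-text URL list, in file order."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def numbered_targets(urls, out_dir, width=4):
    """Pair each URL with a local path that carries a zero-padded counter."""
    targets = []
    for index, url in enumerate(urls, start=1):
        # Use the last path component of the URL (query string dropped);
        # fall back to a generic name when the URL path ends in "/".
        name = os.path.basename(urlsplit(url).path) or "index.html"
        prefixed = f"{index:0{width}d}_{name}"
        targets.append((url, os.path.join(out_dir, prefixed)))
    return targets


def wget_command(url, target, log_file="log.txt"):
    """Build one Wget call: -O names the output file, -a appends to the log."""
    return ["wget", "-nv", "-a", log_file, "-O", target, url]


def main(list_file="input.txt", out_dir="Images"):
    os.makedirs(out_dir, exist_ok=True)
    for url, target in numbered_targets(read_urls(list_file), out_dir):
        # Downloads run one after another, so the counter follows input.txt.
        subprocess.run(wget_command(url, target), check=False)


if __name__ == "__main__":
    main(*sys.argv[1:3])
```

Because the counter is part of the file name, a plain alphabetical listing of the target folder then matches the order of input.txt, whatever date column Explorer or DIR happens to sort on.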