bug-wget

RE: Wget: Adding a prefix to downloaded files?


From: michel . kempeneers
Subject: RE: Wget: Adding a prefix to downloaded files?
Date: Tue, 17 Dec 2019 12:56:18 +0100 (CET)

Hi Tim, 

It seems completely logical that Wget --- or any application for that matter 
--- works through an input list sequentially. 
But the resulting order might depend upon whether Wget only handles a single 
file at a time, or whether it is capable of processing several files in 
parallel. 
I suppose the answer is a single file only, as I cannot find anything about 
parallel processing in the Manual. 
But I wouldn't put money on it. 

On the other hand, I may have been tricked by the settings of Windows Explorer 
when wondering if the file size had an impact. 
Indeed, when I try to double-check this behavior, it turns out that the 
downloads simply complete too quickly to confirm anything of the kind visually! 

Also, you are right in pointing out that in fact the target directory is 
"ruled" by local settings (e.g. a folder in Windows Explorer), including the 
sort order, which can have a confusing effect. 
Some further testing taught me that in this particular case I also needed to 
change the time-field switch of the DOS DIR command. Indeed, 

DIR /O:D /T:C 

sorts files by D(ate), and uses the C(reation) date to do so. 
(The defaults are W (= last written) for the time field, and 
"sort-of-alphabetical" order if no /O (ordering) switch is given.) See: 
https://ss64.com/nt/dir.html 
https://devblogs.microsoft.com/oldnewthing/20140304-00/?p=1603 
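
To take the viewer's sort settings out of the equation, the same listing can be produced from a script. The following is only a minimal sketch: the function name `list_by_time` is hypothetical, and it assumes that `st_ctime` reflects the creation time, which has historically been the case on Windows only (on Unix-like systems it is the inode change time).

```python
from pathlib import Path


def list_by_time(folder, field="st_mtime"):
    """Return the file names in `folder`, oldest first, by one stat() timestamp.

    field="st_mtime" is the last-written time (comparable to DIR /T:W).
    field="st_ctime" is assumed to be the creation time (comparable to
    DIR /T:C); that assumption only holds on Windows.
    """
    entries = [p for p in Path(folder).iterdir() if p.is_file()]
    # Sort on the chosen timestamp; the name breaks ties deterministically.
    entries.sort(key=lambda p: (getattr(p.stat(), field), p.name))
    return [p.name for p in entries]
```

Calling `list_by_time("Images", "st_ctime")` on Windows should then be roughly comparable to `DIR /O:D /T:C`, independent of whatever Explorer happens to display.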


Windows Explorer offers many more possibilities apart from its default values 
("Date Created" and/or "Date Modified", I'm not really sure). 
See the following screen shot (if that's of any use; I'm not sure if this forum 
persists them): 


The problem is that Windows Explorer itself does not explain what they 
mean, so in a sense they are useless. 
That is more than a passing remark: the default "Date created" in 
Windows Explorer does NOT give the same output as its (apparent) DOS 
equivalent! 

The same goes for the other date types offered by Windows Explorer: none of 
them matches the output of the above DIR command... 
("Date acquired", "Date archived", "Date completed", "Date received", "Date 
released", and "Date sent" are even empty.) 
Typical MS clumsiness, I guess. 

If you want an example of the mess MS keeps making of Date/Time fields, have a 
look here: 
https://superuser.com/questions/147525/what-is-the-date-column-in-windows-7-explorer-it-matches-no-date-column-from
Apparently, their meaning changes between versions (Win7 or Win10), and even 
among Win10 releases... Go figure! 

Nevertheless, thx to your feedback I've been able to confirm that indeed, this 
is not a Wget issue. 
I suppose I can use this info to work around Wget's missing option for a 
prefix/counter. (which remains the bottom line and triggered this question in 
the first place) 

PS: 
The workaround you suggest is of the same type as the other ones mentioned 
before. 
Yes, it could be done by calling Wget once for each image to download, and 
(externally) adding a prefix (counter) to every single download. 
But any such workaround would miss out on the efficiency of feeding Wget a 
plain-text input file. 
And I can only repeat that such a feature could add some power to Wget, as it 
would avoid cumbersome workarounds. 
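
For reference, the "one Wget call per URL with an external counter" workaround could look roughly like the sketch below. The helper names (`numbered_name`, `build_commands`, `download_all`) and the `NN_name` prefix scheme are assumptions made for illustration; only the Wget options `-nv`, `-a` and `-O` are taken from Wget itself. Note that `-O` takes a complete target path, so `-P` is not used here.

```python
import posixpath
import subprocess
from urllib.parse import unquote, urlsplit


def numbered_name(index, total, url):
    """Build a local name like '03_photo.jpg' from a 1-based position and a URL."""
    width = len(str(total))  # zero-pad so names also sort correctly as text
    base = posixpath.basename(unquote(urlsplit(url).path)) or "index.html"
    return f"{index:0{width}d}_{base}"


def build_commands(urls, out_dir="Images", log="log.txt"):
    """Return one wget argument list per URL, in input order."""
    cmds = []
    for i, url in enumerate(urls, start=1):
        # Forward slashes are used for the target path; Windows accepts them too.
        target = posixpath.join(out_dir, numbered_name(i, len(urls), url))
        cmds.append(["wget", "-nv", "-a", log, "-O", target, url])
    return cmds


def download_all(list_file="input.txt"):
    """Read the URL list and run wget once per line (assumes wget is on PATH)."""
    with open(list_file, encoding="utf-8") as fh:
        urls = [line.strip() for line in fh if line.strip()]
    for cmd in build_commands(urls):
        subprocess.run(cmd, check=False)  # keep going if one download fails
```

The cost is exactly the one described above: one process start and one fresh connection per file, instead of a single Wget run over input.txt.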

Thx again for all the feedback received, 

MK 



From: "Tim Rühsen" <address@hidden> 
To: "Michel Kempeneers" <address@hidden>, "bug-wget" <address@hidden> 
Sent: Friday, 13 December 2019 15:39:24 
Subject: Re: Wget: Adding a prefix to downloaded files? 

On 12/12/19 1:25 PM, address@hidden wrote: 


Hi, 

I run into a particular problem when I'm trying to download a bunch of URLs I 
grouped together in file "input.txt" like this: 

wget -nv -a log.txt -P .\Images\ -i input.txt 

Some of these files are huge, hence take a long time to download. 
As a consequence, they will not appear in the same sorting order in the 
download folder as in the input folder, and that's a problem, as this order has 
its importance. 


Since wget works sequentially, why do you think the order of downloads 
has something to do with the file size ? 

If 'Images' is a fresh and empty directory *and* all files download OK, 
the order in the directory is the same as the order in input.txt. At 
least a sane file system should keep the order (is NTFS sane ?). 

Then, what is irritating: 'dir' or 'ls' tools like to use a certain sort 
order by default. E.g. here on GNU/Linux 'ls' orders the output files 
alphabetically by name. 'ls -rc' sorts by ctime (the time of the last 
status change, not strictly the creation time) in reverse, i.e. oldest 
first, then newer files, which seems to be what you want. 

In short, wget likely is not your problem. Find out what it really is 
and you can find a mitigation. 

As a 'dumb' work-around, save your files into a temp directory, then 
move them to Images\ in the order of occurrence in input.txt. 
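
A rough sketch of that work-around follows, under two assumptions that are not part of the suggestion itself: each file was saved under the last path component of its URL, and a counter prefix is added during the move so that the order stays visible regardless of how the folder is sorted. The function name `move_in_input_order` is hypothetical.

```python
import posixpath
import shutil
from pathlib import Path
from urllib.parse import unquote, urlsplit


def move_in_input_order(urls, tmp_dir, dest_dir):
    """Move downloaded files from tmp_dir to dest_dir in the order of `urls`.

    Each moved file gets a zero-padded position prefix (an addition made
    here, not something wget does). Returns the new names in order.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    width = len(str(len(urls)))
    moved = []
    for i, url in enumerate(urls, start=1):
        # Assumption: the local name equals the URL's last path component.
        name = posixpath.basename(unquote(urlsplit(url).path))
        src = Path(tmp_dir) / name
        if not name or not src.is_file():
            continue  # failed download, or a name wget chose differently
        target = dest / f"{i:0{width}d}_{name}"
        shutil.move(str(src), str(target))
        moved.append(target.name)
    return moved
```

This keeps the single `wget -i input.txt` run into the temp directory and only adds a cheap local pass afterwards; the position number of a skipped URL is left unused so the gaps show which downloads are missing.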

Regards, Tim 

