bug-wget

Re: Wget: Adding a prefix to downloaded files?


From: Tim Rühsen
Subject: Re: Wget: Adding a prefix to downloaded files?
Date: Tue, 17 Dec 2019 13:11:51 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.3.0

Hi Michel,

wget has no parallel processing. Wget2 has.

Regards, Tim


On 12/17/19 12:56 PM, address@hidden wrote:
> Hi Tim, 
> 
> It seems completely logical that Wget --- or any application for that matter 
> --- works through an input list sequentially. 
> But the resulting order might depend upon whether Wget only handles a single 
> file at a time, or whether it is capable of processing several files in 
> parallel. 
> I suppose the answer is a single file only, as I cannot find anything about 
> parallel processing in the Manual. 
> But I wouldn't put money on it. 
> 
> On the other hand, I may have been tricked by the settings of Windows 
> Explorer when wondering if the file size had an impact. 
> Indeed, when I try to double-check this behavior, it turns out that the 
> downloads simply finish too quickly to confirm anything of the kind 
> visually! 
> 
> Also, you are right in pointing out that in fact the target directory is 
> "ruled" by local settings (e.g. a folder in Windows Explorer), including the 
> sort order, which can have a confusing effect. 
> Some further testing taught me that in this particular case I also needed to 
> change the time-field switch for the DOS DIR command. Indeed, 
> 
> DIR /O:D /T:C 
> 
> sorts files by D(ate), and uses the C(reation) date to do so. 
> (The defaults are W (= last written) for the time field, and 
> "sort-of-alphabetically" if no /O (ordering) switch is applied. See: 
> https://ss64.com/nt/dir.html 
> https://devblogs.microsoft.com/oldnewthing/20140304-00/?p=1603 ) 
> 
> 
> Windows Explorer offers many more possibilities apart from its default values 
> ("Date Created" and/or "Date Modified", I'm not really sure). 
> See the following screen shot (if that's of any use; I'm not sure if this 
> forum persists them): 
> 
> 
> The problem being that Windows Explorer itself does not explain what they 
> mean... So in a sense they are useless. 
> That is more than an idle remark: the default "Date created" in Windows 
> Explorer does NOT give the same output as its (apparent) DOS equivalent! 
> 
> The same goes for the other date types offered by Windows Explorer: none of 
> them matches the output of the above DIR command... 
> ("Date acquired", "Date archived", "Date completed", "Date received", "Date 
> released", and "Date sent" are even empty.) 
> Typical MS clumsiness, I guess. 
> 
> If you want an example of the mess MS keeps making of Date/Time fields, have 
> a look here: 
> https://superuser.com/questions/147525/what-is-the-date-column-in-windows-7-explorer-it-matches-no-date-column-from
> Apparently, their meaning changes between versions (Win7 or Win10), and even 
> among Win10 releases... Go figure! 
> 
> Nevertheless, thx to your feedback I've been able to confirm that indeed, 
> this is not a Wget issue. 
> I suppose I can use this info to work around Wget's missing option for a 
> prefix/counter. (which remains the bottom line and triggered this question in 
> the first place) 
> 
> PS: 
> The workaround you suggest is of the same type as the ones mentioned 
> before. 
> Yes, it could be done by calling Wget once per image to download, and 
> (externally) adding a prefix (counter) for every single download. 
> But any such workaround would miss out on the efficiency of feeding Wget a 
> plain-text input file. 
> And I can only repeat that such a feature could add some power to Wget, as it 
> would avoid cumbersome workarounds. 
> 
> Thx again for all the feedback received, 
> 
> MK 
> 
> 
> 
> From: "Tim Rühsen" <address@hidden> 
> To: "Michel Kempeneers" <address@hidden>, "bug-wget" <address@hidden> 
> Sent: Friday, 13 December 2019 15:39:24 
> Subject: Re: Wget: Adding a prefix to downloaded files? 
> 
> On 12/12/19 1:25 PM, address@hidden wrote: 
> 
> 
> Hi, 
> 
> I run into a particular problem when I'm trying to download a bunch of URLs I 
> grouped together in file "input.txt" like this: 
> 
> wget -nv -a log.txt -P .\Images\ -i input.txt 
> 
> Some of these files are huge, hence take a long time to download. 
> As a consequence, they will not appear in the same sort order in the 
> download folder as in the input file, and that's a problem, as this order 
> matters. 
> 
> 
> Since wget works sequentially, why do you think the order of downloads 
> has something to do with the file size ? 
> 
> If 'Images' is a fresh and empty directory *and* all files download OK, 
> the order in the directory is the same as the order in input.txt. At 
> least a sane file system should keep the order (is NTFS sane ?). 
> 
> Then, what is irritating: 'dir' or 'ls' tools like to use a certain sort 
> order by default. E.g. here on GNU/Linux 'ls' orders the output files 
> alphabetically by name. 'ls -rc' prints in reverse order of ctime (the 
> inode change time, which for freshly downloaded files is typically the 
> download time), so oldest first, then newer files, which seems to be what 
> you want. 
> 
> In short, wget likely is not your problem. Find out what it really is 
> and you can find a mitigation. 
> 
> As a 'dumb' work-around, save your files into a temp directory, then 
> move them to Images\ in the order of occurrence in input.txt. 
> 
> Regards, Tim 
> 
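
The workaround discussed in this thread (one bulk download with -i, then restoring the order of input.txt afterwards) could be scripted so that the counter prefix is added after Wget has finished. What follows is only a minimal sketch, not a Wget feature: the helper names planned_names and rename_in_order are hypothetical, and it assumes that every URL in input.txt ends in a unique, plain file name (no query string, no percent-escapes) and that Wget saved each file under exactly that name.

```python
import posixpath
from pathlib import Path
from urllib.parse import urlsplit


def planned_names(urls, width=None):
    """Pair each URL's base file name with a counter-prefixed name, in list order.

    Hypothetical helper: assumes the last path segment of the URL is the
    name Wget used for the local file.
    """
    urls = [u.strip() for u in urls if u.strip()]  # ignore blank lines
    width = width or len(str(len(urls)))           # zero-pad to a common width
    plan = []
    for index, url in enumerate(urls, start=1):
        name = posixpath.basename(urlsplit(url).path)
        plan.append((name, f"{index:0{width}d}_{name}"))
    return plan


def rename_in_order(input_file, download_dir):
    """Rename already-downloaded files so that sorting by name follows input_file."""
    urls = Path(input_file).read_text(encoding="utf-8").splitlines()
    folder = Path(download_dir)
    for old, new in planned_names(urls):
        source = folder / old
        if source.is_file():  # skip failed or differently named downloads
            source.rename(folder / new)
```

Run after the download has completed, for instance as rename_in_order("input.txt", "Images"). Because the prefix becomes part of the file name, any tool that sorts by name then shows the files in input.txt order, regardless of which date field the file manager uses.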

Attachment: signature.asc
Description: OpenPGP digital signature

