bug-wget

Re: Wget for Windows unicode issue


From: Tim Rühsen
Subject: Re: Wget for Windows unicode issue
Date: Tue, 12 May 2020 09:42:31 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0

Hi,

The default charset encoding on Windows is likely not UTF-8 (maybe
cp1252?), so UTF-8 characters read from myfile.txt are not converted
correctly.
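
To illustrate the suspected mismatch, here is a minimal Python sketch. It
assumes the file is written as UTF-8 and read back as cp1252, which is only
my guess at what happens on that system. The helper name `misread` is
hypothetical and is not part of Wget.

```python
# Hypothetical helper: encode text in one charset, then decode the same
# bytes in another, the way a program with the wrong default charset would.
def misread(text: str, written_as: str = "utf-8", read_as: str = "cp1252") -> str:
    return text.encode(written_as).decode(read_as)


url = "http://example.com/á.png"

# Assumption: file saved as UTF-8, read as cp1252.
# "á" is the two bytes C3 A1 in UTF-8, which cp1252 decodes as two characters.
print(misread(url))

# Assumption: file saved as "ANSI" (cp1252), read as cp1252.
# The single byte E1 round-trips unchanged.
print(misread(url, written_as="cp1252"))
```

If the guess is right, the first call shows the kind of garbled name
reported for the UTF-8 file, and the second shows why the ANSI file works.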

But you can possibly use --remote-encoding=utf-8.

From the wget man page:
       --remote-encoding=encoding
           Force Wget to use encoding as the default remote server
           encoding.  That affects how Wget converts URIs found in files
           from remote encoding to UTF-8 during a recursive fetch. This
           option is only useful for IRI support, for the interpretation
           of non-ASCII characters.

Regards, Tim

On 12.05.20 04:37, Leonid Pavel wrote:
> I'm trying to use wget for windows with unicode characters and getting
> issues with filename creation.
> 
> Passing in "wget http://example.com/á.png" directly works fine, however
> if I put the URL in a UTF-8 encoded file and run "wget -i myfile.txt",
> it downloads the file as "A¡.png" which is obviously incorrect.
> 
> Setting the file encoding as UTF-16 / UCS-2 just breaks entirely (tries
> to make a request to a gibberish URL)
> 
> However, writing the file as ANSI/ASCII works correctly. This works for
> my example, but it will surely fail for characters that cannot be
> represented as ASCII characters.
> 
> Is this not possible to fix? Why does mingw not take this into account?
> 
> 
