Re: [lmi] What encoding does wx_test console output use?


From: Greg Chicares
Subject: Re: [lmi] What encoding does wx_test console output use?
Date: Sat, 8 Sep 2018 14:28:38 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 2018-09-08 12:36, Vadim Zeitlin wrote:
> On Sat, 8 Sep 2018 09:57:24 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> $wine ./wx_test --ash_nazg --data_path=/opt/lmi/data --pyx=only_new_pdf >../src/lmi/wx_test_output
> GC> $file -bi wx_test_output
> GC> application/octet-stream; charset=binary
> GC> 
> GC> I'd like to filter this, removing expected lines and leaving only
> GC> unexpected--much as 'nychthemeral_test.sh' does for other tests
> GC> with its '_clutter' sed scripts. I suppose
> GC>   iconv -t UTF-8 -f SOME_ENCODING wx_test_output
> GC> might work, for some value (what?) of SOME_ENCODING.
> 
>  Under "genuine" MSW it would be UTF-16, but I didn't test if it was the
> same thing under Wine. I'd expect it to be...

Yes, thanks, this converts it:
  iconv -t UTF-8 -f UTF-16
I should have thought to try that, but I figured that if it were
UTF-16, 'file' would have detected it.
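
For the filtering goal quoted above, here is a minimal sketch of
converting and then dropping expected lines in one pipeline. The sample
text and the '^NOTE: ' pattern are hypothetical stand-ins, not the real
wx_test output or an actual '_clutter' script. It names UTF-16LE
explicitly so that the result doesn't depend on whatever byte order
iconv assumes for BOM-less input.

```shell
# Hypothetical sample standing in for the real wx_test output: two lines
# of BOM-less UTF-16LE text, built by widening ASCII with iconv itself.
workdir=$(mktemp -d)
printf 'NOTE: expected line\nunexpected line\n' |
    iconv -f UTF-8 -t UTF-16LE >"$workdir/sample.raw"

# Name the byte order explicitly, then delete lines matching a
# hypothetical "expected output" pattern, leaving only the rest.
filtered=$(iconv -f UTF-16LE -t UTF-8 "$workdir/sample.raw" |
    sed -e '/^NOTE: /d')
printf '%s\n' "$filtered"
rm -rf "$workdir"
```

In practice the sed expressions would presumably live in a script file,
one per expected line, the way the other tests' filters do.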

'file' does detect UTF-16 if I iconv it back to UTF-16, though:

$iconv -t UTF-8 -f UTF-16 gui_test_output.raw >gui_test_output.txt 
$iconv -f UTF-8 -t UTF-16 gui_test_output.txt >gui_test_output.16  
$file gui_test_output.*
gui_test_output.16:  Little-endian UTF-16 Unicode text
gui_test_output.raw: data
gui_test_output.txt: ASCII text

The file grows by two bytes when I convert it back to UTF-16.

$od -t x1 gui_test_output.16 |head -1
0000000 ff fe 4e 00 4f 00 54 00 45 00 3a 00 20 00 73 00

The initial U+FEFF BOM (the leading 'ff fe', which accounts for the
two extra bytes) is apparently what 'file' needs. But it's optional,
and I guess it's not customary to include it when writing to stdout
on msw.
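
A rough sketch of checking for that BOM directly, and of prepending it
by hand, follows. The helper name has_utf16le_bom and the sample file
are made up for illustration; only od, tr, cat, wc, and printf are
assumed, so it doesn't depend on what 'file' decides.

```shell
workdir=$(mktemp -d)
# Hypothetical BOM-less sample: "NOTE: " encoded as UTF-16LE.
printf 'NOTE: ' | iconv -f UTF-8 -t UTF-16LE >"$workdir/sample.raw"

# Succeeds if the file starts with bytes ff fe (U+FEFF, little-endian).
has_utf16le_bom() {
    [ "$(od -An -t x1 -N 2 "$1" | tr -d ' \n')" = "fffe" ]
}

# Prepend the BOM by hand: \377\376 is octal for ff fe.
{ printf '\377\376'; cat "$workdir/sample.raw"; } >"$workdir/sample.16"

raw_size=$(($(wc -c <"$workdir/sample.raw")))
bom_size=$(($(wc -c <"$workdir/sample.16")))
has_utf16le_bom "$workdir/sample.raw" && raw_has_bom=yes || raw_has_bom=no
has_utf16le_bom "$workdir/sample.16" && new_has_bom=yes || new_has_bom=no
echo "raw: $raw_size bytes, BOM: $raw_has_bom"
echo "with BOM: $bom_size bytes, BOM: $new_has_bom"
rm -rf "$workdir"
```

The two sizes differ by exactly the two BOM bytes, matching the growth
seen in the round trip through iconv.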


