[Octave-bug-tracker] [bug #55452] fopen() does not support encoding argu
From: Andrew Janke
Subject: [Octave-bug-tracker] [bug #55452] fopen() does not support encoding argument
Date: Sat, 9 Mar 2019 09:54:10 -0500 (EST)
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
Follow-up Comment #8, bug #55452 (project octave):
I got a build of the current default branch and ran my tests. Several cases fail:
>> run_bug_55452_tests
Running fixed-text encoded file test ex-001:
Reference text: Hello, world! (13 chars)
running: ex-001 ISO-8859-1
decoded: Hello, world! (13 chars)
ok: ex-001 ISO-8859-1
running: ex-001 ISO-8859-15
decoded: Hello, world! (13 chars)
ok: ex-001 ISO-8859-15
running: ex-001 KOI8-R
decoded: Hello, world! (13 chars)
ok: ex-001 KOI8-R
running: ex-001 SHIFT_JIS
decoded: Hello, world! (13 chars)
ok: ex-001 SHIFT_JIS
running: ex-001 UTF-16
decoded: ��Hello, world! (28 chars)
FAIL: ex-001 UTF-16
running: ex-001 UTF-16 no-bom
decoded: Hello, world! (26 chars)
FAIL: ex-001 UTF-16 no-bom
Running fixed-text encoded file test ex-002:
Reference text: ありがとう丸 (18 chars)
running: ex-002 SHIFT_JIS
decoded: ���肪�Ƃ��� (12 chars)
FAIL: ex-002 SHIFT_JIS
running: ex-002 UTF-16
decoded: 0B0�0L0h0FN8 (13 chars)
FAIL: ex-002 UTF-16
Running fixed-text encoded file test ex-003:
Reference text: Kaßner Ökonom Schöps Übermut Müller (40 chars)
running: ex-003 ISO-8859-1
decoded: Ka�ner �konom Sch�ps �bermut M�ller (35 chars)
FAIL: ex-003 ISO-8859-1
running: ex-003 UTF-16
decoded: ��Ka�ner �konom Sch�ps �bermut M�ller (73 chars)
FAIL: ex-003 UTF-16
It looks like a few things are going on here:
- The BOM in UTF-16 files looks like it's being propagated to the decoded
string. That probably shouldn't happen.
- UTF-16 encoded text is being turned into too many chars. It looks like each
byte is coming through as its own char, so a \0 char shows up next to each
ASCII-range char.
>> fh = fopen('encoded-files/ex-001/txt-UTF-16.txt'); line = fgetl (fh);
fclose (fh);
>> line
line = ��Hello, world!
>> line == 0
ans =
0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
1 0
>>
- ISO-8859-1 doesn't seem to get converted to UTF-8.
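For reference, the char counts and the zero mask above are what you would see if the raw bytes of the file were passed through undecoded. Here is a minimal sketch that rebuilds those bytes by hand rather than reading the test file. It assumes the file is big-endian UTF-16 with a BOM (which is what the `line == 0` pattern suggests) and that `native2unicode` is available in the build:

```octave
% Sketch only: rebuilds the bytes by hand instead of reading the test file.
ref = 'Hello, world!';                       % 13 ASCII chars

% UTF-16BE stores each ASCII char as a 0 byte followed by the char's byte.
payload = uint8 (reshape ([zeros(1, numel (ref)); double(ref)], 1, []));
bom = uint8 ([254 255]);                     % FE FF, the big-endian BOM
raw = [bom, payload];

% 28 bytes with the BOM, 26 without: the same counts as the two failing cases.
printf ('%d bytes with BOM, %d without\n', numel (raw), numel (payload));

% Zero-byte mask: 0 0 for the BOM, then 1 0 repeated for each char.
disp (raw == 0);

% Decoding the payload (BOM stripped by hand) should give the reference back.
decoded = native2unicode (payload, 'UTF-16BE');
disp (strcmp (decoded, ref));
```

So the expected behavior would be 13 chars out for both UTF-16 files, with the BOM consumed rather than returned.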
This brings up another question: how can I read an entire text file in at once,
without looping over fgetl() for each line? `fscanf (fid, "%s")`? Would
`fread (fid, '*char')` be expected to work? (What _are_ the semantics for
reading chars with fread() on a stream with a non-native encoding?)
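One possible workaround in the meantime, sketched under assumptions: read the whole file as raw bytes and convert explicitly with `native2unicode`, which bypasses any stream-level encoding entirely. `slurp_text` is a hypothetical helper name, not an existing Octave function, and this does not answer what `fread (fid, '*char')` itself should do on an encoded stream.

```octave
% slurp_text is a hypothetical helper name, not an Octave built-in.
function txt = slurp_text (filename, encoding)
  fid = fopen (filename, 'rb');
  if (fid < 0)
    error ('slurp_text: could not open %s', filename);
  end
  unwind_protect
    % Read raw bytes so that no char conversion happens at the stream level.
    raw = fread (fid, Inf, 'uint8=>uint8')';
  unwind_protect_cleanup
    fclose (fid);
  end_unwind_protect
  % Convert from the given encoding to Octave's internal UTF-8 chars.
  txt = native2unicode (raw, encoding);
end
```

Usage would look like `txt = slurp_text ('encoded-files/ex-001/txt-UTF-16.txt', 'UTF-16');`. Whether a leading BOM is consumed in that case depends on the underlying conversion library, so it may still need to be stripped by hand.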
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?55452>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/