Re: Problems with file encoding
From: Jordi Gutiérrez Hermoso
Subject: Re: Problems with file encoding
Date: Thu, 12 May 2011 15:38:20 -0500

On 12 May 2011 13:46, Richard Balogh <address@hidden> wrote:
> From Your samples it is clear that non-working file is Unicode encoded
> file, and working file is ASCII encoded. The difference is that in Unicode
> each character requires two bytes.

I couldn't see the file being referred to, but I've seen this problem
before with encodings. To clarify, Unicode isn't a single encoding: it
is a character set with a family of encodings, and not all of them
require two bytes per character. UTF-8, for example, is a Unicode
encoding that uses one byte for ASCII characters, so UTF-8 and ASCII
agree on files that use no codepoints beyond those defined by ASCII.
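
To make that concrete, here is a small sketch in Python (not Octave, chosen only because its string encoding functions are easy to inspect). The sample strings are made up for illustration. It compares the bytes produced by the ASCII and UTF-8 codecs for pure-ASCII text, then shows that a non-ASCII character takes more than one byte in UTF-8.

```python
# Pure-ASCII text: the ASCII and UTF-8 encodings produce identical bytes.
text = "plain ASCII text"
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
same_for_ascii = ascii_bytes == utf8_bytes

# Text with one non-ASCII character (U+00E9, e with acute accent).
# UTF-8 needs two bytes for that character, so the byte count
# exceeds the character count by one.
accented = "Guti\u00e9rrez"
utf8_accented = accented.encode("utf-8")
extra_bytes = len(utf8_accented) - len(accented)

print(same_for_ascii, len(accented), len(utf8_accented))
```

Trying `accented.encode("ascii")` would raise `UnicodeEncodeError`, which is the other side of the same point: the two encodings only agree while the text stays within ASCII.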

I have seen that Windows sometimes uses UTF-16, and *that* encoding
does use at least two bytes per character. I don't think there is a
way to make Octave guess the encoding, but perhaps it could be told
what the encoding is. That's a development I don't particularly want
to undertake myself, though.
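
For illustration only, here is a rough Python sketch of what "being told or guessing" could look like outside Octave. It is not anything Octave does, and `sniff_bom` is a hypothetical helper name. It only recognises files that start with a byte order mark (BOM), which UTF-16 files written on Windows often have but are not required to have, so returning `None` means "unknown", not "ASCII".

```python
import codecs


def sniff_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None.

    Hypothetical helper: it inspects only the first few bytes and
    cannot identify files written without a BOM.
    """
    # Check UTF-32 first: its little-endian BOM begins with the same
    # two bytes as the UTF-16 little-endian BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None


sample = "x = 1;\n"

# Python's "utf-16" codec writes a 2-byte BOM, then 2 bytes per
# character for text like this.
utf16_bytes = sample.encode("utf-16")
utf16_length = len(utf16_bytes)

guessed = sniff_bom(utf16_bytes)
decoded = utf16_bytes.decode(guessed)

# The same text saved as plain ASCII has no BOM, so nothing is detected.
plain_guess = sniff_bom(sample.encode("ascii"))
```

A reader that gets `None` back would still have to fall back on a default or on an encoding named by the user, which is the "be told what the encoding is" option.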
- Jordi G. H.