utf-16le vs utf-16-le
From: Eli Zaretskii
Subject: utf-16le vs utf-16-le
Date: Sun, 13 Apr 2008 10:54:30 -0400
These two encodings have confusingly similar names, but significantly
different semantics: one expects a BOM, the other does not. (I'll bet
a sixpack of beer that most of you will not know which one is which.)
A similar problem exists with the -be variant of UTF-16.
The fact that we have utf-16le-with-signature, but don't have the
corresponding -without-signature, also doesn't help.
I tripped over these when I tried to read debugging logs saved by
MS-Windows, which are in UTF-16 without a BOM: I used utf-16-le, which
swallowed the first character. Once I realized the cause was a BOM
that was expected but absent, I had to read the doc string of each
encoding to find out what I had done wrong.
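
To illustrate the failure mode, here is a minimal Python sketch. It is not
Emacs code and does not reproduce how Emacs implements these coding systems.
The function decode_assuming_bom is a hypothetical model of a decoder that
takes the BOM for granted, and the sample log text is invented for the
example. Only the codecs.BOM_UTF16_LE constant and the "utf-16-le" codec are
real Python standard library features.

```python
import codecs


def has_utf16le_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-16 little-endian BOM (FF FE)."""
    return data.startswith(codecs.BOM_UTF16_LE)


def decode_assuming_bom(data: bytes) -> str:
    """Hypothetical decoder that takes the BOM for granted.

    It skips the first two bytes unconditionally, so on BOM-less input
    it drops the first character of the text.
    """
    return data[2:].decode("utf-16-le")


def decode_utf16le(data: bytes) -> str:
    """Decode UTF-16LE, stripping a BOM only when one is really present."""
    if has_utf16le_bom(data):
        data = data[len(codecs.BOM_UTF16_LE):]
    return data.decode("utf-16-le")


# Invented sample data: the same text with and without a leading BOM.
log_without_bom = "Debug log".encode("utf-16-le")
log_with_bom = codecs.BOM_UTF16_LE + log_without_bom

print(decode_assuming_bom(log_without_bom))  # first character is lost
print(decode_utf16le(log_without_bom))       # text is intact
print(decode_utf16le(log_with_bom))          # BOM stripped, text is intact
```

Checking for the BOM before stripping it makes the same function safe for
both kinds of input. That is one possible reading of what a single,
self-explanatory encoding name could offer.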
Can we please come up with some more self-explanatory names, and lose
the confusing le vs -le thing? Please?