
Re: [Chicken-users] BOM in a Scheme source file


From: Elf
Subject: Re: [Chicken-users] BOM in a Scheme source file
Date: Sun, 9 Sep 2007 18:22:40 -0700 (PDT)


everything i've seen on the unicode site itself seems to discourage the use of
a BOM outside of protocol-ambiguous cases, since it's not a necessary object.
it's not an easy thing to be tolerant of in code text, although it is
relatively easy to be tolerant of it in plain text.

possibilities: is it possible to read the unicode strings from a different
file in windows with the bom?  using the BOM as a recognition mechanism is
considered broken behaviour, in general, for utf-8.

-elf

On Sun, 9 Sep 2007, Shawn Rutledge wrote:

On 9/8/07, Elf <address@hidden> wrote:
    and does not state anything about byte order.[1] Quite a lot of
    Windows software (including Windows Notepad) adds one to UTF-8 files.
    However in Unix-like systems (which make heavy use of text files for
    configuration) this practice is not recommended, as it will interfere

On 9/8/07, Elf <address@hidden> wrote:
why not fix scite to not put in chars it shouldn't?

It seems to be using the BOM to recognize that the file is UTF-8.  If
I remove the BOM and then re-open the file, it does not detect UTF-8
mode, and I see every byte as a separate character.  I can manually
select UTF-8 mode on the menu, and then if I re-save the file, it puts
back the BOM.

This is indeed what many Windows programs do.  At my job we have a
UTF-8 SQL script that creates a Firebird database, and have been
putting the BOM at the beginning of that; I will have to check if
Firebird can recognize it as UTF-8 without the BOM.  I'm guessing it
might not.  (What I'm doing with UTF-8 and Chicken doesn't have
anything to do with my job though.)

Instead, you think Scite should assume that when it sees any bytes
with the MSB set, the file is UTF-8?  Or there is a better way to
detect it?
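
One common answer to the detection question is to validate the bytes rather than just look for a set MSB: text in a legacy 8-bit encoding rarely happens to form well-formed UTF-8 sequences. The sketch below is a heuristic only, not what SciTE or any other editor actually does; the function name `guess_encoding` and its three return labels are made up for illustration.

```python
def guess_encoding(data: bytes) -> str:
    """Heuristically classify raw file bytes (hypothetical helper).

    Returns "ascii" if no byte has the MSB set, "utf-8" if the bytes
    form well-formed UTF-8, and "unknown" otherwise.
    """
    # No high bytes at all: ASCII, so there is nothing to decide.
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        # Python's decoder is strict by default and rejects malformed sequences.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"
```

The check can still be fooled by short inputs in another encoding that happen to be valid UTF-8, so it is a guess and not a proof.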

Vim recognizes the UTF-8 sequences correctly with or without the BOM;
and if I save the file, it will preserve the BOM if it was there
initially, but will not add the BOM if it was absent.
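
The preserve-but-never-add behaviour described here can be sketched as a load/save pair that remembers whether a BOM was present. This is an illustration of the idea only, not Vim's implementation; `load_text` and `save_text` are hypothetical names.

```python
import codecs


def load_text(data: bytes):
    """Decode UTF-8 bytes, returning (text, had_bom)."""
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        # Drop the three BOM bytes (EF BB BF) before decoding.
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8"), had_bom


def save_text(text: str, had_bom: bool) -> bytes:
    """Encode text as UTF-8, writing a BOM only if one was there on load."""
    body = text.encode("utf-8")
    return codecs.BOM_UTF8 + body if had_bom else body
```

Round-tripping a file through `load_text` and `save_text` leaves its bytes unchanged whether or not it started with a BOM.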

It would be nice if Chicken were tolerant as well, since the BOM is so common.
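
For comparison, tolerance on the reading side can be as small as skipping a BOM when it is the very first thing in the input. The sketch below uses Python's standard `utf-8-sig` codec to show the idea; it says nothing about how Chicken's reader works or would implement this, and `read_source` is a hypothetical name.

```python
def read_source(data: bytes) -> str:
    """Decode source bytes as UTF-8, ignoring a single leading BOM if present."""
    # "utf-8-sig" strips a BOM only at the start; one appearing later in the
    # data is kept as the character U+FEFF.
    return data.decode("utf-8-sig")
```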

