
Re: [Chicken-users] SSAX, utf8 and the byte order mark


From: John Cowan
Subject: Re: [Chicken-users] SSAX, utf8 and the byte order mark
Date: Fri, 5 Jan 2007 15:54:03 -0500
User-agent: Mutt/1.3.28i

Charles Breathe scripsit:

> In the process of trying to write a script in Chicken I attempted
> to use the SSAX XML->SXML function with a stream that begins with a
> UTF-8 byte order mark. Unfortunately the function dies when it reads
> the BOM. Currently I'm converting it to a stream and then filtering
> out the offending characters, but that seems terribly ugly. Is there a
> better approach? Is this something that the XML->SXML function should
> be handling itself?

[putting on XML Core Working Group hat]

This is an XML->SXML issue.  For a long time it wasn't too clear whether
UTF-8-encoded XML documents were allowed to contain a BOM or not.
(UTF-16-encoded documents are required to do so.)  As a result of an
erratum, the XML Recommendation now requires that UTF-8 BOMs be accepted
and ignored.
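
Until the parser handles this itself, a workaround along these lines
might do. It is only a sketch: skip-utf8-bom is a hypothetical helper,
not part of SSAX, and it assumes the port yields either raw bytes as
characters (BOM seen as EF BB BF) or decoded characters (BOM seen as
U+FEFF).

```scheme
;; Sketch: drop a leading UTF-8 BOM from PORT, then return PORT.
;; skip-utf8-bom is a hypothetical name, not an SSAX procedure.
(define (skip-utf8-bom port)
  (let ((c (peek-char port)))
    (cond ((eof-object? c) port)
          ;; Byte-oriented port: the BOM arrives as the bytes EF BB BF.
          ;; Assumes a leading EF can only begin a BOM, since a
          ;; well-formed document otherwise starts with '<' or whitespace.
          ((= (char->integer c) #xEF)
           (read-char port)
           (read-char port)
           (read-char port)
           port)
          ;; Decoding port: the BOM arrives as the single char U+FEFF.
          ((= (char->integer c) #xFEFF)
           (read-char port)
           port)
          ;; No BOM: leave the port untouched.
          (else port))))
```

The port can then be handed to the parser as usual, for example
(SSAX:XML->SXML (skip-utf8-bom port) '()), assuming your SSAX build
exports the function under that name and with that argument order.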

-- 
Even the best of friends cannot                 John Cowan
attend each others' funeral.                    address@hidden
        --Kehlog Albran, The Profit             http://www.ccil.org/~cowan



