Re: [Qexo-general] How to get special characters in a query results
From: Per Bothner
Subject: Re: [Qexo-general] How to get special characters in a query results
Date: Mon, 30 Jun 2003 09:10:05 -0700
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4) Gecko/20030612
Rami RIFAIEH wrote:
> I am using Qexo implementation with an XML document containing characters
> such as (é, à, ê, ...) with ISO-8859-1 encoding.
> ...
> Is there any special method (function) to conserve these characters in the
> result document?
Preserving the non-ASCII characters in the sense of emitting 'é' as 'é'
but emitting '&#233;' as '&#233;' is not possible, because the data-model
does not distinguish them. It might be possible to distinguish them in the
TreeList representation, but it would be difficult to get that
consistent. I think we're stuck with 'é' and '&#233;' being treated the
same.
So the alternative is to change the output encoding. The new
"serialization" spec discusses an "encoding" parameter.
A complication is this recommendation:
    It is possible that the data model will contain a character that
    cannot be represented in the encoding that the processor is using
    for output. In this case, if the character occurs in a context where
    XML recognizes character references (that is, in the value of an
    attribute node or text node), then the character should be output
    as a character reference.
The problem is determining this. We can use FileWriter's getEncoding
method to find the name of the encoding, but checking whether that
encoding supports a given character requires, I believe, code that is
specific to JDK 1.4.x, which I'm trying to avoid.
Plus it may be a bit complicated. But we can hardwire a few common
encoding names.
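As a rough illustration of the hardwiring idea, here is a minimal sketch.
The class and method names (EncodingCheck, canRepresent) are hypothetical
and not part of gnu.xml; the list of names is an assumption and far from
complete. It accepts both the canonical spellings and the historical ones
that getEncoding tends to return (for example "ISO8859_1" and "UTF8"),
and treats any unrecognized encoding as ASCII-only, which matches the
current escaping behavior.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of gnu.xml.XMLPrinter.
final class EncodingCheck {
    private EncodingCheck() {}

    /**
     * Returns true if c is known to be representable in the named encoding.
     * Only a few hardwired names are recognized; anything else is treated
     * conservatively as ASCII-only.
     */
    static boolean canRepresent(String encodingName, char c) {
        // Normalize so "ISO-8859-1", "iso8859_1" and "ISO8859_1" compare equal.
        String name = encodingName.toUpperCase(Locale.ROOT)
                                  .replace("-", "")
                                  .replace("_", "");
        if (name.equals("ISO88591") || name.equals("LATIN1")) {
            // ISO-8859-1 maps exactly the code points 0 to 255.
            return c <= 0xFF;
        }
        if (name.equals("UTF8") || name.equals("UTF16")) {
            // Unicode encodings; surrogate handling is out of scope here.
            return true;
        }
        // US-ASCII and every unrecognized encoding: plain ASCII only.
        return c < 127;
    }
}
```

A caller would pass in the name returned by the writer's getEncoding
method and emit a character reference whenever canRepresent returns
false. On JDK 1.4 or later, java.nio.charset.CharsetEncoder.canEncode
answers the same question for any installed charset.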
In the short term, you can just remove these two lines in the
writeChar method in gnu.xml.XMLPrinter:
    else if (v >= 127)
      super.write("&#"+v+";");
However, if we take this out, then people who use a character
not supported by the FileWriter's encoding will get nasty errors.
Hence the existing code: It's simple, correct, and safe.
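To make the effect of that branch concrete, here is a standalone sketch
of the same rule applied to a whole string. It is not the actual
XMLPrinter source: the class and method names (CharRefSketch,
escapeNonAscii) are hypothetical, and the escaping of markup characters
such as '<' and '&' is deliberately left out.

```java
// Hypothetical illustration; mirrors only the "v >= 127" branch.
final class CharRefSketch {
    private CharRefSketch() {}

    /** Replaces every char with value 127 or above by a decimal character reference. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            int v = text.charAt(i);
            if (v >= 127)
                out.append("&#").append(v).append(';');
            else
                out.append((char) v);
        }
        return out.toString();
    }
}
```

With this rule the output is plain ASCII regardless of the writer's
encoding, so "café" comes out as "caf&#233;".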
--
--Per Bothner
address@hidden http://per.bothner.com/