Re: [Classpathx-xml] GNUJAXP.JAR Bug in gnu.xml.pipeline.TextConsumer co
From: Arnd Beißner
Subject: Re: [Classpathx-xml] GNUJAXP.JAR Bug in gnu.xml.pipeline.TextConsumer constructor
Date: Wed, 3 Apr 2002 10:25:21 +0100
First, thank you for taking the time to reply.
> > when the TextConsumer class is used to emit plain XML
> > (not specially handled XHTML), the XML declaration is
> > missing in the output TextConsumer produces.
>
> Not necessarily a bug. And your ad-hoc fix is a NOP,
> since the default encoding (given no XML declaration)
> is already UTF-8 ... as it says in the XML REC! :)
Well, no, it's not really a NOP. True, it doesn't change the
encoding, but it DOES cause the XML declaration line to be
written. The XMLWriter implementation differentiates between
UTF-8 being set explicitly and UTF-8 being implicitly active.
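To make the distinction concrete, here is a minimal sketch of the kind of
logic I mean. It is NOT the actual XMLWriter source; the class, field and
method names are made up for illustration, and it simply assumes that the
declaration is written only when an encoding name was passed in explicitly.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for illustration only, NOT gnu.xml.util.XMLWriter.
class DeclSketchWriter {
    private final Writer out;
    private final String explicitEncoding; // null means nobody set one

    DeclSketchWriter(Writer out, String explicitEncoding) {
        this.out = out;
        this.explicitEncoding = explicitEncoding;
    }

    // The effective encoding is UTF-8 in both cases...
    String effectiveEncoding() {
        return explicitEncoding != null ? explicitEncoding : "UTF-8";
    }

    // ...but (by assumption) the declaration appears only in the explicit case.
    void startDocument() throws IOException {
        if (explicitEncoding != null) {
            out.write("<?xml version=\"1.0\" encoding=\""
                      + explicitEncoding + "\"?>\n");
        }
    }

    // Helper: render a trivial document into a String.
    static String render(String explicitEncoding) {
        StringWriter sw = new StringWriter();
        try {
            new DeclSketchWriter(sw, explicitEncoding).startDocument();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        sw.write("<doc/>");
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(null));    // implicit UTF-8
        System.out.println(render("UTF-8")); // explicit UTF-8
    }
}
```

Both calls use the same encoding, yet only the second one produces output
that starts with an XML declaration. That is why the fix is not a NOP.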
> In such a case, yet another way to ensure there's an XML
> declaration is to just write it directly to the output stream
> yourself... :) But I'd not encourage such tricks, since strings
> are by definition _not encoded_ ... putting any kind of
> encoding declaration in them is error prone, since you may
> not know the encoding that'll be used when it's eventually
> encoded onto some OutputStream.
All true, I just don't see what encodings have to do with
emitting an XML declaration. If there's no defined encoding, fine.
Then the XML declaration just doesn't contain an encoding, but may
still contain a version attribute, besides being there in the first place 8-).
I'm happy to learn good reasons for not emitting an XML declaration
(besides old browser issues), but so far this behaviour makes no sense to me.
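As a sketch of what I would expect instead (hypothetical helper, not
anything that exists in GNUJAXP): the XML REC makes the encoding part of
the declaration optional while the version is required, so a writer can
always emit the declaration and just leave the encoding out when it is
unknown.

```java
// Hypothetical helper for illustration only, not part of GNUJAXP.
class XmlDeclSketch {
    // Builds an XML declaration; encoding may be null if it is unknown.
    static String declaration(String version, String encoding) {
        StringBuilder sb = new StringBuilder("<?xml version=\"");
        sb.append(version).append('"');
        if (encoding != null) {
            // Only declare an encoding when one is actually known.
            sb.append(" encoding=\"").append(encoding).append('"');
        }
        return sb.append("?>").toString();
    }

    public static void main(String[] args) {
        System.out.println(declaration("1.0", null));
        System.out.println(declaration("1.0", "UTF-8"));
    }
}
```

With no encoding this gives a bare version-only declaration, which is
still well-formed and says nothing wrong about how the characters will
eventually be encoded.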
If I look at the XMLWriter spec:
> public XMLWriter(java.io.Writer writer)
>
> Constructs a handler which writes all input to the writer,
> and then closes the writer when the document ends. If an
> XML declaration is written onto the output, and this class
> can determine the name of the character encoding for this
> writer, that encoding name will be included in the XML declaration.
and at the XMLWriter(Writer writer, String encoding) constructor:
> * Constructs a handler which writes all input to the writer, and then
> * closes the writer when the document ends. If an XML declaration is
> * written onto the output, this class will use the specified encoding
> * name in that declaration. If no encoding name is specified, no
> * encoding name will be declared unless this class can otherwise
> * determine the name of the character encoding for this writer.
this lets me assume that if I don't otherwise prevent emitting an XML
declaration (by toggling canonicalization or whatever), I will get an
XML declaration even if there's no encoding. And I think what the spec
says is OK; it just doesn't match the implementation.