Re: [lmi] Why should we sort XML documents?


From: Vadim Zeitlin
Subject: Re: [lmi] Why should we sort XML documents?
Date: Tue, 6 Mar 2018 01:30:03 +0100

On Mon, 5 Mar 2018 22:41:44 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2018-03-05 18:50, Vadim Zeitlin wrote:
GC> [...]
GC> > GC> [...if we omit sorting, then...] We really have nothing to gain.
GC> > 
GC> >  I somewhat disagree with this too. Simplification is always nice, and
GC> > removing the need to sort the cells here would allow us to not use
GC> > libxsltwrapp (and hence libxslt) at all any longer once the XSL-FO code is
GC> > finally removed which is, IMHO, a not-negligible payoff for very little
GC> > work.
GC> 
GC> Wow, I didn't see that--I somehow thought that XSD (and RNG) were
GC> supported by libxslt, but no, libxml2 does that.

 Sorry, I feel like I'm missing something here. Just to make it perfectly
clear: I didn't mean at all to say that libxml2 supported using RNG
directly (although it indeed does), just that we wouldn't need to use
libxsltwrapp to apply the XSLT used for sorting XML documents any longer.

 So I just don't see any link between what I wrote and your reply, which
bothers me -- even though it doesn't change the fact that both what you
wrote and, I think, what I wrote, are true.

GC> We absolutely will not add a new production dependency on "java", so we
GC> can't use 'jing', which means we can't validate with RNC. I.e., we can
GC> maintain the authoritative sources as RNC, but we don't have an RNC
GC> validator that we can use in production.
GC> 
GC> But we can translate RNC to RNG, and libxml2 can use RNG, which, AIUI,
GC> is just a different but equivalent representation of RNC: IOW, it's a
GC> lossless translation.

 Yes. And you're right that RNC to XSD translation is not lossless in
general, but it is lossless for the vast majority of schemas (i.e. most
reasonable ones), including all those that we use.

GC> And, because we've been relying exclusively on XSD, we've sorted the
GC> input--because XSD is less capable than RN[CG], and sorting partially
GC> mitigates that loss of capability.

 No, sorry, I just don't think this is true. XSD can validate unsorted cell
elements just fine right now. According to the link that you gave
previously (https://lists.nongnu.org/archive/html/lmi/2012-10/msg00000.html),
the real reason for sorting is that the definition of these elements might
change in the future (but, again, this hasn't happened in at least 5, and
probably many more, years) and then XSD might not be sufficiently powerful
to validate them. But in the current state there is no problem with using
XSD at all.
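
 To illustrate the point, here is a minimal sketch, in Python rather than
XSD or XSLT: a check that accepts the required child elements in any order
gives the same verdict with or without a sorting pre-pass. The element names
and both helper functions are made up for the example; they are not lmi's
actual schema or code.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, for illustration only (not lmi's real schema).
REQUIRED = {"Alpha", "Beta", "Gamma"}


def children_valid_any_order(cell):
    """Accept a cell whose children are exactly REQUIRED, once each, in any order."""
    tags = [child.tag for child in cell]
    return len(tags) == len(REQUIRED) and set(tags) == REQUIRED


def sorted_copy(cell):
    """Return a new parent element holding the same children, ordered by tag name."""
    out = ET.Element(cell.tag, cell.attrib)
    out.extend(sorted(cell, key=lambda child: child.tag))
    return out


unsorted_cell = ET.fromstring(
    "<cell><Gamma>3</Gamma><Alpha>1</Alpha><Beta>2</Beta></cell>"
)
sorted_cell = sorted_copy(unsorted_cell)

# The order-insensitive check agrees on both forms of the document.
print(children_valid_any_order(unsorted_cell), children_valid_any_order(sorted_cell))
```

 Sorting would only start to matter for a validator that insists on one
fixed order of children, which is not the situation described above.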

[...]
GC> Of course, I'm assuming that xmlwrapp already handles RNG,

 No, it doesn't.

GC> or can easily be extended to do so;

 I think so, but, as usual, it would probably be faster to just do it than
to try to estimate the time needed to do it.

GC> and that it handles RNG in the same way that xmllint does, which seems
GC> highly probable.

 Yes, absolutely.

GC> But once we've made sure of those preconditions, getting rid of libxslt
GC> altogether is well worth the effort required to alter the schemata.

 What I still don't understand is why we should do all this instead of just
getting rid of libxslt _without_ doing all this. In the worst case, i.e. if
we ever need RelaxNG features not supported by XML Schema in the future, we
could always do this later. What is the motivation for doing it right now?

GC> BTW, when I searched the web to double-check which xmlsoft library
GC> supports schema validation, I stumbled upon this mention of "schema":
GC> 
GC>   https://vslavik.github.io/xmlwrapp/manual/stylesheet_8h_source.html
GC> |   Errors are handled by @a on_error handler; by default, xml::exception
GC> |   is thrown on errors. If there's a fatal error that prevents the schema
GC> 
GC> which occurs twice, once for each ctor. AFAICT, "schema" should be
GC> changed to "stylesheet" in these two locations.

 Thanks, I've fixed this in Git, but I'm not sure if the above URL is
updated automatically or if it will only be done after the next release.

 Regards,
VZ

