
[lmi] Order of subelements for schema validation


From: Greg Chicares
Subject: [lmi] Order of subelements for schema validation
Date: Mon, 22 Oct 2012 23:35:29 +0000
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20121010 Thunderbird/16.0.1

External systems may create xml input files that present <cell>
subelements in a different order than lmi's schema specifies.
This doesn't matter to our xml parser. But it does matter for
schema validation, which we want to perform on all these files.
We have three options:

(1) Require external systems to write subelements in exactly the
order the schema accepts, as suggested here:
  http://www.ibm.com/developerworks/xml/library/x-eleord/index.html
| you might find that requiring order by default is the easiest
| way to ensure WXS friendliness
This is not practicable. Our order has no semantic significance
(it's just alphabetical, for readability and maintainability), so
it affords no strong reason to impose an inconvenient requirement
on multiple external systems.

(2) Liberalize the schema to accept elements in any order. This
is easy enough with Relax NG: just use <interleave>. However, we
also want to support the more widely used W3C schema language,
which has no equally general facility:
  http://books.xmlschemata.org/relaxng/relax-CHP-6-SECT-5.html
| If you need to insure that it will also be possible to model
| your vocabulary with a more rigid schema language such as W3C
| XML Schema, you will often have to restrict the usage of
| interleave patterns in your RELAX NG schemas.
There are partial workarounds, e.g.:
  http://stackoverflow.com/questions/3347822/validating-xml-with-xsds-but-still-allow-extensibility#tab-top
  http://stackoverflow.com/questions/104248/is-it-possible-in-w3cs-xml-schema-language-xsd-to-allow-a-series-of-elements?answertab=active#tab-top
but replacing <xs:sequence> with <xs:all> would mean we could
never add a subelement that occurs more than once (as might be
desirable, e.g., for the insureds on a multiple-life policy, or
for multiple fund selections). We don't want to paint ourselves
into that corner, and the other workarounds are too complicated.

(3) Sort the data before applying the schema. This is the best
option: it preserves future flexibility without imposing rigid,
costly requirements on others. File 'sort_cell_subelements.xsl'
in the lmi repository performs this sorting. Example:
  $ xsltproc sort_cell_subelements.xsl sample.ill \
    | diff --report-identical-files --strip-trailing-cr sample.ill -
  Files sample.ill and - are identical
In production, we plan to validate all files automatically when
they're loaded, and adding one more automated step costs little.
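
For readers who want to see the idea of option (3) outside xslt, here
is a minimal sketch in Python. It is not the contents of
'sort_cell_subelements.xsl'; it only assumes that sorting means
ordering each <cell>'s children alphabetically by element name. The
root element and the subelement names in the sample are made up for
illustration.

```python
import xml.etree.ElementTree as ET


def sort_cell_subelements(root):
    """Sort the children of every <cell> alphabetically by tag name, in place."""
    # Materialize the list first so the tree is not modified while iterating.
    for cell in list(root.iter("cell")):
        # sorted() is stable, so repeated tags keep their relative order.
        cell[:] = sorted(cell, key=lambda child: child.tag)
    return root


# Hypothetical input: these element names are illustrative only.
unsorted_xml = (
    "<illustration><cell>"
    "<IssueAge>45</IssueAge><Gender>Female</Gender><State>CT</State>"
    "</cell></illustration>"
)

root = sort_cell_subelements(ET.fromstring(unsorted_xml))
print([child.tag for child in root.find("cell")])
# prints ['Gender', 'IssueAge', 'State']
```

Because the sort is stable, a subelement that occurs more than once
would keep its original relative order, so this approach does not
conflict with the future flexibility discussed under option (2).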

Even if we had chosen the second option, we would still want an
xsl template to sort <cell> subelements so that input files can
be graphically compared.


