[lmi] Using libxml2 with compression [Was: Building shared zlib]


From: Greg Chicares
Subject: [lmi] Using libxml2 with compression [Was: Building shared zlib]
Date: Wed, 30 Aug 2017 13:03:42 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 2017-08-30 01:28, Greg Chicares wrote:
[...]
> There's one remaining problem, though: it doesn't work with lmi.
> I rebuilt lmi after building libxml2 with xz (and the example
> above proves that it works), but I get the same error message:
>   Unable to parse xml stream: Start tag expected, '<' not found
>   [file /opt/lmi/src/lmi/xml_lmi.cpp, line 127]
> with a compressed input file, no matter which algorithm I use
> to compress it. I tried this with copies of 'sample.ill' from
> the lmi repository, compressed with the same sort of commands
> used above:
> 
> -rw-r--r-- 1 greg greg 2460 Aug 29 23:59 gz.ill
> -rw-r--r-- 1 greg greg 2432 Aug 29 23:59 xz.ill
> 
> Any idea what's wrong? Do I need a later version of xmlwrapp?
> Would xmlwrapp need to be modified to use liblzma? I wouldn't
> think so [...]

Now I wonder whether the limitation is in libxml2. Observe:

/opt/lmi/bin[0]$wine ./lmi_wx_shared --ash_nazg --data_path=/opt/lmi/data

Here, I do 'Ctrl-F' and select an improbable set of columns. Inspecting
/opt/lmi/data/configurable_settings.xml , I see that selection.

/opt/lmi/bin[0]$pushd ../data                                           
/opt/lmi/data /opt/lmi/bin
/opt/lmi/data[0]$xz configurable_settings.xml                           
/opt/lmi/data[0]$mv configurable_settings.xml.xz configurable_settings.xml
/opt/lmi/data[0]$file configurable_settings.xml
configurable_settings.xml: XZ compressed data
/opt/lmi/data[0]$ls config*
configurable_settings.xml
/opt/lmi/data[0]$popd
/opt/lmi/bin
/opt/lmi/bin[0]$wine ./lmi_wx_shared --ash_nazg --data_path=/opt/lmi/data

Now, 'Ctrl-F' shows exactly the column selection I saved earlier.
This demonstrates that xmlwrapp and libxml2 successfully and
transparently read an XZ-compressed XML file.

How come a compressed XML configuration file can be read, but a
compressed XML input file cannot? I think I've found the answer.

A (compressed) XML configuration file is read by
  xml_lmi::dom_parser::dom_parser(std::string const& filename)
which calls
  DomParser(filename.c_str())
which is just xmlwrapp's tree_parser:
  typedef xml::tree_parser DomParser;
and according to this:
  https://github.com/vslavik/xmlwrapp/blob/master/src/libxml/tree_parser.cxx
this function:
  tree_parser::tree_parser(const char *name, bool allow_exceptions)
calls init(const char *name), which calls
  void tree_parser::init(const char *name, error_handler *on_error)
which calls
  xmlSAXParseFileWithData()

But OTOH for a (compressed) XML input file the call sequence is:
  xml_lmi::dom_parser::dom_parser(std::istream const& is)
  DomParser(s.c_str(), 1 + s.size())
  void tree_parser::init(const char *data, size_type size, error_handler *on_error)
  xmlParseDocument()
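
To illustrate the second path, here is a minimal sketch, assuming the
stream constructor first copies the whole stream into the std::string 's'
that is then passed as DomParser(s.c_str(), 1 + s.size()). The helper
name slurp() is hypothetical and is not taken from lmi or xmlwrapp; the
point is only that a copy like this hands the parser whatever bytes the
stream held, compressed or not.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper (not lmi's actual code): copy every byte of a
// stream into a string, unchanged. Nothing here inspects the data or
// decompresses it, so a compressed stream yields a compressed string.
std::string slurp(std::istream& is)
{
    return std::string
        (std::istreambuf_iterator<char>(is)
        ,std::istreambuf_iterator<char>()
        );
}
```

If the string is built this way, any decompression would have to happen
inside the in-memory parsing function itself.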

The libxml2 documentation:
  http://xmlsoft.org/html/libxml-parser.html
says:
  Function: xmlSAXParseFileWithData
  ...
  Automatic support for ZLIB/Compress compressed document is provided by default
but:
  Function: xmlParseDocument
  [no mention of compression]
Now, libxml2's 'parser.c' is 15817 lines long and I can't easily
understand it, so I can't definitively say that the documentation
is correct in not mentioning compression for xmlParseDocument(),
but that hypothesis would explain my findings above. And it kind
of makes sense: if we read a *file*, it is decompressed if needed,
but if we read a *string*, libxml2 assumes it's not compressed.
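
That hypothesis also fits the error message: gzip data begins with the
bytes 1F 8B, and xz data with FD 37 7A 58 5A 00, so a compressed buffer
handed to an in-memory parser cannot begin with '<'. As a rough sketch
(not lmi or libxml2 code; the names 'compression' and 'sniff' are made up
for illustration), a caller could check those leading bytes before
parsing a string:

```cpp
#include <string>

enum class compression {none, gzip, xz};

// Classify a buffer by its leading "magic" bytes.
// gzip streams begin with 1F 8B; xz streams with FD 37 7A 58 5A 00.
compression sniff(std::string const& s)
{
    // Explicit lengths are needed because the xz magic ends in a NUL byte.
    static std::string const gz_magic("\x1F\x8B", 2);
    static std::string const xz_magic("\xFD" "7zXZ\0", 6);
    if(0 == s.compare(0, gz_magic.size(), gz_magic)) return compression::gzip;
    if(0 == s.compare(0, xz_magic.size(), xz_magic)) return compression::xz;
    return compression::none;
}
```

A check like this would only diagnose the situation; it does not
decompress anything.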

However, there's no practical problem here, because the only files
we really want to compress are product data files, and those work:

/opt/lmi/bin[0]$pushd ../data
/opt/lmi/data[0]$for y in database funds policy rounding strata; do xz --stdout --threads=32 sample.$y >xz.$y; done
/opt/lmi/data[0]$ls xz.*
xz.database  xz.finds  xz.funds  xz.policy  xz.rounding  xz.strata
/opt/lmi/data[0]$popd
/opt/lmi/bin
/opt/lmi/bin[0]$wine ./lmi_wx_shared --ash_nazg --data_path=/opt/lmi/data

Now I can successfully run an illustration for the 'xz' product.


