Re: [h5md-user] H5MD for proteins

From: Konrad Hinsen
Subject: Re: [h5md-user] H5MD for proteins
Date: Tue, 10 Sep 2013 12:21:08 +0200

Olaf Lenz writes:

 > I would even go one step further: I think the idea of modules should be
 > part of the HDF5 specs. This would yield something like "XHDF5", the "X"
 > standing for "extensible", as in "XML". And indeed, such modules would
 > directly correspond to DTDs or schemas in XML. From my point of view,
 > this is exactly what HDF5 is missing. It would be perfect if there would

That issue has been discussed a few times on the HDF5 mailing list, but
with little success. Someone even proposed a concrete definition:


However, this was (justly) criticized by a member of the HDF team for
various deficiencies, such as the lack of a machine-verifiable schema

My impression is that most HDF5 users don't care, which means the HDF
Group doesn't care either, and no one else is big enough to get a
proposal accepted.

 > And even better, if you would define the correspondence between XML
 > and HDF5, one could even directly translate between XML and
 > HDF5.

That could be useful for certain applications, but for many the mismatch
between XML data types and HDF5 data types would be a problem. It's worth
trying, but it's not a small job.
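
To make the type mismatch concrete, here is a minimal sketch in Python
(standard library only). XML stores numbers as text, so the binary type
that HDF5 records for a dataset has to travel as a separate label. The
element name "dataset", the "dtype" attribute, and the small type table
are assumptions made up for this illustration; they are not part of any
XML or HDF5 specification.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical mapping from a type label to a struct format code.
FORMATS = {"int32": "i", "float32": "f", "float64": "d"}

def to_xml(name, values, dtype):
    """Serialize numbers as text; the binary type survives only as a label."""
    elem = ET.Element("dataset", name=name, dtype=dtype)
    elem.text = " ".join(repr(v) for v in values)
    return ET.tostring(elem, encoding="unicode")

def from_xml(text):
    """Parse the text back and repack it in the layout named by dtype."""
    elem = ET.fromstring(text)
    dtype = elem.get("dtype")
    convert = int if dtype.startswith("int") else float
    values = [convert(tok) for tok in (elem.text or "").split()]
    packed = struct.pack("<%d%s" % (len(values), FORMATS[dtype]), *values)
    return dtype, values, packed

xml_text = to_xml("mass", [1.0, 12.011, 15.999], "float32")
dtype, values, packed = from_xml(xml_text)
print(xml_text)
print(dtype, values, len(packed), "bytes")
```

Without the "dtype" label, the reader could not tell whether the same
text should become 4-byte or 8-byte floats. Compound types, enumerations,
and references would need a considerably richer mapping than this.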

 > This would basically make HDF5 a binary XML format, something
 > that a number of people have asked for.  It would be really nice
 > for H5MD, too, as an XML file is significantly simpler to produce
 > than an HDF5 file, so if you have your handwritten code, you just
 > need to output an XML file.

I am not sure I want my 10 GB HDF5 trajectories to pass through an XML
phase during construction.

 > However, this is probably too big for us. ;-) Still, it probably helps to
 > think of modules as DTDs for HDF5. The H5MD specs would then represent a
 > "particle trajectory" module.

Yes, that's a good point of view. Something else we can do is provide
a validation program for H5MD data.

BTW, I chose the same approach for Mosaic. In HDF5, every Mosaic data
item has a "type stamp" (an attribute) that identifies it with its
Mosaic type and the Mosaic version number. The Mosaic library can
validate such data items for conformance.
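
As a rough sketch of the type-stamp idea (not Mosaic's actual
implementation), a validator could check the stamp before looking at
anything else. The attribute names "example_type" and "example_version"
and the type name "particle_trajectory" are hypothetical. The function
only needs a mapping, so a plain dict stands in here for the attributes
of an HDF5 object.

```python
TYPE_KEY = "example_type"        # hypothetical attribute name
VERSION_KEY = "example_version"  # hypothetical attribute name

def check_stamp(attrs, expected_type, supported_versions):
    """Return a list of problems; an empty list means the stamp is acceptable."""
    problems = []
    if TYPE_KEY not in attrs:
        problems.append("missing type stamp")
    elif attrs[TYPE_KEY] != expected_type:
        problems.append(
            "expected type %r, found %r" % (expected_type, attrs[TYPE_KEY])
        )
    if VERSION_KEY not in attrs:
        problems.append("missing version stamp")
    elif attrs[VERSION_KEY] not in supported_versions:
        problems.append("unsupported version %r" % (attrs[VERSION_KEY],))
    return problems

# A dict stands in for the attributes attached to an HDF5 group or dataset.
good = {TYPE_KEY: "particle_trajectory", VERSION_KEY: "1.0"}
bad = {TYPE_KEY: "particle_trajectory"}

print(check_stamp(good, "particle_trajectory", {"1.0"}))
print(check_stamp(bad, "particle_trajectory", {"1.0"}))
```

A validation program for H5MD could start the same way and then go on
to check the required groups and datasets once the stamp is accepted.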

 > And, in contrast to what Konrad claims, I think that even this basic
 > module has its value. First of all, it serves as a basis for further

Me too :-) It's obviously useful for structuring (and sharing)
programs that work on trajectory data. What I question is the
scientific utility because the semantic information about the data in
the file is so weak. It's not that there is no semantic information, but
even 90% of the semantic information you need is 10% short of something

Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: research AT khinsen DOT fastmail DOT net
ORCID: http://orcid.org/0000-0003-0330-9428
Twitter: @khinsen
