Re: [h5md-user] H5MD for proteins

From: Olaf Lenz
Subject: Re: [h5md-user] H5MD for proteins
Date: Tue, 10 Sep 2013 09:31:26 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130811 Thunderbird/17.0.8


On 09/10/2013 08:12 AM, Konrad Hinsen wrote:
> Pierre de Buyl writes: Indeed. But one question one needs to ask is:
> what's the goal of "bare" H5MD?  What can one do with a trajectory
> file if all one knows is that it's a valid H5MD file?

I would even go one step further: I think the idea of modules should be
part of the HDF5 specs. This would yield something like "XHDF5", the "X"
standing for "extensible", as in "XML". And indeed, such modules would
directly correspond to DTDs or schemas in XML. From my point of view,
this is exactly what HDF5 is missing. It would be perfect if there were
a formal language to describe the syntax of a module, i.e. the allowed
structure of an HDF5 file, as this would make it possible to validate
HDF5 files, as can be done with XML. Even better, if one defined the
correspondence between XML and HDF5, one could translate directly
between the two. This would basically make HDF5 a binary XML
format, something that a number of people have asked for.
It would be really nice for H5MD, too, as an XML file is significantly
simpler to produce than an HDF5 file, so if you have your handwritten
code, you just need to output an XML file.
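
To make the DTD analogy a bit more concrete, here is a rough sketch of
what validating a file against a module could look like. Everything in
it is an assumption made for illustration: nested dicts stand in for
HDF5 groups, lists stand in for datasets, and the schema format, the
path names and the functions `lookup` and `validate` are invented here.
None of this is part of HDF5 or of the H5MD specs.

```python
# Hypothetical illustration: nested dicts stand in for HDF5 groups,
# lists stand in for datasets. The schema format is made up for this sketch.

GROUP, DATASET = "group", "dataset"

# A toy "particle trajectory" module: required paths and their node kind.
# The path names are illustrative only, not the actual H5MD layout.
TRAJECTORY_MODULE = {
    "trajectory": GROUP,
    "trajectory/position": GROUP,
    "trajectory/position/value": DATASET,
    "trajectory/position/time": DATASET,
}


def lookup(tree, path):
    """Return the node at a slash-separated path, or None if it is missing."""
    node = tree
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def validate(tree, module):
    """Return a list of human-readable violations (empty if the tree is valid)."""
    errors = []
    for path, kind in module.items():
        node = lookup(tree, path)
        if node is None:
            errors.append(f"missing {kind}: {path}")
        elif kind == GROUP and not isinstance(node, dict):
            errors.append(f"expected group at {path}")
        elif kind == DATASET and isinstance(node, dict):
            errors.append(f"expected dataset at {path}")
    return errors
```

A real module language would also have to cover data types, shapes,
attributes and optional elements; the sketch only checks that required
groups and datasets are present.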

However, this is probably too big for us. ;-) Still, it probably helps to
think of modules as DTDs for HDF5. The H5MD specs would then represent a
"particle trajectory" module.

And, in contrast to what Konrad claims, I think that even this basic
module has its value. First of all, it serves as a basis for further
modules. It really gains most of its value in conjunction with other
modules, but even the base module carries quite some semantics. When you
have a file that is a valid H5MD file, you know the following things:

* It contains a particle trajectory (i.e. coordinates over time).
* It defines a generic "type" for data that may be time-dependent, or not.
* You have a defined system geometry and boundary conditions (which may
vary over time).
* Using the "species", you can distinguish between different particle
groups (which also may vary over time).
* Using the "id", you can change the ordering of particles within the
file, which is better suited for parallel IO.
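
As a small illustration of the last point, a reader could hide the
on-disk ordering by sorting each frame by particle id. This is only a
sketch in plain Python: the function name `sort_frame_by_id` and the
representation of a frame as two parallel lists are assumptions made
here, not something the H5MD specs define.

```python
def sort_frame_by_id(ids, positions):
    """Return the positions of one frame, reordered by ascending particle id.

    ids:       list of particle ids, in the order they were written
    positions: list of coordinates, positions[i] belongs to ids[i]
    """
    if len(ids) != len(positions):
        raise ValueError("ids and positions must have the same length")
    # Indices of the stored entries, arranged so that the ids are ascending.
    order = sorted(range(len(ids)), key=lambda i: ids[i])
    return [positions[i] for i in order]
```

With this, a frame written as ids `[2, 0, 1]` is handed to the caller in
the order 0, 1, 2, whatever order the writing processes used.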

What can you do with it?
* Provide a library function to unify reading of such a trajectory, so
that the reader does not need to know how exactly the trajectory is
stored (with varying id, varying particle numbers, etc.).
* Store states of such a system (further information may be required
and would have to be provided by some other means).
* Compute various observables, like MSD or RDF.
* Provide limited visualization. This is really rather limited, as the
base format does not provide any more information on the particles, no
radii, no colors, no bonding information.
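
To illustrate the observables point: an MSD needs nothing beyond the
coordinates over time, so the base module is sufficient for it. The
following is a minimal sketch under stated assumptions: the name
`mean_squared_displacement` is made up here, frames are plain lists of
coordinate tuples with the same particle order in every frame (e.g.
after sorting by id), coordinates are assumed to be unwrapped (no
periodic boundary handling), and only the first frame is used as the
time origin.

```python
def mean_squared_displacement(frames):
    """Return the MSD relative to the first frame, one value per frame.

    frames: list of frames; each frame is a list of per-particle
            coordinate tuples, in the same particle order in every frame.
    """
    reference = frames[0]
    msd = []
    for frame in frames:
        total = 0.0
        for r0, r in zip(reference, frame):
            # Squared distance travelled by this particle since frame 0.
            total += sum((a - b) ** 2 for a, b in zip(r, r0))
        msd.append(total / len(reference))
    return msd
```

An RDF would additionally need the box geometry and boundary
conditions, which the base module also provides.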


Dr. rer. nat. Olaf Lenz
Institut für Computerphysik, Allmandring 3, D-70569 Stuttgart
Phone: +49-711-685-63607

