[h5md-user] The Box Story

From: Konrad Hinsen
Subject: [h5md-user] The Box Story
Date: Thu, 26 Sep 2013 09:19:36 +0200

I'll start a list of pros and cons of the different box storage
arrangements, in order to move this issue forward.

Before discussing *where* to put box information, I think it's
important to agree on *what* exact information should be stored.

Proposition 1: Store a single time series with box information for the
whole trajectory. It must cover at least those steps for which any
position information is stored. For random access to a given step, the
matching box information must be looked up by binary search over the
box step numbers. For sequential traversal of the trajectory, more
efficient methods are available.

 + Simplicity. Easy to understand, easy to check.

 + Efficient storage: no duplication of box data.

 - Random-access box retrieval is less efficient: each lookup requires
   a search instead of a direct index.

 - Parallel writing (in the sense of parallel I/O) of independent
   position time series requires coordination between processes.
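
To make the retrieval cost of Proposition 1 concrete, here is a minimal
sketch of both access patterns. It assumes the box step numbers are
sorted and held in memory as a plain Python list; the function names are
made up for illustration and are not part of any H5MD tool.

```python
from bisect import bisect_left


def box_index_for_step(box_steps, step):
    """Random access: binary search for `step` in the sorted box steps."""
    i = bisect_left(box_steps, step)
    if i == len(box_steps) or box_steps[i] != step:
        # Proposition 1 requires the box series to cover every step
        # that has positions, so a miss means the file is inconsistent.
        raise KeyError(f"no box information stored for step {step}")
    return i


def boxes_for_steps(box_steps, box_values, position_steps):
    """Sequential traversal: advance one cursor instead of searching."""
    i = 0
    for step in position_steps:
        # Both step lists are sorted, so the cursor only moves forward.
        while i < len(box_steps) and box_steps[i] < step:
            i += 1
        if i == len(box_steps) or box_steps[i] != step:
            raise KeyError(f"no box information stored for step {step}")
        yield box_values[i]
```

The random-access lookup costs O(log n) comparisons per step, while the
sequential version touches each box entry at most once over the whole
traversal.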

Proposition 2: With every position time series, store a box time
series at exactly the same step numbers. If multiple such box time
series are identical, links can be used to avoid duplicating the data.

 + Efficient random read access to positions with matching box information.

 - Efficient writing (without data duplication) requires some effort
   and careful thought.
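
As an illustration of the link idea in Proposition 2, the following
sketch uses h5py hard links so that two particle groups share one box
time series. The group and dataset names ("particles", "solute",
"solvent", "box", "step", "value") are assumptions chosen for this
example, not a proposal for the actual layout, and the file lives in
memory only.

```python
import h5py
import numpy as np


def write_linked_boxes(f, steps, edges, group_names):
    """Store the box series once and hard-link it into the other groups."""
    particles = f.require_group("particles")
    first_box = None
    for name in group_names:
        grp = particles.require_group(name)
        if first_box is None:
            # The first group holds the actual data.
            first_box = grp.create_group("box")
            first_box.create_dataset("step", data=np.asarray(steps))
            first_box.create_dataset("value", data=np.asarray(edges))
        else:
            # Assigning an existing group creates a hard link, not a copy.
            grp["box"] = first_box
    return first_box


def box_at_index(f, group_name, i):
    """Direct access: index i of the positions matches index i of the box."""
    return f["particles"][group_name]["box"]["value"][i]


# In-memory file (core driver, no backing store), so nothing hits disk.
f = h5py.File("linked_boxes_sketch.h5", "w", driver="core", backing_store=False)
write_linked_boxes(
    f,
    steps=[0, 10, 20],
    edges=[[10.0, 10.0, 10.0], [10.5, 10.5, 10.5], [11.0, 11.0, 11.0]],
    group_names=["solute", "solvent"],
)
```

The "effort and careful thought" shows up in deciding, at write time,
whether two position series really share the same step numbers; the
sketch simply assumes that they do.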

I don't think this list is complete, but it is what comes to mind
immediately. As I said before, I don't have a clear preference.

Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: research AT khinsen DOT fastmail DOT net
ORCID: http://orcid.org/0000-0003-0330-9428
Twitter: @khinsen
