Re: [h5md-user] Species Data Type


From: Peter Colberg
Subject: Re: [h5md-user] Species Data Type
Date: Tue, 6 Aug 2013 14:47:14 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

Hi all,

After Nicolas' enquiry on identifying particle species using strings,
I gave HDF5 enumerated types a try myself. It turns out that enums are
quite useful, which makes me wonder whether we should recommend them
explicitly in the H5MD specification (which already allows enums
implicitly, since they are of “integer kind”).

An HDF5 enumerated type is derived from a given integer type and
defines a set of strings (names) along with a set of corresponding
integer values. The key to enums is that the HDF5 library
transparently translates between different sets of integer values
that represent a common set of names, matching members by name.

As an example, here is a problem I had a few months ago. The task was
to use the output trajectory sample of one program as the input to a
second program. However, the first program produced two particle
subgroups with species {N = 0, C = 1} and {A = 2, B = 3}, while the
second program expected species {N = 0, C = 1} and {A = 0, B = 1}.
This required manual translation of the species between the programs.
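
The manual translation described above amounts to remapping integers
via the species names. Here is a minimal sketch in plain Python, not
using HDF5 at all. The table names FILE_SPECIES and MEMORY_SPECIES and
the helper translate() are hypothetical, chosen only for illustration;
the values are those of the {A, B} subgroup in the example.

```python
# Hypothetical species tables, with values taken from the example above.
FILE_SPECIES = {"A": 2, "B": 3}    # values written by the first program
MEMORY_SPECIES = {"A": 0, "B": 1}  # values expected by the second program


def translate(values, source, target):
    """Map integers from the source table to the target table via names."""
    # Invert the source table: integer value -> species name.
    names = {value: name for name, value in source.items()}
    # Look up each value's name, then that name's value in the target table.
    return [target[names[v]] for v in values]


species = [2, 3, 3, 2]
converted = translate(species, FILE_SPECIES, MEMORY_SPECIES)
print(converted)
```

Every pair of programs that disagree on the integer values needs such
a table pair, kept in sync by hand.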

With an enumerated type, this issue does not occur. Here is an example
of writing a species dataset, in pseudo-code. It uses the species
representation {A = 2, B = 3} both in memory and in the file.

  -- In the file, the species dataset has an 8-bit integer type.
  local filetype = hdf5.create_enum_type("native_uint8")
  filetype:enum_insert("A", 2)
  filetype:enum_insert("B", 3)

  local dset = group:create_dataset("value", filetype, filespace)

  -- In memory, the species array has a 32-bit integer type.
  local memtype = hdf5.create_enum_type("native_int32")
  memtype:enum_insert("A", 2)
  memtype:enum_insert("B", 3)

  -- The species array contains 32-bit integers in {2, 3}.
  dset:write(memtype, memspace, filespace, nil, species)

A different program reads the species dataset, using the species
representation {A = 0, B = 1} in memory. HDF5 translates between
the different representations in the file and in memory.

  local dset = group:open_dataset("value")

  local memtype = hdf5.create_enum_type("native_uint32")
  memtype:enum_insert("A", 0)
  memtype:enum_insert("B", 1)

  dset:read(memtype, memspace, filespace, nil, species)
  -- The species array contains 32-bit integers in {0, 1}.

Peter


