Re: [h5md-user] units module


From: Felix Höfling
Subject: Re: [h5md-user] units module
Date: Mon, 04 Nov 2013 10:32:36 +0100
User-agent: Opera Mail/12.15 (Linux)

On 01.11.2013 at 16:13, Peter Colberg
<address@hidden> wrote:

Hi Felix, hi all,

On Thu, Oct 31, 2013 at 05:17:18PM +0100, Felix Höfling wrote:
I made an effort to write down a specification for the units module
to make progress. I took up Pierre's suggestion and added a list of
units inspired by Mosaic and udunits2.

Thank you for working on the units module!

For the encoding, can we just go with UTF8, instead of both ASCII and UTF8?

The issue with encodings is that HDF5 does not support implicit
datatype conversion between ASCII and UTF8. So the reader needs to
specify the correct encoding when reading a "unit" attribute, which
is addressed in commit c065ace by the module attribute "encoding".

Since UTF8 is a superset of ASCII (characters 0-127), the only thing a
C or Fortran writer has to do to use UTF8 encoding is call H5Tset_cset
on the datatype, e.g.,

  /* variable-length C string type, tagged as UTF8 */
  hid_t dtype = H5Tcopy(H5T_C_S1);
  H5Tset_size(dtype, H5T_VARIABLE);
  H5Tset_cset(dtype, H5T_CSET_UTF8);

In Python the string needs to be in UTF8 encoding:

  dataset.attrs["unit"] = u"nm"
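
The superset claim above can be checked without HDF5 at all. The following
is only a sketch in plain Python 3; the unit strings "nm" and "µm" and the
helper name is_ascii_bytes are illustrative assumptions, not part of the
proposed specification:

```python
def is_ascii_bytes(raw: bytes) -> bool:
    """Hypothetical helper: True if every byte is in the ASCII range 0-127."""
    return all(b < 0x80 for b in raw)


ascii_unit = "nm"        # example unit using ASCII characters only
utf8_unit = "\u00b5m"    # example unit "µm" with a non-ASCII micro sign

# For characters 0-127 the ASCII and UTF8 byte sequences are identical,
# so an ASCII-only writer already produces valid UTF8.
same_bytes = ascii_unit.encode("ascii") == ascii_unit.encode("utf-8")

# A non-ASCII character is encoded as several bytes, all of them >= 0x80.
encoded = utf8_unit.encode("utf-8")

print(same_bytes, len(encoded), is_ascii_bytes(encoded))
```

So a string written with the UTF8 character set costs nothing extra as long
as the unit itself contains only ASCII characters.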

Peter


Hi Peter,

The field was mainly intended for the reader: a minimal reader may want to
process only ASCII strings and thus ignore the units if they are not in
ASCII. I thought that making a promise at the beginning would simplify
things, compared with checking the encoding of each string as it is read.

Writing UTF8 is easy, as you pointed out, but what about reading? I've never
used it in practice. Can a reader store the raw string in a char* and pass
it to the udunits2 library? If this works with either encoding, we may of
course drop the "encoding" field.
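
As a starting point for such a test, here is a rough sketch of the reader
side in plain Python 3. It relies on one fact: UTF8 never produces a zero
byte for any character other than U+0000, so a UTF8 unit string fits in a
NUL-terminated char* just as an ASCII one does. The function read_unit and
its ascii_only flag are hypothetical names, and the raw bytes stand in for
whatever the HDF5 library returns. As far as I know, udunits2's ut_parse
takes the encoding as a separate argument, so the raw bytes could probably
be passed through unchanged, but that part is untested here.

```python
def read_unit(raw: bytes, ascii_only: bool = False):
    """Hypothetical helper: turn the raw bytes of a unit string into text.

    Returns None when ascii_only is set and the string is not pure ASCII,
    which mimics a minimal reader that ignores such units.
    """
    # Treat the buffer like a C string: keep everything before the first NUL.
    raw = raw.split(b"\x00", 1)[0]
    if ascii_only and any(b >= 0x80 for b in raw):
        return None
    # Pure ASCII bytes are valid UTF8, so one decoder handles both cases.
    return raw.decode("utf-8")


padded = b"nm\x00\x00"               # example: NUL-padded ASCII unit
micro = "\u00b5m".encode("utf-8")    # example: "µm" encoded as UTF8

print(read_unit(padded))
print(read_unit(micro))
print(read_unit(micro, ascii_only=True))
```

With a per-string check like this, the minimal reader would not depend on a
file-wide promise, at the cost of scanning each unit string once.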

Before we add something to the specification we should test it somehow.
What about providing a code snippet in the implementation part of how to
read UTF8 unit strings and how to interact with, e.g., udunits2?

Best regards,

Felix


