octave-bug-tracker

[Octave-bug-tracker] [bug #58368] UTF16 and UTF32 characters in MAT files


From: Markus Mützel
Subject: [Octave-bug-tracker] [bug #58368] UTF16 and UTF32 characters in MAT files
Date: Sun, 17 May 2020 07:17:08 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0

Follow-up Comment #5, bug #58368 (project octave):

Even worse, the behavior of Matlab seems to be undefined when reading
UTF-8 strings with non-ASCII characters.

The attached patch converts character strings to UTF-16 (the encoding Matlab
uses for strings) before saving them to .mat files (-v6 or -v7).
It also adds some tests.
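
To illustrate the kind of conversion involved, here is a minimal Python sketch
(not the code from the patch, which modifies Octave itself). The helper name
`utf8_to_utf16_units` is hypothetical. It shows that the number of UTF-16 code
units generally differs from the number of UTF-8 bytes:

```python
def utf8_to_utf16_units(data: bytes) -> list[int]:
    """Decode UTF-8 bytes and return the equivalent UTF-16 code units."""
    text = data.decode("utf-8")
    # Encode without a byte order mark; each code unit is 2 bytes.
    encoded = text.encode("utf-16-le")
    return [
        int.from_bytes(encoded[i:i + 2], "little")
        for i in range(0, len(encoded), 2)
    ]


# 1-, 2-, 3- and 4-byte UTF-8 sequences in one string.
utf8_bytes = "a\u00e4\u20ac\U0001F600".encode("utf-8")
units = utf8_to_utf16_units(utf8_bytes)

print(len(utf8_bytes))            # number of UTF-8 bytes
print([hex(u) for u in units])    # UTF-16 code units
```

The last character lies outside the Basic Multilingual Plane, so it becomes a
surrogate pair (two code units) in UTF-16.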

Again, the conversion cannot be safely performed in general for character
matrices and ND-arrays, so they are still stored as UTF-8 (the encoding Octave
uses for strings).
But character matrices and ND-arrays should be avoided anyway imho, at least
as an exchange format.
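
One possible reason the conversion is unsafe for matrices (my reading, not
spelled out in the patch) is that rows with the same number of UTF-8 bytes can
end up with different numbers of UTF-16 code units, so the result is no longer
rectangular. A small Python sketch, with the hypothetical helper `utf16_len`:

```python
def utf16_len(row: bytes) -> int:
    """Number of UTF-16 code units needed for a UTF-8 encoded row."""
    return len(row.decode("utf-8").encode("utf-16-le")) // 2


# Two rows of a hypothetical 2x2 character matrix (2 bytes per row in UTF-8).
rows = ["ab".encode("utf-8"), "\u00e4".encode("utf-8")]

byte_lengths = [len(r) for r in rows]
unit_lengths = [utf16_len(r) for r in rows]

print(byte_lengths)  # equal widths in UTF-8
print(unit_lengths)  # unequal widths after conversion to UTF-16
```

Padding the shorter rows would change the content, which is presumably why the
matrix case is left as UTF-8.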

With that change, I can successfully load the .mat file saved by Octave in
Matlab R2020a. No random trailing characters.

(file #49101)
    _______________________________________________________

Additional Item Attachment:

File name: bug58368_utf_mat_v3.patch      Size:13 KB
    <https://savannah.gnu.org/file/bug58368_utf_mat_v3.patch?file_id=49101>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58368>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



