octave-maintainers

Re: Improving strread / textread / textscan


From: Ben Abbott
Subject: Re: Improving strread / textread / textscan
Date: Sun, 23 Oct 2011 20:37:45 -0400

On Oct 23, 2011, at 6:42 PM, Ben Abbott wrote:

> Ok. Let's start with writing tests for ML. I'll start by extracting Octave's 
> tests and confirm they work on ML.
> 
> Ben

I've copied the tests from textscan and modified them to run on ML. To do that 
I wrote a simple oct_assert function to handle the asserts. Of the 14 asserts, 
two failed.

Test #1: Passed.
Test #2: Passed.
Test #3: Passed.
Test #4: Passed.
Test #5: Passed.
Test #6: Passed.
Test #7: Passed.
Test #8: Passed.
Test #9: Failed.
OBSEVED:
          16         241           3

EXPECTED:
          16         241           3           0

Test #10: Passed.
Test #11: Passed.
Test #12: Failed.
OBSEVED:
           2

EXPECTED:
           2
           4
           0

Test #13: Passed.
Test #14: Passed.

The script with the tests and the oct_assert function are attached.
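
The attached oct_assert.m is not reproduced in this archive. As a rough 
illustration only, a helper of this kind could look like the sketch below. The 
name oct_assert_sketch, its argument list, and the use of isequal are all 
assumptions, not the contents of the attachment.

```octave
% Hypothetical stand-in for oct_assert; the attached file may differ.
% Compares observed and expected values and prints a short report.
function passed = oct_assert_sketch (n, observed, expected)
  % isequal compares size and values (numeric class is ignored).
  passed = isequal (observed, expected);
  if (passed)
    fprintf ('Test #%d: Passed.\n', n);
  else
    fprintf ('Test #%d: Failed.\n', n);
    disp ('OBSERVED:');
    disp (observed);
    disp ('EXPECTED:');
    disp (expected);
  end
end
```

Calling oct_assert_sketch (9, [16 241 3], [16 241 3 0]) would report a failure 
in roughly the same shape as the listing above, because the two vectors differ 
in length.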

Isolating the 9th test

str = sprintf ('Km:25 = hhhZ\r\n');
fmt = 'Km:%d = hhh%1sjjj miles%dhour';
a = textscan (str, fmt, 'delimiter', ' ');

ML returns ...

a =     [25]    {1x1 cell}    [0x1 int32]

This is not what I expected. The ML docs (seem to?) indicate that the 
"emptyvalue" value should be returned. The default is NaN. Reading the link 
below clears up the confusion: int32 doesn't support NaN, so the field is 
returned as an empty array.

        
http://stackoverflow.com/questions/6657963/textscan-in-matlab-read-null-value-as-nan
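
The point about int32 and NaN can be checked directly. This short sketch only 
illustrates the type conversion; it is not how textscan fills empty fields 
internally.

```octave
% Integer classes have no NaN representation; converting NaN yields 0.
x = int32 (NaN);
disp (x)            % 0
disp (class (x))    % int32

% Floating-point values keep NaN.
y = double (NaN);
disp (isnan (y))    % 1
```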

I wondered what ML would do if the empty value wasn't the last one. ML and 
Octave agree in this case.

cell2mat (textscan (sprintf ('a1\na2\na3\na\na4'), 'a%d'))
ans =

  1
  2
  3
  0
  4

For the simplified version of test #9, Octave throws an error.

error: strread: A(I): index out of bounds; value 4 out of bound 3
error: called from:
error:   /Users/bpabbott/Development/mercurial/default/scripts/io/strread.m at line 456, column 26
error:   /Users/bpabbott/Development/mercurial/default/scripts/io/textscan.m at line 221, column 11
error:   /Users/bpabbott/Development/Octave_Toolbox/textscan/test9.m at line 13, column 3

For test 12, I simplified to ...

a1 = cell2mat (textscan (sprintf ('Text1Text2Text\nText3Text4Text\nText57TextText'), 'Text%*dText%dText'))
a2 = cell2mat (textscan (sprintf ('Text1Text2Text\nText3TextText\nText57Text63Text'), 'Text%*dText%dText'))
a3 = cell2mat (textscan (sprintf ('Text1Text2Text\nText3TextText\nText57Text63Text'), 'Text%dText%dText'))

Matlab returns ...

a1 =

           2
           4

a2 =

           2

Error using cat
CAT arguments dimensions are not consistent.

Octave returns ...

a1 =

  2
  4
  0

a2 =

   2
   0
  63

a3 =

   1   2
   3   0
  57  63

I'm having trouble understanding just what ML is doing, so Octave's behavior 
looks more consistent to me. Other thoughts / opinions?

I think we should do what ML didn't and document this behavior. I noticed the 
texinfo doesn't mention "emptyvalue". I can add that as well.

I'll also add an expected failure for the simplified version of test #9 that 
threw an error.
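
As a sketch of what that could look like, Octave's test blocks support 
%!xtest for tests that are expected to fail. The block below is hypothetical: 
the asserted value is an assumption taken from the first cell of the ML output 
above, and the final changeset may check something different.

```octave
%% Hypothetical expected-failure block for textscan.m.
%% The asserted value (25) is assumed from the ML result shown earlier.
%!xtest
%! str = sprintf ('Km:25 = hhhZ\r\n');
%! fmt = 'Km:%d = hhh%1sjjj miles%dhour';
%! a = textscan (str, fmt, 'delimiter', ' ');
%! assert (double (a{1}), 25);
```

Until strread handles the missing trailing field, the test suite would report 
this block as a known failure rather than as a regression.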

I'll prepare a changeset.

Ben

Attachment: test_oct_textscan.m
Description: Binary data

Attachment: oct_assert.m
Description: Binary data

