From: Rik
Subject: [Octave-bug-tracker] [bug #53685] textscan() with Delimiter specified always treats multiple delimiters as one
Date: Wed, 18 Apr 2018 13:25:14 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0

Update of bug #53685 (project octave):

                  Status:                    None => Need Info              

    _______________________________________________________

Follow-up Comment #1:

Can you run your test with Matlab and see if it fails there?  textscan() is a
very complicated function, and unfortunately Octave has to reproduce Matlab's
behavior exactly because many users have gotten used to its idiosyncrasies.

As a start, see the documentation at
http://www.mathworks.com/help/matlab/ref/textscan.html.

For issue #1, "spaces in the string fields act as delimiters, creating extra
fields", Matlab says


Within each row of data, the default field delimiter is white-space.
White-space can be any combination of space (' '), backspace ('\b'), or tab
('\t') characters. If you do not specify a delimiter, then:

    the delimiter characters are the same as the white-space characters. The
    default white-space characters are ' ', '\b', and '\t'. Use the
    'Whitespace' name-value pair argument to specify alternate white-space
    characters.

    textscan interprets repeated white-space characters as a single
    delimiter.


It is essentially as if they set the Delimiter option to whitespace and
MultipleDelimsAsOne to true.
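
A minimal sketch of that default case, assuming nothing beyond the textscan()
call itself (the variable names are only for illustration).  With no
"delimiter" option the comma should be treated as ordinary text, and a run of
spaces should count as one separator:

```octave
% No "delimiter" option: white-space is what separates the fields.
str = "A,B C,D";
c = textscan (str, "%s");
% The commas stay inside the strings; the single space splits the input.
disp (c{1});

% Several spaces in a row should still count as one delimiter.
d = textscan ("A   B", "%s");
printf ("%d fields\n", numel (d{1}));
```

If the first call returns the two strings "A,B" and "C,D", that matches the
quoted Matlab description and explains the extra fields from issue #1.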

So, if your delimiter is actually a comma then you will need to say so.  For
example,


textscan ("A,B C,D", "%s", "delimiter", ',')
ans =
{
  [1,1] =
  {
    [1,1] = A
    [2,1] = B C
    [3,1] = D
  }

}


which works to preserve the space in the string.

For the last issue, I find that MultipleDelimsAsOne works.  First, without the
option:


textscan ("A,,,B C,D", "%s", "delimiter", ',')
ans =
{
  [1,1] =
  {
    [1,1] = A
    [2,1] = 
    [3,1] = 
    [4,1] = B C
    [5,1] = D
  }

}


As expected, two empty strings were created where the extra commas were.  Now
switching on the MultipleDelimsAsOne option:


textscan ("A,,,B C,D", "%s", "delimiter", ',', "multipledelimsasone", 1)
ans =
{
  [1,1] =
  {
    [1,1] = A
    [2,1] = B C
    [3,1] = D
  }

}
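
The two calls above can be put side by side in a short script, which may be
easier to paste into both programs than the full output listings.  This is
only a sketch; the variable names keep_empty and collapsed are made up for
the illustration:

```octave
% Same input, with and without "multipledelimsasone".
str = "A,,,B C,D";
keep_empty = textscan (str, "%s", "delimiter", ",");
collapsed  = textscan (str, "%s", "delimiter", ",", "multipledelimsasone", 1);

% Count the fields: the empty strings only show up in the first call.
printf ("without option: %d fields\n", numel (keep_empty{1}));
printf ("with option:    %d fields\n", numel (collapsed{1}));
```

Going by the listings above, this should report 5 fields without the option
and 3 with it.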


I don't know your file exactly, but let's say you're trying to read a number,
string, number, string.


textscan ("1.1,A,2.2,B C", "%f %s %f %s", "delimiter", ',')
{
  [1,1] =  1.1000
  [1,2] =
  {
    [1,1] = A
  }

  [1,3] =  2.2000
  [1,4] =
  {
    [1,1] = B C
  }
}


That seems right.


If one of the numbers is missing, that seems to work too.


textscan ("1.1,A,,B C", "%f %s %f %s", "delimiter", ',')
ans =
{
  [1,1] =  1.1000
  [1,2] =
  {
    [1,1] = A
  }

  [1,3] =  NaN
  [1,4] =
  {
    [1,1] = B C
  }

}


Is there a one-line example that shows how Octave textscan is not behaving
identically to the Matlab textscan function?

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?53685>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



