From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #54661] textscan() continues from next line if line ends with delimiter
Date: Sat, 15 Sep 2018 20:57:52 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0

Follow-up Comment #4, bug #54661 (project octave):

Attached is a patch, but it is not 100% complete.  Making things fully correct
would require much more modification and testing than I can do at the moment.

With regard to this particular bug, the fundamental issue is that the scanning
process for an individual value also skips EOL.  Hence, a line like 9,10,,
followed by the line 13,14,15,16 is processed as

  scan the 9      -->  ,10,,
  skip delim      -->  10,,
  scan the 10     -->  ,,
  skip delim      -->  ,
  scan the NaN    -->  ,
  skip delim      -->  (nothing left on this line)
  scan the ??? 13 -->  ,14,15,16

So, I believe I have addressed that with the patch.
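
As a rough illustration of the trace above, here is a toy model in C++.  It is
not the Octave source: the function toy_scan and its parameters are made up
for this sketch, and it only handles plain numbers separated by commas.  With
skip_eol_with_delim set to true, the per-field delimiter skip also swallows a
following newline, which is the behavior described above.  With it set to
false, the rest of the line is skipped once per row instead.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: NOT Octave's textscan implementation.
// Scans `nfields` comma-separated numbers per row from `text`.
// Empty fields become NaN.  Tokens must be numeric (std::stod would
// throw otherwise); this is an illustration, not a robust parser.
std::vector<std::vector<double>>
toy_scan (const std::string& text, std::size_t nfields,
          bool skip_eol_with_delim)
{
  std::vector<std::vector<double>> rows;
  std::size_t pos = 0;

  while (pos < text.size ())
    {
      std::vector<double> row;

      for (std::size_t f = 0; f < nfields; f++)
        {
          // Scan one value: everything up to the next ',' or '\n'.
          std::size_t start = pos;
          while (pos < text.size () && text[pos] != ',' && text[pos] != '\n')
            pos++;
          std::string tok = text.substr (start, pos - start);
          row.push_back (tok.empty () ? std::nan ("") : std::stod (tok));

          // Skip one delimiter.
          if (pos < text.size () && text[pos] == ',')
            pos++;

          // Problematic variant: EOL is skipped inside the field loop.
          if (skip_eol_with_delim && pos < text.size () && text[pos] == '\n')
            pos++;
        }

      // Alternative: move past the rest of the line once per row.
      if (! skip_eol_with_delim)
        {
          while (pos < text.size () && text[pos] != '\n')
            pos++;
          if (pos < text.size ())
            pos++;
        }

      rows.push_back (row);
    }

  return rows;
}
```

Calling toy_scan ("9,10,,\n13,14,15,16\n", 4, true) yields the rows
[9, 10, NaN, 13] and [14, 15, 16, NaN], i.e., the 13 from the next line is
pulled into the first row.  With the last argument set to false, the rows are
[9, 10, NaN, NaN] and [13, 14, 15, 16].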

*However*, the changeset causes an existing test to fail, i.e.,


octave:7> test <path>/libinterp/corefcn/file-io.cc
***** test
 str = "12.234e+2,34, \n12345.789-9876j,78\n,10|3";
 c = textscan (str, "%10.2f %f", "delimiter", ",", "collectOutput", 1,
                    "expChars", "e|");
 assert (c, {[1223, 34; 12345.79-9876j, 78; NaN, 10000]}, 1e-6);
!!!!! test failed
ASSERT errors for:  assert (cond {i},expected {i},tol)

  Location  |  Observed  |  Expected  |  Reason
     .          O(4x2)       E(3x2)      Dimensions don't match


One can see what the issue is in this case.  The first line ends with an extra
delimiter, ", \n", which doesn't agree with the format string, i.e., 2 fields.
But when the patch removes the code that skips the delimiter, white space
*and* EOL in one go, the patched code ends up scanning an extra value for the
", \n".  This would be an easy fix: after calling
textscan::read_format_once(), which reads one line of the file according to
the specified format, simply skip past the next EOL character.
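
A minimal sketch of that idea, using standard C++ streams rather than Octave's
delimited_stream: read_row and read_all are hypothetical helpers invented for
this example (read_row merely stands in for the role of
textscan::read_format_once() and is not its implementation).  After each row,
std::istream::ignore discards everything up to and including the next newline,
so leftover delimiters or white space on the line cannot start a spurious row.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `nfields` comma-separated tokens and never
// consumes a newline.  Tokens are kept as raw strings.
std::vector<std::string>
read_row (std::istream& is, std::size_t nfields)
{
  const int eof = std::istream::traits_type::eof ();
  std::vector<std::string> row;

  for (std::size_t f = 0; f < nfields; f++)
    {
      std::string tok;
      // Collect characters up to the next delimiter, EOL, or end of input.
      for (int c = is.peek (); c != eof && c != ',' && c != '\n';
           c = is.peek ())
        tok += static_cast<char> (is.get ());
      row.push_back (tok);

      // Skip one delimiter, but never the EOL.
      if (is.peek () == ',')
        is.get ();
    }

  return row;
}

// Hypothetical driver: reads rows until the input is exhausted.
std::vector<std::vector<std::string>>
read_all (const std::string& text, std::size_t nfields,
          bool skip_rest_of_line)
{
  const int eof = std::istream::traits_type::eof ();
  std::istringstream is (text);
  std::vector<std::vector<std::string>> rows;

  while (is.peek () != eof)
    {
      rows.push_back (read_row (is, nfields));

      if (skip_rest_of_line)
        // Discard whatever is left on the line, including the newline.
        is.ignore (std::numeric_limits<std::streamsize>::max (), '\n');
      else if (is.peek () == '\n')
        // Only consume an EOL that happens to come immediately next.
        is.get ();
    }

  return rows;
}
```

For the input "12,34, \n56,78\n" with two fields per row, the variant without
the skip produces three rows, because the leftover " " after the trailing
delimiter is scanned as a row of its own.  With skip_rest_of_line set to true
it produces the expected two rows.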

The problem is that the current code conflates the delimiter skip and the EOL
skip in the same textscan::skip_delim() call, which runs for every
*individual* field.  That can't work: the fields all have to be scanned first,
and only *then* should we consider skipping past the EOL.  If an EOL can be
skipped within the individual fields (i.e., in the field loop), there is no
way afterward to determine whether one has already been skipped, so if we
automatically skip EOL after having read the individual fields we could end up
going past the next line.

So,

  textscan::skip_delim (delimited_stream& is)
  {

needs to be redone, but there are too many "eol1 equals this, eol2 equals
that" special cases in it.

(file #45019)
    _______________________________________________________

Additional Item Attachment:

File name: octave-textscan_no_eol_skip-djs2018sep15.patch Size:2 KB


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?54661>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



