xtextscan [WAS: Re: strread.m]
From: Philip Nienhuis
Subject: xtextscan [WAS: Re: strread.m]
Date: Thu, 04 Aug 2011 23:38:40 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.11) Gecko/20100701 SeaMonkey/2.0.6
John W. Eaton wrote:
> On 3-Aug-2011, Philip Nienhuis wrote:
>
> |> I will probably try to write textscan in C++. It's up to you whether
> |> you want to continue fixing problems in strread, but given the
> |
> | Do you have a time schedule in mind?
> | That would help me make a better decision of what to do.
>
> I started working on it yesterday. So far I've only implemented the

Magnificent.
Are you planning to get it finished before Octave 3.4.3?
Just today I prepared a fix for bug #33876 along the lines I sketched
yesterday... never mind.

> part that decodes the format. I'll try for at least some of the
> conversions today. Then I may need help in figuring out how to
> properly return the variables that are read from the file. Then we
> will also need to handle the parameter/value options.
Whitespace and delimiter processing took a bit of sorting out.
There are also some "implicit" options, like presence of a trailing "\n"
in the input stream.
Once you get the format string properly parsed I suppose it is fairly
straightforward to match it to the input stream.
But just FYI, here is some ML r2007a behavior that I find peculiar:
Assume an input string '54321a'.
With a format string like '%f321a', it turns out that Matlab prefers
to interpret the conversion as '%f32'; it ignores the remaining digits
in the literal and also the trailing "a", yielding 54321 (class single).
If you do
c = textscan ('54321a', '%f321a', 'returnonerror', 0)
it emerges that ML first parses the number as far as it can, rather than
first analyzing the trailing literal to see where the numeric field is
supposed to end.
To read the field as a double you'd need '%f 321a' (yielding 54321), or
if you'd rather expect 54, use '%2f64321a'.
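To make the greedy reading of the format concrete, here is a minimal C++ sketch of a format decoder that behaves the way the examples above suggest: after %f it first tries to match a "32" or "64" suffix, and whatever is left over becomes a literal. The names FormatItem and decode_format are hypothetical; this is neither Matlab's implementation nor the xtextscan code, and it only handles %f conversions and literals. What Matlab really does with the leftover "1a" is not clear from the behavior above, so the sketch simply keeps it as a literal.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One element of a decoded format string (hypothetical type).
struct FormatItem
{
  bool is_conversion;   // true for a %f conversion, false for literal text
  int width;            // field width, 0 if none was given
  std::string type;     // "f", "f32" or "f64" for conversions
  std::string literal;  // text to match, for literals only
};

// Decode a format made of %[width]f[32|64] conversions and literals.
// The bit-size suffix is matched greedily, so "%f321a" is decoded as
// the conversion %f32 followed by the literal "1a".
std::vector<FormatItem> decode_format (const std::string& fmt)
{
  std::vector<FormatItem> items;
  const std::size_t n = fmt.size ();
  std::size_t i = 0;

  while (i < n)
    {
      if (std::isspace (static_cast<unsigned char> (fmt[i])))
        {
          i++;  // whitespace only separates items
          continue;
        }

      if (fmt[i] == '%')
        {
          std::size_t j = i + 1;
          int width = 0;
          while (j < n && std::isdigit (static_cast<unsigned char> (fmt[j])))
            width = 10 * width + (fmt[j++] - '0');

          if (j < n && fmt[j] == 'f')
            {
              j++;
              std::string type = "f";
              // Greedy step: take "32" or "64" if it follows directly.
              if (fmt.compare (j, 2, "32") == 0
                  || fmt.compare (j, 2, "64") == 0)
                {
                  type += fmt.substr (j, 2);
                  j += 2;
                }
              items.push_back ({true, width, type, ""});
              i = j;
              continue;
            }
          // Anything else starting with '%' is kept as literal text.
        }

      // Literal: runs up to the next whitespace or '%'.
      std::size_t j = i + 1;
      while (j < n && fmt[j] != '%'
             && ! std::isspace (static_cast<unsigned char> (fmt[j])))
        j++;
      assert (j > i);  // every pass must consume at least one character
      items.push_back ({false, 0, "", fmt.substr (i, j - i)});
      i = j;
    }

  return items;
}
```

With this decoder, "%f321a" comes out as the conversion %f32 plus the literal "1a", whereas "%f 321a" gives a plain %f plus the literal "321a", and "%2f64321a" gives %2f64 plus the literal "321a".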
Another one:
c = textscan ('54321a', '%2f64') gives {54; 32; 1}
(The given field width is ignored for the last number, which is reported
as OK. "'returnonerror', 0" shows that ML complains about row 4, the "a".)
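The {54; 32; 1} result is what you get if the field width is treated as an upper bound on the number of characters read, rather than as an exact count. A rough C++ sketch of that reading follows; scan_fixed_width and ScanResult are hypothetical names, only unsigned integer digits are handled, and this is an assumption about the observed behavior, not a description of Matlab's or Octave's actual code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Outcome of a scan (hypothetical type).
struct ScanResult
{
  std::vector<double> values;  // numbers read before the scan stopped
  std::size_t stopped_at;      // index of the first unread character
};

// Apply a single "%<width>f" conversion over and over to the input.
// Each field takes at most `width` digits; a shorter field is accepted.
ScanResult scan_fixed_width (const std::string& input, int width)
{
  assert (width > 0);

  ScanResult result {{}, 0};
  std::size_t pos = 0;

  while (pos < input.size ())
    {
      double value = 0;
      int ndigits = 0;

      while (ndigits < width && pos < input.size ()
             && std::isdigit (static_cast<unsigned char> (input[pos])))
        {
          value = 10 * value + (input[pos] - '0');
          pos++;
          ndigits++;
        }

      if (ndigits == 0)
        break;  // no digit at this position: the conversion fails here

      result.values.push_back (value);
    }

  result.stopped_at = pos;
  return result;
}
```

For the input "54321a" and width 2 this reads 54, 32 and then the short field 1, and stops at the "a", which matches the error on the fourth field reported with 'returnonerror', 0.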
I find this behavior (among other things, mixing up a literal if it
starts with digits, and lax interpretation of the user-specified field
width) a bit inconsistent from a user point of view. Of course, from a
programmer's POV it may just be obvious, although I don't see it.
These examples do show that setting the returnonerror parameter to false
is vital for understanding what ML does.
The point here:
I assume (that is, I hope) you have a clearer view of this than me, but
IMO we should be wary of striving for ML compatibility so much that we
wander into various degrees of bug-for-bug compatibility.
Or should we call it "surprise-for-surprise" compatibility?
> The diffs below are what I have now. You can do things like
>
>   fid = fopen ("any-existing-file");
>   xtextscan (fid, "any format here for testing")
>
> and xtextscan will display the components of the format.
I can't comment as this is the Octave dialect of C++ :-) (beyond me)
Thank you anyway.
Philip