Re: [Bug] strread() elaborated format strings

octave-maintainers

From:	Philip Nienhuis
Subject:	Re: [Bug] strread() elaborated format strings
Date:	Mon, 14 May 2012 21:12:37 +0200
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.11) Gecko/20100701 SeaMonkey/2.0.6

Hi Júlio:

Júlio Hoffimann wrote:

Thanks Philip, that's why i didn't found in the code where %[^] was
being handled. I know John wants to rewrite some of this I/O functions
in pure C++, but if i have time to do a quick fix in strread.m, should i
add something around the line 473? It's the section where we need a new
branch for dealing with the mentioned format specifier?

I'm not quite sure whether %[] format specifiers can be implemented efficiently in strread's current form.

Last summer I tried to get it together but it turned out to be a messy affair full of gotchas and corner cases and, as a consequence, lots of if clauses and thus very slow code. (I actually needed %[] and %[^] myself but luckily I found ways to avoid them. Plus, at work we do have Matlab.)

In addition, IIRC later on Rik found that Octave's regexp (based on pcre) is relatively slow, and I think for each %[] we need one or two calls to regexp().

Nevertheless, if you really need %[] I'm happy to again look into it. The code for splitting the data into columns is much more reliable these days so perhaps %[] can be made to work now.

The very best option would be to implement a binary (compiled) textscan as workhorse for strread (instead of vice versa). A while ago John sent me a rough textscan.c framework for it but I lack C++ proficiency (and I suppose John lacks time).

So, the question remains whether it is at all worthwhile to again invest in strread.m given the plans to have a binary textscan().


Anyway, if you're in a hurry, be my guest to give it a try.

Note: currently there are some pending strread.m fixes in the bug tracker. See bugs #36356 + #36392 and #36398 (the last one should be rebased). (I can't push those as my hgrc/mercurial setup got fubarred repeatedly and I have neither time nor appetite to again fix it.)


Some guidelines (don't be put off):

First, you'd have to adapt the format string parsing code (L.284-309 in my patched strread.m, see bug #s above) to correctly parse and isolate %[] specifiers. Shouldn't be too hard.

Next you'll have to adapt the format string matching code in L.450-530, and adapt the column-splitting code in L.532-618. Especially this part of the code is where I expect you to spend many an evening. (But who knows...)

Then, further below you'd have to add a stanza for processing the %[] specifiers to every matching column. Probably a breeze once the column splitting is right.

Finally, a fair number of test cases should be added, covering all imaginable corner cases. Have Matlab at hand for comparison.
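To make the first step a bit more concrete, here is a rough sketch of how a format string could be split so that %[] and %[^] specifiers come out as separate tokens. It is written in Python purely for illustration and is not the code in strread.m; the function name split_format, the token kinds and the rule that a ']' is literal only as the first member of a set are all assumptions made for this example.

```python
import re

# Hypothetical tokenizer for illustration; NOT the parsing code in strread.m.
# Assumption: inside %[...] a ']' is literal only when it is the first member.
SPEC = re.compile(r"""
    %%                                  # escaped percent sign
  | %\*?\d*(?:\.\d+)?                   # optional skip flag, width, precision
    (?: \[(?P<set>\^?\]?[^\]]*)\]       # character set, optionally negated
      | (?P<conv>[a-zA-Z]) )            # ordinary conversion such as s, d, f
""", re.VERBOSE)


def split_format(fmt):
    """Split fmt into (kind, text) tokens; kind is 'set', 'conv' or 'literal'."""
    tokens = []
    pos = 0
    for m in SPEC.finditer(fmt):
        if m.start() > pos:
            # Text between two specifiers is literal text to be matched as-is.
            tokens.append(("literal", fmt[pos:m.start()]))
        if m.group(0) == "%%":
            kind = "literal"
        elif m.group("set") is not None:
            kind = "set"
        else:
            kind = "conv"
        tokens.append((kind, m.group(0)))
        pos = m.end()
    if pos < len(fmt):
        tokens.append(("literal", fmt[pos:]))
    return tokens


if __name__ == "__main__":
    for token in split_format("%s %[^,],%d"):
        print(token)
```

Once the specifiers are isolated like this, the matching and column-splitting stages only need to deal with one token at a time, which is where the real work would be.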

Bug reported: https://savannah.gnu.org/bugs/index.php?36464


Thanks,

I'll first add a format scan for all not(-yet)-implemented ML format specifiers + error msg. Only then I'll start thinking about %[] (unless you or someone else beats me to it).
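The kind of format scan meant here could look roughly like the sketch below. It is in Python for illustration only and is not what will go into strread.m; the name check_format, the error text and above all the set of conversions treated as supported are assumptions for this example.

```python
import re

# Assumption for illustration only: the conversions treated as implemented.
# This is NOT the authoritative list of what strread.m supports.
SUPPORTED_CONVERSIONS = frozenset("sdfun")

# Matches "%%", or a specifier with optional skip flag, width and precision,
# followed by either a %[...] character set or a single conversion letter.
_SPEC = re.compile(r"%%|%\*?\d*(?:\.\d+)?(\[\^?\]?[^\]]*\]|[a-zA-Z])")


def check_format(fmt):
    """Raise ValueError on the first specifier that is not implemented."""
    for m in _SPEC.finditer(fmt):
        conv = m.group(1)
        if conv is None:
            # "%%" is a literal percent sign, nothing to check.
            continue
        if conv not in SUPPORTED_CONVERSIONS:
            raise ValueError(
                f"format specifier '{m.group(0)}' is not implemented"
            )


if __name__ == "__main__":
    check_format("%s %d %f")        # passes silently
    try:
        check_format("%s %[^,]")
    except ValueError as err:
        print(err)
```

Failing early with a clear message seems preferable to silently returning wrong columns for a specifier that is not handled.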


Philip
