octave-maintainers

Re: New importdata function testing


From: Rik
Subject: Re: New importdata function testing
Date: Mon, 22 Oct 2012 08:05:18 -0700

On 10/22/2012 05:51 AM, Jordi Gutiérrez Hermoso wrote:
> On 21 October 2012 12:07, Rik <address@hidden> wrote:
>> 10/20/12
>>
>> Erik,
>>
>> I did just a small test with importdata and it doesn't seem to work as
>> expected.
>>
>> For a file, I used import.tst containing
>>
>> 1,2,3
>> 4,5,6
>>
>> And then in Octave, I used
>> importdata ('import.tst', ',')
>> warning: unrecognized escape sequence '\S' -- converting to 'S'
> Oops, my bad:
>
>      http://hg.savannah.gnu.org/hgweb/octave/rev/9a455cf96dbe#l2.365
>
>> I am also concerned that the implementation reads the entire file into a
>> string and then uses a number of for loops and regexp which will be slow in
>> Octave.  I did a benchmark with the following:
>>
>> x = rand (1e4, 10);
>> dlmwrite ('tst.csv', x, ',')
>> tic; y = dlmread ('tst.csv', ','); toc
>> Elapsed time is 0.209933 seconds.
>> tic; y = importdata ('tst.csv', ','); toc
>> Elapsed time is 3.2 seconds.
>>
>> I believe it would be faster to have importdata check the header lines
>> only and then pass off the work to dlmread if possible.  dlmread is written
>> in C++ and, per the benchmarking above, is very fast.
> It would be preferable if we could write some minimum common subset
> of this family of functions as a C++ function and leave the rest in
> m-file language. I consider writing code in C++ a last resort for
> optimisation at the very high cost of making the code less
> understandable for most people. Many of our users are scared by C++,
> but any Octave user understands the m-file language.
That is why I think my proposal makes sense.  The header lines could be
parsed in an m-file because there won't be much work to do there, and the
reading itself could then be passed off to dlmread, which is already a core
function in both Octave and Matlab.  I don't propose writing any more C++
if it can be avoided.  On that note, there has been talk of having a C++
version of textscan.  Once that is done, a number of these functions could
switch to relying on it, because it is the most general and can accept
mixed numeric and text data.
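
A rough sketch of that split might look like the following.  This is not
the actual importdata code; the names count_header_lines and
importdata_sketch are hypothetical, and the header test (a line is a header
if any field fails to parse as a number) is an assumption made only for
illustration.  The only library calls relied on are fopen, fgetl, fclose,
strsplit, str2double and dlmread.

```octave
1;  % script-file marker so the functions below can be defined inline

% Hypothetical helper: count leading lines that are not purely numeric.
function n = count_header_lines (fname, delim)
  fid = fopen (fname, "r");
  if (fid < 0)
    error ("count_header_lines: cannot open %s", fname);
  endif
  n = 0;
  txt = fgetl (fid);
  while (ischar (txt))
    fields = strsplit (txt, delim);
    % str2double returns NaN for every field that is not a number.
    if (! any (isnan (str2double (fields))))
      break;   % first all-numeric line: the data block starts here
    endif
    n++;
    txt = fgetl (fid);
  endwhile
  fclose (fid);
endfunction

% Hypothetical wrapper: cheap header scan in m-code, bulk read in dlmread.
function data = importdata_sketch (fname, delim)
  nhdr = count_header_lines (fname, delim);
  % dlmread takes zero-based row and column offsets to start reading from.
  data = dlmread (fname, delim, nhdr, 0);
endfunction

% Usage: one header line followed by two rows of numeric data.
fname = tempname ();
fid = fopen (fname, "w");
fputs (fid, "a,b,c\n1,2,3\n4,5,6\n");
fclose (fid);
data = importdata_sketch (fname, ",");
unlink (fname);
```

The loop only touches the header lines plus the first data line, so its
cost does not grow with the size of the file.  One known limitation of
this sketch: a data row containing a literal NaN or an empty field would be
mistaken for a header line.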

--Rik


