help-octave
Re: reading data from ascii files.


From: Manoj Deshpande
Subject: Re: reading data from ascii files.
Date: Fri, 16 Jul 2010 11:32:04 -0400

I have a very similar problem, though slightly more complicated, I would think. I have 7 text files, which I read sequentially; each file has about 400,000 rows. I had written a script with nested loops, and it took over 3 days to complete. I quickly changed the scope of my script and also decided to avoid nested loops, since I read somewhere on this forum that nested loops can be a problem. Yet processing is still very slow for these roughly 3 million rows.

End motive: loop over 7 text files; in each file, read one row at a time (while (fgets) and dlmread), identify whether the row is useful, and append one of the column values to one of the 12 matrices I have. After all rows have been read, draw histograms, one for each matrix.
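
One possible alternative to the row-by-row loop described above is to parse each file in a single dlmread call and pick out the useful rows with logical indexing. The sketch below is only an illustration under assumptions: the two-column sample file, the rule that column 1 names the target group, and the names M and groups are all hypothetical, since the real file layout and the "useful" test are not given in this message.

```octave
% Sample file standing in for one data file: "id,value" on every row.
fname = [tempname() ".txt"];            % hypothetical scratch file
fid = fopen (fname, "w");
fprintf (fid, "%d,%g\n", [1 0.5; 2 1.5; 1 2.5; 3 3.5; 2 4.5]');
fclose (fid);

M = dlmread (fname, ",");               % parse the whole file in one call
delete (fname);

% Hypothetical rule: a row is useful when column 1 is 1 or 2, and
% column 1 also says which group the column-2 value belongs to.
groups = cell (1, 2);
for g = 1:2
  groups{g} = M(M(:,1) == g, 2);        % mask selects all matching rows at once
end
```

With the values collected this way, each histogram could be drawn with a call such as hist (groups{g}), and nothing is appended inside a per-row loop.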

Problem: no prior knowledge of the size of the matrices or of the number of rows in the files.
And yes, I am doing all this on RH7 Linux.

Is there a way to speed up reading and appending to matrices whose sizes can't be predetermined?

Thanks,
Manoj
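
When the final size really cannot be known in advance, a common workaround is to grow the storage in large steps instead of one element at a time, then trim it at the end. The sketch below is a minimal illustration, not something proposed in this thread: the initial capacity of 4, the doubling rule, and the loop over 1:10 (standing in for values found while reading) are all assumptions.

```octave
vals = zeros (4, 1);                  % small initial capacity, for illustration
n = 0;                                % number of slots actually in use
for x = 1:10                          % stands in for values found while reading
  if (n == numel (vals))
    vals = [vals; zeros(numel (vals), 1)];   % double the capacity when full
  end
  n = n + 1;
  vals(n) = x;
end
vals = vals(1:n);                     % trim the unused tail
```

Doubling means the array is reallocated only a handful of times even for millions of values, whereas appending one element per row may reallocate on every row.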

On Fri, Jul 16, 2010 at 8:17 AM, Siep Kroonenberg <address@hidden> wrote:
On Fri, Jul 16, 2010 at 02:58:43AM -0700, Veloci wrote:
>
> Hi everyone,
>
> I know that the topic sounds familiar to many of you, so I apologize for any
> annoyance it may cause.
> My problem with the data input is not so much "what functions to use",
> more like "making it faster".
> The starting point is a couple of data files in ASCII format with a specified
> number of header rows (data on the experiment and so on) followed by an
> unspecified number of data rows (depending on the measurement time and the
> sampling rate I use).
> What I want to do: I want to implement automated input of these data files
> into Octave matrices and calculate data from these (as most people
> would).
> What I want to use for that: The most promising function seems to be
> "dlmread", as my data is organised in rows with a fixed number of columns
> and a fixed delimiter.
> My problem: As the number of rows is not specified, I can't give dlmread a
> range vector with the coordinates of the last row. This is not a tragic
> problem for dlmread, because it still works with only the starting row and
> column, but it is rather time-consuming given the huge amount of data it
> has to read. It would be much faster if I could allocate the needed memory
> in a zero matrix with the size of the data. For this I need to know the
> number of rows in my ASCII file. You could say, "Open the file in a good
> editor and look for the last row". That would be a solution for manual
> data processing, but not for an automated procedure.
> The solution I seek: I've searched the net for a solution to my problem,
> but I couldn't find anything that would be possible in Octave. In MATLAB
> there was a possibility with memmapfile, which calculates the number of rows
> in a file. But memmapfile is not implemented in Octave. So I tried to find
> something similar, but couldn't find any example. I could imagine using the
> "eof" pointer, but I don't know how to calculate the end row from that
> position.
>
> If there is any solution that you could suggest, it would be much
> appreciated.

Couldn't you just read and count all lines without parsing?

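fid = fopen (fname, "r");      % fname: path to the data file (placeholder name)
rnum = 0;
while (ischar (fgetl (fid)))   % fgetl returns -1 once no lines remain
  rnum = rnum + 1;
end
fclose (fid);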
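
Once the line count is known, it can be turned into the range vector that dlmread was missing. The following is a sketch under assumptions rather than a tested recipe for the original data: the sample file, the comma delimiter, and the values of nheader and ncols are made up for illustration. Whether passing the full range actually makes dlmread faster on large files is something to measure.

```octave
% Build a small sample file: 2 header lines, then 3 comma-separated rows.
fname = [tempname() ".txt"];            % hypothetical scratch file
fid = fopen (fname, "w");
fprintf (fid, "experiment: demo\nrate: 10 Hz\n");
fprintf (fid, "%d,%d,%d\n", [1 2 3; 4 5 6; 7 8 9]');
fclose (fid);

nheader = 2;                            % assumed known number of header rows
ncols   = 3;                            % assumed fixed number of columns

% Count all lines without parsing them.
fid = fopen (fname, "r");
rnum = 0;
while (ischar (fgetl (fid)))
  rnum = rnum + 1;
end
fclose (fid);

% dlmread ranges are zero-based: [first_row, first_col, last_row, last_col].
data = dlmread (fname, ",", [nheader, 0, rnum - 1, ncols - 1]);
delete (fname);
```

Here rnum counts the header rows as well, so the number of data rows is rnum - nheader.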

--
Siep Kroonenberg

