Re: Pre-Allocating Memory for Speedup
From: CdeMills
Subject: Re: Pre-Allocating Memory for Speedup
Date: Fri, 23 Jul 2010 01:28:17 -0700 (PDT)
Hello,
Why not use regexp and cellfun to do all the dirty work at once? This is a
snippet of code I use to read a comma-delimited file whose name is
contained in the string 'x':
fid = fopen(tilde_expand(x));
in = fscanf(fid, "%c"); %# slurps everything
fclose(fid);
lines = regexp(in, '(^|\n)[^\n]+', 'match'); %# cut into lines
content = cellfun(@(x) regexp(x, '(\b|'')[^,]+(''|\b)', 'match'), ...
lines, 'UniformOutput', false); %# extract fields
Explanation: first, the whole file is read at once. Then the content is cut
into lines using the regexp (^|\n)[^\n]+, meaning 'the beginning of the
input or an end-of-line, followed by one or more non-end-of-line
characters'. The result is a cell array where each cell holds the content
of one line (every line after the first keeps its leading newline in the
match). As long as the file contains no empty lines, this gives a one-to-one
mapping between line numbers and cell indexes; empty lines produce no match
and are skipped.
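To make the line-splitting step concrete, here is a minimal sketch that runs the same regexp on an in-memory string instead of a file. The sample text is made up for illustration, and the strtrim call is my own addition to drop the leading newline that each match after the first carries.

```octave
% Hypothetical sample standing in for the slurped file content;
% note the empty line between the second and third records.
in = sprintf('a,b,c\n1,2,3\n\n4,5,6\n');

% Same pattern as in the snippet: start of input or a newline,
% followed by one or more non-newline characters.
lines = regexp(in, '(^|\n)[^\n]+', 'match');

% The empty line yields no match, so only three cells come back.
n_lines = numel(lines);

% Assumption: the leading newline is unwanted, so strip it here.
lines = strtrim(lines);
```

If the line numbers of the original file matter, the skipped empty line is the thing to watch for: cell 3 above holds what was line 4 of the input.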
Then each line is further split by the cellfun call. Given some input x
(one line), the inner regexp splits it into fields, where a field is defined as:
(\b|') : a word boundary or an opening quote (written '' inside the single-quoted pattern)
[^,]+ : one or more characters, excluding commas
('|\b) : the closing quote or a word boundary
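The three parts of the field pattern can be tried on a single line, as in the small sketch below. The sample line and the variable names are invented for the example; the pattern itself is the one from the snippet.

```octave
% Hypothetical input line: a plain word, a quoted field with a space,
% and a number. The doubled '' is how a single quote is written
% inside a single-quoted Octave string.
row_text = 'alpha,''beta gamma'',42';

% Field pattern from the snippet: boundary or quote, then a run of
% non-comma characters, then quote or boundary.
fields = regexp(row_text, '(\b|'')[^,]+(''|\b)', 'match');

% fields is a 1x3 cell array; the quoted field keeps its quotes.
n_fields = numel(fields);
```

The quotes are part of the match, so they would have to be stripped in a later step if only the bare text is wanted.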
The result is thus a cell array with one cell per line, where each cell
holds a cell array of that line's fields. If every line has the same number
m of fields, vertcat(content{:}) turns it into an (n, m) cell array, where
n is the number of lines. Adapt it to your needs and have a look at the
"regexp" entry in the Octave manual. The good point is that nothing is
pre-allocated; you can further process the input with 'cell2mat' or 'char',
which accept cell arrays as inputs.
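As a rough illustration of the further processing, the sketch below runs both regexp steps on a made-up numeric sample and flattens the nested result. It assumes every line has the same number of fields; the use of str2double for the numeric conversion is my own choice, not something the snippet above prescribes.

```octave
% Hypothetical sample content: two lines of three numeric fields each.
in = sprintf('1,2,3\n4,5,6\n');

% Same two steps as in the snippet.
lines = regexp(in, '(^|\n)[^\n]+', 'match');
content = cellfun(@(x) regexp(x, '(\b|'')[^,]+(''|\b)', 'match'), ...
                  lines, 'UniformOutput', false);

% content is 1x2, each cell holding a 1x3 cell array of strings.
% Stacking the rows only works when all lines have equally many fields.
tbl = vertcat(content{:});      % 2x3 cell array of strings

% Assumption: the fields are numeric, so convert them in one call.
values = str2double(tbl);       % 2x3 double matrix

% char() pads and stacks a cell array of strings into a char matrix.
first_col = char(tbl(:, 1));    % 2x1 char matrix
```

For files with a varying number of fields per line, the nested form is probably the safer one to keep, since vertcat would fail on rows of different lengths.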
Regards
Pascal
--
View this message in context:
http://octave.1599824.n4.nabble.com/Pre-Allocating-Memory-for-Speedup-tp2299707p2299847.html
Sent from the Octave - General mailing list archive at Nabble.com.