octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Vectorizing strcmp


From: Keith Goodman
Subject: Vectorizing strcmp
Date: Wed, 15 Sep 2004 11:34:29 -0500

On the one hand, it's nice that strcmp is a user-defined
function---it's easy to look at the code to see what it does. On the
other hand, the reason I'm looking at the code is that it is slow.

I'm using strcmp(s1,s2) where s1 is a cell array of strings (around
1000X1) and s2 is a string.

Currently, in 2.1.58, strcmp does

n = length(t1);
for i = 1:n
   retval(i,:) = strcmp (t1{i}, s2);
endfor

which recursively calls strcmp for two strings and where t1 is s1(:).
The for loop is slow.

Short of making strcmp a built-in function, one way to speed it up is
to replace the for loop with a repmat like this

c1 = char(t1);
[rc1, cc1] = size(c1);
ns2 = size(s2,2);
blank = ' ';
ms2 = repmat([s2 repmat(blank,1,max(cc1 - ns2, 0))], rc1, 1);
mc1 = [c1 repmat(blank,1,max(ns2 - cc1, 0))];
retval2 = sum(mc1==ms2, 2)==size(mc1, 2);

The blank is to pad the strings if char(s1) and s2 are different widths.

The second code fragment is about 10 times faster than the first for
length(s1)==1000. For length(s1)==10 the speed is about the same. And
for length(s1) < 10 the first approach is faster.

For me the added complexity in the code is more than offset by the
gain in speed.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]