Re: Stemming Porter algorithm
From: John W. Eaton
Subject: Re: Stemming Porter algorithm
Date: Sun, 24 Feb 2008 05:32:20 -0500
On 24-Feb-2008, Tiago Charters de Azevedo wrote:
| It does not work in Octave (GNU Octave, version 2.1.73(x86_64-pc-linux-gnu)).
Octave 2.1.73 is obsolete. I'd suggest upgrading to the current
stable version, 3.0.0.
You should be able to make the function work in either version of
Octave if you apply the attached patch.
In any case, the function exposes a bug in Octave's parser that still
exists in 3.0.0. The bug is not simple to fix in Octave, but it has a
simple workaround (put a statement separator before the END token), so
I'm not sure that it is worth fixing.
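As a rough illustration of the workaround, here is a minimal sketch.
The has_suffix helper is a hypothetical stand-in invented for this
example; it is not the stemmer's ends() function, which takes different
arguments. The commented-out line only mirrors the shape of the lines
the patch removes.

```octave
% Hypothetical stand-in for the stemmer's ends(); for illustration only.
has_suffix = @(s, w) numel(w) >= numel(s) && strcmp(w(end-numel(s)+1:end), s);

% Shape of the lines the patch removes (no separator before END):
%   if has_suffix('al', 'formal') end;

% Same statement with a separator before END. A comma is what the
% patch uses; a semicolon or a newline is also a statement separator.
if has_suffix('al', 'formal'), end;
if has_suffix('al', 'formal'); end;
if has_suffix('al', 'formal')
end
```

The patch uses the comma form throughout, which keeps each one-line
if/elseif on a single line.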
jwe
--- matlab.txt 2008-02-24 05:24:22.000000000 -0500
+++ porterStemmer.m 2008-02-24 05:29:13.000000000 -0500
@@ -328,40 +328,40 @@
global j;
switch b(k-1)
case {'a'}
- if ends('al', b, k) end;
+ if ends('al', b, k), end;
case {'c'}
if ends('ance', b, k)
- elseif ends('ence', b, k) end;
+ elseif ends('ence', b, k), end;
case {'e'}
- if ends('er', b, k) end;
+ if ends('er', b, k), end;
case {'i'}
- if ends('ic', b, k) end;
+ if ends('ic', b, k), end;
case {'l'}
if ends('able', b, k)
- elseif ends('ible', b, k) end;
+ elseif ends('ible', b, k), end;
case {'n'}
if ends('ant', b, k)
elseif ends('ement', b, k)
elseif ends('ment', b, k)
- elseif ends('ent', b, k) end;
+ elseif ends('ent', b, k), end;
case {'o'}
if ends('ion', b, k)
if j == 0
elseif ~(strcmp(b(j),'s') || strcmp(b(j),'t'))
j = k;
end
- elseif ends('ou', b, k) end;
+ elseif ends('ou', b, k), end;
case {'s'}
- if ends('ism', b, k) end;
+ if ends('ism', b, k), end;
case {'t'}
if ends('ate', b, k)
- elseif ends('iti', b, k) end;
+ elseif ends('iti', b, k), end;
case {'u'}
- if ends('ous', b, k) end;
+ if ends('ous', b, k), end;
case {'v'}
- if ends('ive', b, k) end;
+ if ends('ive', b, k), end;
case {'z'}
- if ends('ize', b, k) end;
+ if ends('ize', b, k), end;
end
if measure(b, k0) > 1
s4 = {b(k0:j), j};