varnamproject-discuss

Re: [Varnamproject-discuss] Next step


From: Kevin Martin Jose
Subject: Re: [Varnamproject-discuss] Next step
Date: Sat, 28 Jun 2014 17:07:57 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

Agglutinated words can also be stemmed. If you are asking whether it is possible to split them into their component words, the answer is no. For example, സന്തോഷസന്താപങ്ങൾ will stem to സന്തോഷസന്താപം: only the plural ending changes, and the compound stays joined. Resolving agglutination is an even harder problem than stemming, I guess. Santhosh Thottingal had suggested using hunspell for that. Later maybe :)
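To make the example concrete, here is a rough Python sketch of suffix-rule stemming. It is only an illustration: `STEM_RULES` and `stem` are hypothetical names, the single rule is inferred from the example word above, and none of this is varnam's actual rule format or API.

```python
# Hypothetical rule table of (suffix, replacement) pairs.
# The one rule here is inferred from the example in this message;
# it is not taken from varnam's real stem rules.
STEM_RULES = [
    ("ങ്ങൾ", "ം"),  # plural ending -> singular ending
]


def stem(word):
    """Apply the longest matching suffix rule once.

    The word is returned unchanged when no rule matches, and the
    compound is never split into its component words.
    """
    rules = sorted(STEM_RULES, key=lambda rule: len(rule[0]), reverse=True)
    for suffix, replacement in rules:
        # Require some text before the suffix so a bare suffix is left alone.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)] + replacement
    return word


if __name__ == "__main__":
    print(stem("സന്തോഷസന്താപങ്ങൾ"))
```

The agglutinated word goes in as one unit and comes out as one unit; separating it into its parts would need a different technique altogether.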

On Friday 27 June 2014 08:44 PM, aboobacker sidheeque mk wrote:

On Fri, Jun 27, 2014 at 8:39 PM, Kevin Martin <address@hidden> wrote:
... the stem rules for varnam. The accuracy of stemming is around 85% when tested with words from Malayalam Wikipedia articles. However, this 85% figure includes words that are not stemmed at all. That is, if there are 100 words, 40 (speculation) of them won't be stemmed at all and would still be counted
How did you stem the agglutinated words? :-)
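A quick arithmetic sketch of what the quoted 85% figure could mean, in Python. It assumes the truncated sentence above means that unstemmed words are counted as correct, and it uses the speculative 40-in-100 split from the quote; `accuracy_on_changed` is a hypothetical helper, not part of varnam.

```python
def accuracy_on_changed(total, correct, unchanged):
    """Accuracy over only the words the stemmer actually changed.

    Assumes every unchanged word was counted as correct in `correct`.
    """
    changed = total - unchanged
    if changed <= 0:
        raise ValueError("no words were changed by the stemmer")
    return (correct - unchanged) / changed


if __name__ == "__main__":
    # 100 words, 85 counted correct, 40 of them left unstemmed (speculative).
    print(accuracy_on_changed(total=100, correct=85, unchanged=40))
```

Under those assumptions the accuracy on words that were actually stemmed would be 45 out of 60, that is 75%, which is why the 85% figure is probably optimistic.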



--
Aboobacker MK
GSoC Student


