aramorph-users

Re: [Aramorph-users] XML tables


From: Pierrick Brihaye
Subject: Re: [Aramorph-users] XML tables
Date: Tue, 16 Aug 2005 16:49:36 +0200
User-agent: Mozilla/5.0 (Windows; U; Win98; fr-FR; rv:1.7.8) Gecko/20050511

Hi,

Ahmed El-dawy wrote:

    See: http://www.nongnu.org/aramorph/english/dictionaries.html#Stems.
    The root "ktb" (";--- ktb" in the file) has *many* lemmas. However,
    keeping a trace of it may help in writing a root analyzer (useful for
    linguists ;-).

This means that the stems dictionary will be stored as a list of dictionary entries, like the prefix dictionary, and every entry will carry attributes (or something similar) for lemma-id and root. I think it would be better to use <root> tags with <lemma> tags inside them, and then <entry> tags inside each <lemma> tag. This is better because there will be no redundancy.

Of course... What is your proposal?

    Eeeer... the Java *standard* SAX parser does it, doesn't it? A SAX
    parser is really the thing we need here: big file, poor structure.

I looked at the SAX parser and I can use it. However, Digester is much easier to use, and I have already implemented this with Digester. Is there a problem with using libraries from the Jakarta Project?

No... but why use an external library when the standard classes are enough?
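
As a rough illustration of the point about the standard classes, here is a minimal sketch that reads a nested <root>/<lemma>/<entry> layout with the JDK's own SAX parser. Only the root "ktb" and the three tag names come from this thread; the <stems> wrapper, the attribute names (id, vocalized, category), the sample values and the class name are assumptions made up for the example, not the actual AraMorph format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class StemsSaxSketch {

    // Hypothetical sample: root > lemma > entry, so the root and lemma
    // are written once instead of being repeated on every entry.
    static final String SAMPLE =
          "<stems>"
        + "<root id=\"ktb\">"
        + "<lemma id=\"katab-u_1\">"
        + "<entry vocalized=\"katab\" category=\"PV\"/>"
        + "<entry vocalized=\"kotub\" category=\"IV\"/>"
        + "</lemma>"
        + "<lemma id=\"kitAb_1\">"
        + "<entry vocalized=\"kitAb\" category=\"N\"/>"
        + "</lemma>"
        + "</root>"
        + "</stems>";

    // Streams through the document and flattens it back into one
    // tab-separated row per entry: root, lemma, vocalized form, category.
    static List<String> parse(String xml) throws Exception {
        final List<String> rows = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String root;   // id of the enclosing <root>
            private String lemma;  // id of the enclosing <lemma>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("root".equals(qName)) {
                    root = atts.getValue("id");
                } else if ("lemma".equals(qName)) {
                    lemma = atts.getValue("id");
                } else if ("entry".equals(qName)) {
                    rows.add(root + "\t" + lemma + "\t"
                            + atts.getValue("vocalized") + "\t"
                            + atts.getValue("category"));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return rows;
    }

    public static void main(String[] args) throws Exception {
        for (String row : parse(SAMPLE)) {
            System.out.println(row);
        }
    }
}
```

The handler only has to remember the current root and lemma, which is why a streaming parser copes well with a big, shallow file: nothing is held in memory beyond what the caller chooses to collect.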

    BTW, still as a quick answer: I think that the 3 compatibility
    tables may be merged into one single file.

What is the gain from this?

Size: the redundancy across the three tables should give better compression results.
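
The size argument can be checked informally with java.util.zip. The sketch below deflates three made-up tables separately and then as one merged buffer. The table contents (pairs such as "Pref-3 Stem-7") and the class and method names are invented for the illustration and are not the real compatibility tables; the only assumption carried over from the thread is that the three tables share many category names.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class MergeCompressionSketch {

    // Returns the number of bytes produced by deflating the input.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Builds a made-up table of category pairs, one pair per line.
    static String table(String left, int leftMod, String right, int rightMod) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append(left).append('-').append(i % leftMod).append(' ')
              .append(right).append('-').append(i % rightMod).append('\n');
        }
        return sb.toString();
    }

    // Hypothetical stand-ins for the three tables; category names overlap.
    static final String AB = table("Pref", 20, "Stem", 15);
    static final String AC = table("Pref", 20, "Suff", 12);
    static final String BC = table("Stem", 15, "Suff", 12);

    static int size(String s) {
        return deflatedSize(s.getBytes(StandardCharsets.UTF_8));
    }

    // Total size when each table is compressed as its own stream.
    static int separateSize() {
        return size(AB) + size(AC) + size(BC);
    }

    // Size when the three tables are concatenated and compressed once.
    static int mergedSize() {
        return size(AB + AC + BC);
    }

    public static void main(String[] args) {
        System.out.println("separate: " + separateSize() + " bytes");
        System.out.println("merged:   " + mergedSize() + " bytes");
    }
}
```

With a single stream the compressor pays its per-stream overhead once and can refer back to category names it has already seen in an earlier table; the real gain depends on how much the actual tables overlap.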

Cheers,

p.b.



