lucene-java-user mailing list archives

From jason <ginger...@gmail.com>
Subject Re: Stemmer algorithms
Date Tue, 14 Feb 2006 07:47:18 GMT
Hi,

I have tested some stemmer algorithms in my application. However, I think we
would be better off writing a weaker algorithm. I mean, Porter and some of the
other algorithms are too strong. Maybe an algorithm that only converts plural
nouns to their singular form is enough.
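
To make the idea concrete, here is a minimal sketch of such a plural-to-singular
stemmer. The rules loosely follow the commonly cited "S-stemmer" style of plural
removal. The class name PluralStemmer, the short-word guard, and the exact
exception lists are assumptions for illustration only; this is not an existing
Lucene class, and the output is a stem, not always a dictionary word.

```java
import java.util.Locale;

/** Hypothetical weak stemmer: folds common English plural endings only. */
final class PluralStemmer {

    private PluralStemmer() {}

    /** Returns a lowercased stem; only the first matching rule is applied. */
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);

        // Assumption: leave very short tokens ("is", "as", "gas") untouched.
        if (w.length() <= 3) {
            return w;
        }
        // Rule 1: "-ies" -> "-y", except after 'e' or 'a' ("ponies" -> "pony").
        if (w.endsWith("ies") && !w.endsWith("eies") && !w.endsWith("aies")) {
            return w.substring(0, w.length() - 3) + "y";
        }
        // Rule 2: "-es" -> "-e", except "-aes", "-ees", "-oes" ("houses" -> "house").
        if (w.endsWith("es") && !w.endsWith("aes") && !w.endsWith("ees")
                && !w.endsWith("oes")) {
            return w.substring(0, w.length() - 1);
        }
        // Rule 3: drop a final "s", except "-us" and "-ss" ("cats" -> "cat").
        if (w.endsWith("s") && !w.endsWith("us") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        // No plural ending recognised: derivational suffixes are never touched.
        return w;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ponies", "houses", "cats", "class", "ability"}) {
            System.out.println(s + " -> " + stem(s));
        }
    }
}
```

With these rules "ponies" becomes "pony" and "stemmers" becomes "stemmer", while
"ability" and "class" are left unchanged, which is the kind of conservative
behaviour I have in mind.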

On 2/14/06, Yilmazel, Sibel <syilmazel@navisite.com> wrote:
>
> Hello all,
>
> We have done some preliminary research on Porter2 and K-stem algorithms
> and have some questions.
>
> Porter2 was found to be a 'strong' stemming algorithm in that it strips
> off both inflectional suffixes (-s, -es, -ed) and derivational suffixes
> (-able, -aciousness, -ability). K-Stem seemed to be a 'weak' stemming
> algorithm, as it strips off only the inflectional suffixes (-s, -es,
> -ed).
>
> In IR, a "weak" stemmer is usually recommended, since a "weak" stemmer
> seldom hurts performance and usually provides a significant
> improvement in precision.
>
> However, Porter2 is the most widely used stemming algorithm AND it is a
> 'strong' stemmer which is contrary to what is said above.
>
> Can you share your ideas, experiences with stemmer algorithms? Thanks in
> advance.
>
> Sibel
>
