lucene-java-user mailing list archives

From Grant Ingersoll
Subject Re: Stemming behavior
Date Sat, 20 Dec 2008 01:10:40 GMT
This is likely one of the many subtleties of the Porter stemmer. Dr.
Porter chose a particular way of doing things, but it isn't
necessarily right for everyone. You really have to measure the net
benefit across all your searches, not just this one case. If you
can't live with this particular case, you can implement a
protected-words approach (a list of terms that bypass the stemmer)
or try some other stemmers.
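
As a rough illustration of the protected-words idea, here is a minimal
sketch in plain Java. It does not use any Lucene API: the class name
OverrideStemmer, the exception table, and the toy suffix-stripping lambda
are all hypothetical stand-ins, and the real Snowball stemmer would take
the place of the lambda. A protected word in the strict sense is an entry
that maps to itself; mapping it to another stem is an extension of the
same lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical wrapper: consults an exception table before stemming.
final class OverrideStemmer {
    private final Map<String, String> overrides;
    private final UnaryOperator<String> delegate;

    OverrideStemmer(Map<String, String> overrides, UnaryOperator<String> delegate) {
        this.overrides = new HashMap<>(overrides); // defensive copy
        this.delegate = delegate;
    }

    // Returns the forced stem if the token is listed, else the delegate's result.
    String stem(String token) {
        String forced = overrides.get(token);
        return forced != null ? forced : delegate.apply(token);
    }

    public static void main(String[] args) {
        // Toy stand-in for a real stemmer (assumption): strips a trailing "ness".
        UnaryOperator<String> toy =
            t -> t.endsWith("ness") ? t.substring(0, t.length() - 4) : t;

        Map<String, String> table = new HashMap<>();
        table.put("loveliness", "love");
        table.put("loveless", "love");

        OverrideStemmer stemmer = new OverrideStemmer(table, toy);
        for (String word : new String[] {"loveliness", "loveless", "kindness"}) {
            System.out.println(word + " -> " + stemmer.stem(word));
        }
    }
}
```

The same table has to be applied at both index time and query time;
otherwise the indexed terms and the query terms will not line up.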

If you go to the Snowball site and peruse its archives, you will find
much discussion of these kinds of issues.

Sorry I can't offer more in terms of a solution.


On Dec 19, 2008, at 5:33 AM, Jay Malaluan wrote:

> Hi,
> I'm using the SnowballAnalyzer for my stemming processing.
> search words: love, loved, loveliness, loveless, lovely, and loving
> In my index I have the word love. During searching, the analyzer
> can't correctly stem the two words loveliness and loveless to love. The
> odd thing is that loveliness is stemmed to "loveli" and loveless is not
> stemmed at all.
> Has anyone already encountered this, and do you have suggestions for other
> Analyzers?
> Regards,
> Jay Malaluan

Grant Ingersoll
