lucene-java-user mailing list archives

From Boris Goldowsky <bo...@alum.mit.edu>
Subject Stemming options
Date Sun, 11 Apr 2004 17:55:32 GMT
Has anyone on the list implemented a dictionary-based English stemmer
with Lucene?  Perhaps based on the freely-available ispell dictionaries
or something like that?  The Porter and Snowball stemmers have not
worked that well for our application, but it is a bit daunting to start
from scratch in developing an alternate stemmer.

Alternatively, is there an algorithmic stemmer that anyone has used
which is a little less aggressive than the Porter algorithm?  We've been
having problems with searches for "conversion" returning "converse" and
"conversational"; and "animal" returning "animate".  Yes, these are
morphologically related, but in our particular application it would be
better to stick with removing simple inflections.
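
To make "removing simple inflections" concrete, here is a minimal sketch of a plural-only stemmer, loosely modeled on the rules usually attributed to Harman's S-stemmer. The class name SStemmer and the minimum-length guard are assumptions for illustration. It is not part of Lucene, and it does not touch "-ed" or "-ing" forms.

```java
import java.util.Locale;

/** Hypothetical plural-only stemmer; a sketch, not a Lucene class. */
final class SStemmer {
    private SStemmer() {}

    /** Strips a plural suffix; the first suffix that matches decides the result. */
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        // Assumed guard: leave very short words ("is", "gas") alone.
        if (w.length() <= 3) {
            return w;
        }
        if (w.endsWith("ies")) {
            // "ponies" -> "pony", but leave "-eies" and "-aies" untouched.
            if (w.endsWith("eies") || w.endsWith("aies")) {
                return w;
            }
            return w.substring(0, w.length() - 3) + "y";
        }
        if (w.endsWith("es")) {
            // "horses" -> "horse", but leave "-aes", "-ees", "-oes" untouched.
            if (w.endsWith("aes") || w.endsWith("ees") || w.endsWith("oes")) {
                return w;
            }
            return w.substring(0, w.length() - 1);
        }
        if (w.endsWith("s")) {
            // "animals" -> "animal", but leave "-us" and "-ss" untouched.
            if (w.endsWith("us") || w.endsWith("ss")) {
                return w;
            }
            return w.substring(0, w.length() - 1);
        }
        return w;
    }
}
```

With rules like these, "conversions" reduces to "conversion" while "converse" and "animate" are left alone, so they would no longer collide with "conversion" and "animal" in the index. Something of this shape could presumably be wrapped in a custom TokenFilter.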

Thanks for any pointers --

Boris




