lucene-java-user mailing list archives

From java_user_ <>
Subject Stemmer and Synonym analyzer
Date Wed, 24 Oct 2007 18:01:00 GMT

I am planning on building an analyzer that has stemming, stopwords and
synonyms.  I am planning on using the Snowball Porter stemmer and the
WordNet synonym engine.

Does it make sense to stem the synonym index?  

I do not want to stem the term “history” and then try to find the synonym.
The stem of “history” is “histori”, which will not have a synonym in the
index unless I originally stemmed all the terms in the synonym index.
(synonyms(stem(tokenstream), stopwordlist))
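
To make the stem-first ordering concrete, here is a minimal sketch in plain Java. It does not use the Lucene, Snowball or WordNet APIs: toyStem is a hypothetical stand-in that only rewrites a trailing "y" to "i" (it is not the Porter algorithm), and the synonym entries are made-up illustrative data. Under those assumptions it shows the lookup for "histori" missing in an unstemmed index and succeeding once the index itself has been stemmed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StemThenSynonymSketch {

    // Hypothetical stand-in for a real stemmer; NOT the Porter algorithm.
    // It only rewrites a trailing "y" to "i", so "history" becomes "histori".
    static String toyStem(String term) {
        if (term.length() > 2 && term.endsWith("y")) {
            return term.substring(0, term.length() - 1) + "i";
        }
        return term;
    }

    // Build a stemmed copy of the synonym index: keys and values are both stemmed.
    static Map<String, List<String>> stemIndex(Map<String, List<String>> synonyms) {
        Map<String, List<String>> stemmed = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : synonyms.entrySet()) {
            List<String> target =
                stemmed.computeIfAbsent(toyStem(entry.getKey()), k -> new ArrayList<>());
            for (String synonym : entry.getValue()) {
                target.add(toyStem(synonym));
            }
        }
        return stemmed;
    }

    // Look a term up in a synonym index; unknown terms yield an empty list.
    static List<String> lookup(Map<String, List<String>> index, String term) {
        return index.getOrDefault(term, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        // Illustrative data only; a real index would be loaded from WordNet.
        Map<String, List<String>> raw = new HashMap<>();
        raw.put("history", Arrays.asList("account", "chronicle", "story"));

        String stemmedToken = toyStem("history");

        // The unstemmed index has no entry for the stemmed token.
        System.out.println(lookup(raw, stemmedToken));
        // The stemmed index does.
        System.out.println(lookup(stemIndex(raw), stemmedToken));
    }
}
```

The cost of this ordering is that the whole synonym index has to be rebuilt whenever the stemmer changes.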

Alternatively, I can find the synonyms of the token stream and then stem all
of them.  This solution should not require stemming the synonym index.
(stem(synonyms(tokenstream)), stopwordlist)
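
The expand-first ordering can be sketched the same way. Again this is plain Java under stated assumptions rather than the Lucene API: toyStem is a hypothetical stand-in (not the Porter algorithm), the synonym and stopword data are made up, stopwords are dropped before expansion (one possible placement), and synonyms are simply appended to the output list instead of being stacked at the same token position as a real token filter would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SynonymThenStemSketch {

    // Hypothetical stand-in for a real stemmer; NOT the Porter algorithm.
    static String toyStem(String term) {
        if (term.length() > 2 && term.endsWith("y")) {
            return term.substring(0, term.length() - 1) + "i";
        }
        return term;
    }

    // Drop stopwords, expand each surviving token from the UNSTEMMED synonym
    // index, then stem the token and every synonym that was found.
    static List<String> analyze(List<String> tokens,
                                Set<String> stopwords,
                                Map<String, List<String>> synonyms) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (stopwords.contains(token)) {
                continue;
            }
            out.add(toyStem(token));
            for (String synonym : synonyms.getOrDefault(token, Collections.<String>emptyList())) {
                out.add(toyStem(synonym));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative data only; a real index would be loaded from WordNet.
        Map<String, List<String>> synonyms = new HashMap<>();
        synonyms.put("history", Arrays.asList("account", "chronicle", "story"));
        Set<String> stopwords = new HashSet<>(Arrays.asList("the"));

        // Prints the stemmed token followed by its stemmed synonyms.
        System.out.println(analyze(Arrays.asList("the", "history"), stopwords, synonyms));
    }
}
```

Here the synonym index is left untouched, at the price of stemming every expanded term at analysis time.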

Does anyone have any experience combining a stemmer and a synonym analyzer?
