lucene-java-user mailing list archives

From Daniel Noll <dan...@nuix.com>
Subject Re: Rebuilding Document from index?
Date Fri, 29 Feb 2008 03:35:26 GMT
On Wednesday 27 February 2008 03:33:53 Itamar Syn-Hershko wrote:
> I'm still trying to engineer the best possible solution for Lucene with
> Hebrew, right now my path is NOT using a stemmer by default, only by
> explicit request of the user. MoreLikeThis would only return relevant
> results if I will use a non-stemmed scoring and lookup.

This appears to be the case for all languages, not just Hebrew: stemming 
skews similarity and causes unrelated documents to score higher than they 
should.

Some people seem to be working around this by indexing the same text into 
two fields, one stemmed and the other not.  You could then use the stemmed 
field when doing queries but use the non-stemmed field for MoreLikeThis.
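
To make the effect concrete, here is a rough toy sketch in plain Java.  It 
does not use Lucene at all: the class name, the suffix-stripping stem() 
method and the overlap measure are all made up for illustration, and a real 
stemmer and Lucene's actual scoring behave differently.  It only shows how 
the same two texts look much more alike once their terms are stemmed, which 
is why the non-stemmed field is the safer basis for similarity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: hypothetical names, no Lucene classes involved.
class TwoFieldSketch {

    // Crude stand-in for a real stemmer: strips a few English suffixes.
    static String stem(String term) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (term.length() > suffix.length() + 2 && term.endsWith(suffix)) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    // Lowercase the text and split it on anything that is not a-z.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Terms as they would go into the non-stemmed field.
    static Set<String> exactTerms(String text) {
        return new HashSet<>(tokenize(text));
    }

    // Terms as they would go into the stemmed field.
    static Set<String> stemmedTerms(String text) {
        Set<String> terms = new HashSet<>();
        for (String t : tokenize(text)) {
            terms.add(stem(t));
        }
        return terms;
    }

    // Jaccard overlap of two term sets, used here as a simple similarity.
    static double overlap(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        Set<String> all = new HashSet<>(a);
        all.addAll(b);
        return all.isEmpty() ? 0.0 : (double) common.size() / all.size();
    }

    public static void main(String[] args) {
        String a = "posting the letters";
        String b = "posts and lettering";
        System.out.println("stemmed field overlap:     "
                + overlap(stemmedTerms(a), stemmedTerms(b)));
        System.out.println("non-stemmed field overlap: "
                + overlap(exactTerms(a), exactTerms(b)));
    }
}
```

In Lucene itself the same split would normally be done with a different 
analyzer per field (PerFieldAnalyzerWrapper is one way to do that), and 
MoreLikeThis can be pointed at the non-stemmed field with setFieldNames().  
The field names and analyzers to use depend on your setup.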

Daniel
