lucene-dev mailing list archives

From: Ype Kingma
Subject: Re: too many hits - OutOfMemoryError; Low frequency terms
Date: Fri, 30 May 2003 20:52:49 GMT

On Thursday 29 May 2003 14:30, Doug Cutting wrote:
> Ype Kingma wrote:
> > Terms that inadvertently have a low document frequency (spelling
> > errors for example) get a term relevancy in query execution that
> > is higher than they actually deserve.
> > This problem surfaces when term expansion results in such terms.
> > Is there a way in Lucene to give all expanded terms the same relevancy?
> You could override Similarity.idf(Term, Searcher), so that all query
> terms get the same weight.  Or, if you only wanted to apply this to

That seems to be overdoing it a bit.

> expanded queries, you could change your term expander so that each
> term's boost is set to 1/Similarity.idf(term, searcher), in order to
> cancel the effect of IDF for just the expanded terms.


I'll have a look at a change in term expansion.
Thanks a lot.
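
For reference, the arithmetic behind the 1/idf boost can be sketched without
calling Lucene at all. This is only an illustration: it assumes the idf
formula log(numDocs / (docFreq + 1)) + 1, and the class name, method names,
terms and document counts below are all made up for the example. How idf
enters the final score in a given Lucene version is not covered here.

```java
public class IdfBoostSketch {

    // Assumed idf formula (not taken from this thread):
    // idf = ln(numDocs / (docFreq + 1)) + 1
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Boost that cancels one idf factor for an expanded term: 1 / idf.
    static float cancellingBoost(int docFreq, int numDocs) {
        return 1.0f / idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        int numDocs = 1000;                          // hypothetical index size
        String[] terms = {"relevancy", "relevency"}; // second one is a misspelling
        int[] docFreqs = {500, 1};                   // hypothetical document frequencies

        for (int i = 0; i < terms.length; i++) {
            float idf = idf(docFreqs[i], numDocs);
            float boost = cancellingBoost(docFreqs[i], numDocs);
            // idf * boost is (up to rounding) the same for every term,
            // so the rare misspelling no longer outweighs the common term.
            System.out.printf("%s: idf=%.3f boost=%.3f idf*boost=%.3f%n",
                    terms[i], idf, boost, idf * boost);
        }
    }
}
```

In a term expander, the boost computed this way would be set on each expanded
term's query only, leaving the terms the user actually typed with their
normal idf weighting.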

