lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Help: tweaking search - reducing IDF skew and implementing score cutoff
Date Fri, 10 Feb 2006 07:56:21 GMT
: Sunday gets ranked highly due to idf. How do I reduce this skewness
: due to the date-posted field? I saw a reference earlier to
: ConstantScoreRangeQuery on JIRA - is it the solution?

Yes.  RangeQuery expands to a BooleanQuery containing all of the terms in
the range.  The number of terms (and the frequency of those terms in the
index) will always affect the resulting scores.  This is why I constantly
argue that when using dates or numbers a RangeQuery never makes sense --
always use a RangeFilter, and if you must have a "Query" object, use
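
To make the idf skew concrete, here is a toy model in Python, not Lucene
code. It assumes the classic idf formula 1 + ln(numDocs / (docFreq + 1)),
ignores tf, norms, coord and query normalization, and uses made-up
per-day document counts. A document matches only its own date term, so a
quiet day (a Sunday here) yields a rare term and a larger contribution.

```python
import math

def idf(doc_freq, num_docs):
    # Classic-style idf: rarer terms get a larger weight.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical index: number of documents posted on each day.
# 20060205 is a Sunday with few posts.
docs_per_day = {"20060203": 400, "20060204": 350, "20060205": 20}
num_docs = sum(docs_per_day.values())

def expanded_range_score(day):
    # Range expanded to one clause per date term; a document only
    # matches the clause for its own date. Contribution ~ idf squared
    # (tf = 1, norms ignored; a simplification).
    return idf(docs_per_day[day], num_docs) ** 2

def constant_range_score(day):
    # Filter-style matching: every document in the range gets the same
    # contribution, so the date cannot skew the ranking.
    return 1.0

for day in sorted(docs_per_day):
    print(day, round(expanded_range_score(day), 2), constant_range_score(day))
```

In this sketch the Sunday documents receive a much larger date
contribution than the weekday ones under the expanded query, and an
identical one under the constant-score variant.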

: 2. If I choose to sort the results by date, then recent documents with
: very very low relevancy (say the words searched appear only in
: content, and not in title/bylines/summary fields that are boosted
: higher) are still shown relatively high in the list, and I wish to
: omit them in general. What is the best way to implement some sort of a
: relevancy filter (include only documents with a normalized score of
: 0.2 or more....)? Or is there a better way around it?

There is no safe way to filter by score; this is mentioned in the FAQ...
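
One way to see the problem, as a toy Python sketch rather than Lucene
code: assume normalized scores are raw scores divided by the top hit's
raw score (an assumption for illustration), and take two made-up result
lists. The same raw score of 0.3 passes a 0.2 cutoff in one query and
fails it in the other, purely because of what the best hit scored.

```python
def normalize(raw_scores):
    # Assumed normalization: scale so that the top hit scores 1.0.
    top = max(raw_scores)
    return [s / top for s in raw_scores]

CUTOFF = 0.2  # the threshold proposed in the question

def apply_cutoff(raw_scores):
    # Keep the raw scores whose normalized value reaches the cutoff.
    return [raw for raw, norm in zip(raw_scores, normalize(raw_scores))
            if norm >= CUTOFF]

# Hypothetical raw scores for two different queries.
query_a = [0.9, 0.3, 0.2]
query_b = [4.0, 0.3, 0.2]

kept_a = apply_cutoff(query_a)  # all three hits survive
kept_b = apply_cutoff(query_b)  # only the top hit survives
print(kept_a, kept_b)
```

The cutoff therefore measures distance from the best hit of that
particular query, not any absolute notion of relevance.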

An alternate approach is to sort by score, but use something like a
FunctionQuery to inflate the scores of more recent documents...
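
A rough Python sketch of that idea, not the FunctionQuery API: the decay
function, the `scale` and `weight` parameters, and the sample documents
are all hypothetical. Relevance stays the main signal, and recency only
multiplies it, so a recent but weakly matching document cannot jump
ahead of strong matches.

```python
def recency_boost(age_days, scale=30.0):
    # 1.0 for a brand-new document, 0.5 at `scale` days old,
    # decaying toward 0 as the document ages.
    return scale / (scale + age_days)

def combined_score(relevance, age_days, weight=1.0):
    # Inflate the relevance score by a recency-dependent factor.
    return relevance * (1.0 + weight * recency_boost(age_days))

# Hypothetical hits: (id, relevance score, age in days).
docs = [
    ("old-strong", 0.9, 365),
    ("new-weak", 0.1, 1),
    ("new-strong", 0.8, 2),
]

ranked = sorted(docs, key=lambda d: combined_score(d[1], d[2]), reverse=True)
for doc_id, relevance, age in ranked:
    print(doc_id, round(combined_score(relevance, age), 3))
```

With these numbers the recent strong match ranks first, the old strong
match second, and the recent weak match last; raising `weight` shifts
the balance further toward recency.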

