lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Use a date field for ranking
Date Fri, 07 Jan 2005 18:39:14 GMT
: we are currently implementing a search engine for a news site. Our goal
: is to have a search result that uses the publish date of the documents
: to boost the score of the documents.

: have to use something that boosts the scores at _search_ time.

1) There is a way to boost individual Query objects (which you may then
compose into a Tree of BooleanQueries) see Query.setBoost(float)

2) if you are planning to rebuild your index on a regular basis (ie:
nightly) then you can easily apply boosts to your documets when you index
them.

If you want to be able to do only incrimental additions...

3) I'm sure there is a very cool and efficient way to do this using a
custom Similarity implimentation (which somhow causes the default score
to be divided by the age of the document) but i've never acctualy played
with the SImilarity class, so i won't say for certain it can be done that
way (hopefully someone else can chime in)

4) I can tell you what i cam up with when i was proof of concepting this a
while back...

In my case, I'm willing to accept that there is some finite granularity of
time at which "newer" documents are no longer very much more "fresh" then
"older" documents (ie: articles from the same week are equally "fresh" to
me) I also have a practicle cut off of how old things can get before they
are just plan old: 52 weeks.

With those numbers in mind, I can add a special field to each document
that indicates which week the article was published (ie: 2004w1, 2004w2,
2004w3, etc...).  At search time, my query can include a BooleanQuery of
52 clauses ORed together, each one containing the magic token for the last
52 weeks prio to when the search was execuded, each with a slightly
decreasing boost from the week before.





-Hoss

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message