lucene-java-user mailing list archives

From "Chong, Herb" <HCho...@bloomberg.com>
Subject RE: Dates and others
Date Mon, 01 Dec 2003 18:18:49 GMT
Ad hoc techniques run into lots of trouble because the requirement on Lucene isn't well specified.
Is a document that has only one of the search terms, but is a week newer, new enough to move
ahead of a document that has all of the search terms? The boost mechanism is a way to move
documents around in the ranking list, but it is clearly meant to reweight the importance of the
query terms, not to impose external constraints that properly should be handled outside the
search engine.

Herb...

-----Original Message-----
From: Doug Cutting [mailto:cutting@lucene.com]
Sent: Monday, December 01, 2003 1:11 PM
To: Lucene Users List
Subject: Re: Dates and others

The problem with this approach is that eventually you'll exhaust the 
range of the boost.  So this will only work if you re-index things from 
scratch periodically, with a boost of something like 1/days-ago.

If you're adding documents to the index in date order, then you could 
use a HitCollector which adjusts scores according to the document 
number, since document numbers increase as you add to the index.
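
A minimal sketch of that idea, written as plain Java rather than against the actual HitCollector API. The class name, the `weight` knob, and the linear recency formula are all assumptions made for illustration; the only thing taken from the text above is that a higher document number means a newer document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a hit collector; it does not use the Lucene API.
class DocNumberRecencyCollector {
    /** One collected hit: document number plus its adjusted score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int maxDoc;    // number of documents in the index
    private final float weight;  // how strongly recency counts (assumed tuning knob)
    private final List<Hit> hits = new ArrayList<>();

    DocNumberRecencyCollector(int maxDoc, float weight) {
        this.maxDoc = maxDoc;
        this.weight = weight;
    }

    // Called once per matching document with its raw score.
    void collect(int doc, float score) {
        // Recency falls in (0, 1]; larger means added to the index later.
        float recency = (doc + 1) / (float) maxDoc;
        hits.add(new Hit(doc, score * (1.0f + weight * recency)));
    }

    // Hits ordered by adjusted score, best first.
    List<Hit> ranked() {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Float.compare(b.score, a.score));
        return sorted;
    }
}
```

With a small `weight`, a strong match on an old document still outranks a weak match on a new one; how large the weight should be depends on the requirement.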

If you're not adding things in date order, then you can, when you open 
the index, build an array mapping document numbers to integer dates. 
Then your hit collector can use this to either boost or sort hits by date.
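
The sort-by-date variant might look roughly like this. It is a sketch in plain Java, not Lucene code: `datesByDoc` stands for the array built when the index is opened, and the yyyyMMdd integer encoding and the method name are assumptions.

```java
import java.util.Arrays;

// Hypothetical helper; the array would be filled once when the index is opened.
class DateOrderedHits {
    // datesByDoc[doc] is an integer date such as 20031201 (yyyyMMdd, an assumed encoding).
    // Returns the matching document numbers ordered newest first.
    static int[] sortNewestFirst(int[] matchingDocs, int[] datesByDoc) {
        Integer[] boxed = new Integer[matchingDocs.length];
        for (int i = 0; i < boxed.length; i++) {
            boxed[i] = matchingDocs[i];
        }
        // Compare by the date looked up for each document number, descending.
        Arrays.sort(boxed, (a, b) -> Integer.compare(datesByDoc[b], datesByDoc[a]));
        int[] out = new int[boxed.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = boxed[i];
        }
        return out;
    }
}
```

The same array could feed a score adjustment instead of a sort, by looking up `datesByDoc[doc]` inside the collector.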

Or you could add a "month" or "week" field to documents, then add it as 
a clause to your queries with a boost.  Then documents matching the most 
recent week(s) and/or month(s) would get the boost.
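
As a rough illustration of the "week" field approach, the sketch below builds a query string in the query parser's `field:term^boost` syntax. The field name `week`, the year-plus-ISO-week label format, and the helper names are assumptions; the same label would have to be stored, untokenized, on each document at index time.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.util.Locale;

// Hypothetical helper that only builds a query string; it does not call Lucene.
class WeekClause {
    // Label such as "200349": ISO week-based year followed by the two-digit week.
    static String weekLabel(LocalDate date) {
        int year = date.get(IsoFields.WEEK_BASED_YEAR);
        int week = date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
        return String.format(Locale.ROOT, "%d%02d", year, week);
    }

    // Requires the user's query and adds one optional, boosted clause per recent week.
    static String withRecencyClause(String userQuery, LocalDate today, int weeks, float boost) {
        StringBuilder sb = new StringBuilder("+(").append(userQuery).append(")");
        for (int i = 0; i < weeks; i++) {
            sb.append(" week:").append(weekLabel(today.minusWeeks(i)))
              .append("^").append(boost);
        }
        return sb.toString();
    }
}
```

Because the week clauses are optional, older documents still match; they simply miss the extra score contribution.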

Doug
