lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Date Boosting
Date Thu, 30 Mar 2006 19:09:44 GMT

On Mar 30, 2006, at 11:28 AM, Schwenker, Stephen wrote:
> Maybe I'm not being quite clear enough.  I'm not simply looking to  
> boost a field by a fixed amount, and it's not likely that the field  
> is going to match a word in the query because we won't be searching  
> for dates.  So that means that field's boost will not be taken  
> into consideration because there is no match.

At indexing time, you can boost a document and/or any of its fields.   
Those factors are all multiplied together to form a single  
document-level boost factor.  So a field boosted at indexing time would  
not need to be part of the query itself to boost found documents.

> For example,  All my documents have a publication date.  And I want  
> newer documents to be ranked slightly higher than older documents.   
> Say I am searching for the word "Lucene" and it returns a list of  
> 10 documents.  The third one was written today and the first and  
> second documents were written 2 months ago but get ranked slightly  
> higher because of their score.  Each of these documents has a date  
> field ("pubdate") with the following values:
>
> 20060130
> 20060210
> 20060329
>
> Now, I want to turn these dates into numbers, multiply them by a  
> factor, and add them to the total weight, e.g.
>
> Date -> (Numerical value in ms) / (Convert to days) * (daily  
> multiply factor) = (Boost Result)
> 20060130 -> 1138597200000 / (1000 * 60 * 60 * 24 ) * 0.00002 =  
> 0.26356416...
> 20060210 -> 1139547600000 / (1000 * 60 * 60 * 24 ) * 0.00002 =  
> 0.26378416...
> 20060329 -> 1143608400000 / (1000 * 60 * 60 * 24 ) * 0.00002 =  
> 0.26472416...
>
> This is the current equation I'm hoping to use but I haven't quite  
> worked it out.  If I can add the boost result to the final score  
> then I'm hoping more recent articles will get a slightly higher  
> ranking.
>
> I hope you understand what I'm trying to accomplish and maybe you  
> can help me figure out where I should look.
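
The arithmetic quoted above can be checked with a few lines of plain Java. This is only a sketch of the proposed formula, not Lucene code: the class and method names are made up for illustration, and the inputs are the millisecond values given in the message.

```java
public class DateBoostArithmetic {
    // Milliseconds in one day, as written in the quoted formula.
    static final long MILLIS_PER_DAY = 1000L * 60 * 60 * 24;
    // The "daily multiply factor" from the quoted message.
    static final double DAILY_FACTOR = 0.00002;

    // (milliseconds since epoch) / (milliseconds per day) * (daily factor)
    static double boost(long epochMillis) {
        return (double) epochMillis / MILLIS_PER_DAY * DAILY_FACTOR;
    }

    public static void main(String[] args) {
        // Values for 20060130, 20060210 and 20060329 from the message.
        long[] pubdates = {1138597200000L, 1139547600000L, 1143608400000L};
        for (long ms : pubdates) {
            System.out.println(ms + " -> " + boost(ms));
        }
    }
}
```

The three results match the figures in the message. The oldest and newest documents are 58 days apart, so their boosts differ by about 58 * 0.00002 = 0.00116; whether that is enough to reorder results depends on how close the underlying scores are.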


Since you need to factor in something based on the difference  
between *today* and the publication date, I think FunctionQuery  
is perhaps what you're after.  It is part of Solr currently:

	<http://incubator.apache.org/solr/docs/api/org/apache/solr/search/function/FunctionQuery.html>
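
To make "based on the difference between today and the publication date" concrete, here is a minimal sketch in plain Java. It does not use FunctionQuery or any Solr/Lucene API; the names recencyBoost, maxBoost and halfLifeDays are hypothetical, and the reciprocal decay is just one possible choice of function. It assumes dates are stored as yyyyMMdd strings, as in the pubdate examples.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class RecencyBoost {
    // Parses dates stored as yyyyMMdd, e.g. "20060329".
    static final DateTimeFormatter YYYYMMDD = DateTimeFormatter.BASIC_ISO_DATE;

    // Returns maxBoost for a document published today, half of maxBoost
    // for one that is halfLifeDays old, and less as the document ages.
    static double recencyBoost(String pubdate, String today,
                               double maxBoost, double halfLifeDays) {
        LocalDate published = LocalDate.parse(pubdate, YYYYMMDD);
        LocalDate now = LocalDate.parse(today, YYYYMMDD);
        // Treat documents dated in the future as published today.
        long ageDays = Math.max(0, ChronoUnit.DAYS.between(published, now));
        return maxBoost * halfLifeDays / (halfLifeDays + ageDays);
    }

    public static void main(String[] args) {
        String today = "20060330";
        for (String d : new String[] {"20060130", "20060210", "20060329"}) {
            System.out.println(d + " -> " + recencyBoost(d, today, 0.1, 30.0));
        }
    }
}
```

Because the value depends on the document's age rather than on its absolute date, it has to be computed at query time, which is why an index-time boost alone does not cover this case.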

Erik




>
> Thank you,
>
>
> Steve.
>
>
>
> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> Sent: Thursday, March 30, 2006 9:51 AM
> To: java-dev@lucene.apache.org
> Subject: Re: Date Boosting
>
>
>
> On Mar 30, 2006, at 8:50 AM, Schwenker, Stephen wrote:
>> I'm new to Lucene and I want to make a query to dynamically boost a
>> document slightly based on a date field.  I'm not sure which
>> classes are used to calculate the boost, so I wanted to ask which
>> classes I should extend to accomplish this?  I'm just asking so I
>> can get to the job faster.  I don't want to waste my time looking
>> in places I don't need to.
>
> Extending classes is not necessary.  To boost a date field you can
> simply call Field.setBoost().   Use IndexSearcher.explain() to see
> how your boosts affect scoring.
>
> Since you might need dynamic data boosting, perhaps the new
> FunctionQuery would be more what you're after though?
>
> 	Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

