lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@apache.org>
Subject Re: Javadoc error?
Date Thu, 24 Feb 2005 04:47:55 GMT
Mark Woon wrote:
> The javadoc for Field.setBoost() claims:
> "The boost is multiplied by |Document.getBoost()| 
> <http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/document/Document.html#getBoost%28%29>

> of the document containing this field. If a document has multiple fields 
> with the same name, all such values are multiplied together."
> 
> However, from what I can tell from IndexSearcher.explain(), multiple 
> fields with the same name have their boost values added together.  It 
> might very well be that I'm misinterprating what I'm seeing from 
> explain(), but if I'm not, then either the javadoc is wrong or there's a 
> bug somewhere...
> 
> Does anyone know which way it's actually supposed to work?

Boosts for multiple fields with the same name in the a document are 
multiplied together at index time to form the boost for that field of 
that document.  At search time, if multiple query terms from the same 
field match the same document, then that document's field boost is 
multiplied into the score for both terms, and these scores are then 
added.  If boost(field,doc) is the boost, and raw(term,doc) is the raw, 
unboosted score (I'm simplifying things) then the score for a two term 
query is something like:

   boosted(<t1,t2>,d) =
     boost(t1.field,d)*raw(t1,d) + boost(t2.field,d)*raw(t2,d)

which, when t1 and t2 are in the same field, is equivalent to:

   boosted(<t1,t2>,d) = boost(field,d)*(raw(t1,d) + raw(t2,d))

The explain() feature prints things in the first form, where the boosts 
appear in separate components of a sum.

Does that help?

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message