lucene-dev mailing list archives

From "Jordi Salvat i Alabart (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3957) Document precision requirements of setBoost calls
Date Wed, 18 Apr 2012 17:38:40 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256736#comment-13256736 ]

Jordi Salvat i Alabart commented on LUCENE-3957:
------------------------------------------------

Wherever, but please increase the exposure of this detail, for the sake of those who come
to it later.

It's a VERY long and winding road from setting a $docBoost in a Solr dataSource transformer
script to the current location of this information. That would be just about acceptable if the
value were held with a couple of decimal digits of precision, but the default (3-bit mantissa)
implementation doesn't hold even ONE digit, which makes the behaviour really shocking, especially
since the boost value is not directly visible, but only through its impact on search scores.
                
> Document precision requirements of setBoost calls
> -------------------------------------------------
>
>                 Key: LUCENE-3957
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3957
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: general/javadocs
>    Affects Versions: 3.5
>            Reporter: Jordi Salvat i Alabart
>
> The behaviour of index-time boosts seems pretty erratic (e.g. a boost of 8.0 produces
> the exact same score as a boost of 9.0) until you become aware that these factors end up encoded
> in a single byte, with a three-bit mantissa. This consumed a whole day of research for us,
> and I still believe we were lucky to spot it, given how deeply dug into the code & documentation
> this information is.
> I suggest adding a small note to the JavaDoc of setBoost methods in Document, Fieldable,
> FieldInvertState, and possibly AbstractField, Field, and NumericField.
> Suggested text:
> "Note that all index-time boost values end up encoded using Similarity.encodeNormValue,
> with a 3-bit mantissa -- so differences in the boost value of less than 25% may easily be
> rounded away."
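
To make the rounding described above concrete, here is a minimal, self-contained sketch of a
one-byte float encoding. It is modelled on my recollection of the floatToByte315 /
byte315ToFloat pair in Lucene's SmallFloat utility, which the default encodeNormValue is
understood to delegate to; treat the bit layout and constants as assumptions rather than as the
authoritative implementation. The class name BoostPrecisionSketch and the helper roundTrip are
hypothetical. The sketch also encodes the boost on its own, whereas the value actually stored
at index time is normally the boost multiplied by a length norm.

```java
// Hypothetical illustration only; not Lucene source code.
public class BoostPrecisionSketch {

    // Offset between the float exponent and the byte's exponent range
    // (assumed: zero exponent 15, 3 low-order bits kept below it).
    private static final int FZERO = (63 - 15) << 3;

    // Squeeze a float into one byte by dropping the low 21 bits of its
    // raw representation. Everything below that cut is truncated away.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);
        if (small <= FZERO) {
            // Zero and negatives map to 0; tiny positives map to the
            // smallest non-zero code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= FZERO + 0x100) {
            return (byte) -1; // overflow: clamp to the largest code
        }
        return (byte) (small - FZERO);
    }

    // Expand the byte back into a float; the truncated bits come back as 0.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // The value a boost effectively becomes after being stored and read back.
    static float roundTrip(float boost) {
        return decode(encode(boost));
    }

    public static void main(String[] args) {
        float[] boosts = {1.0f, 1.2f, 8.0f, 9.0f, 10.0f};
        for (float boost : boosts) {
            System.out.println(boost + " -> byte " + (encode(boost) & 0xff)
                    + " -> " + roundTrip(boost));
        }
    }
}
```

Under these assumptions, 8.0 and 9.0 are encoded as the same byte and both decode to 8.0,
while 10.0 survives the round trip, which matches the behaviour reported in this issue.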
