lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] Commented: (LUCENE-517) norm compression breaks ranking for small fields
Date Fri, 10 Mar 2006 21:50:57 GMT

Yonik Seeley commented on LUCENE-517:

Yes, the error bars seem rather large for the normal usage of norms, which is just length
normalization if you don't include boosts.  You could still use a single byte, but dedicate
more of its bits to the mantissa to get better resolution (at the cost of less range).

You could easily make that change for your own index, but it would break existing indexes if we
changed the default in Lucene.
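
To make the resolution/range trade-off concrete, here is a minimal sketch of a one-byte
float codec with a configurable mantissa width.  It is not Lucene's actual encoding: the
bit layout, the class name NormSketch, the encode/decode/lengthNorm helpers and the
expBias parameter are all assumptions made up for illustration, and the length norm is
assumed to be 1/sqrt(numTerms) with no boosts.

```java
// Illustrative one-byte float codec; NOT Lucene's actual implementation.
final class NormSketch {

    // Encode a float into one byte laid out as (8 - mantissaBits) exponent bits
    // followed by mantissaBits mantissa bits. Code 0 is reserved for f <= 0.
    static byte encode(float f, int mantissaBits, int expBias) {
        int bits = Float.floatToIntBits(f);
        int small = bits >> (23 - mantissaBits);          // truncate the low mantissa bits
        int base = (127 - expBias) << mantissaBits;       // value that would map to code 0
        if (small <= base) return (byte) (f > 0 ? 1 : 0); // underflow: smallest positive code
        if (small >= base + 256) return (byte) 255;       // overflow: largest code
        return (byte) (small - base);
    }

    // Inverse of encode, up to the precision lost by truncation.
    static float decode(byte b, int mantissaBits, int expBias) {
        if (b == 0) return 0.0f;
        int base = (127 - expBias) << mantissaBits;
        return Float.intBitsToFloat(((b & 0xff) + base) << (23 - mantissaBits));
    }

    // Assumed length normalization: 1/sqrt(number of terms), no boosts.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            float norm = lengthNorm(n);
            float coarse = decode(encode(norm, 3, 15), 3, 15); // 3 mantissa bits, 5 exponent bits
            float fine = decode(encode(norm, 5, 7), 5, 7);     // 5 mantissa bits, 3 exponent bits
            System.out.println(n + " terms: exact=" + norm
                    + " 3-bit=" + coarse + " 5-bit=" + fine);
        }
    }
}
```

In this sketch, fields of 9 and 10 terms both decode to 0.3125 with 3 mantissa bits but stay
distinct with 5.  The price is range: with 5 mantissa bits and expBias 7 the largest
representable value is just under 2.0, which leaves little headroom for boosts.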

> norm compression breaks ranking for small fields
> ------------------------------------------------
>          Key: LUCENE-517
>          URL:
>      Project: Lucene - Java
>         Type: Bug
>   Components: Index, Search
>     Versions: 1.9
>  Environment: N/A
>     Reporter: Randy Puttick

> The scheme of compressing document norms to one byte loses a lot of information.  This
> completely breaks search ranking on small fields because there is no way to see the difference
> between documents with shorter and longer fields that contain the same number of matching
> query terms.  Unfortunately the export of norms as a byte array seems to be pretty well embedded
> in the code base so a fix would seem to require a major rev.

This message is automatically generated by JIRA.