lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-7475) Sparse norms
Date Thu, 06 Oct 2016 10:34:20 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551581#comment-15551581 ]

Michael McCandless commented on LUCENE-7475:
--------------------------------------------

This is a great change.  I would almost call it a "bug" fix, in that
norms iteration now never visits a document that did not have that
field.  It is as if we had added {{docsWithField}} to norms in the
past.

So even if only 1 doc out of zillions is missing the value, we use the
sparse form.  We can improve how we encode it in future issues.

And of course for very sparse fields, it will be a big win ("pay for
what you actually use", like postings and (nearly) stored fields).
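
To make the "pay for what you actually use" point concrete, here is a
minimal, self-contained sketch of the iteration pattern.  It does not
use Lucene's classes: {{SparseNormsSketch}}, {{SparseNorms}} and
{{sumAndCount}} are hypothetical names, and the two parallel arrays are
only an assumed in-memory layout, not the encoding used by the patch.
The only idea taken from the issue is that the iterator never lands on
a document that lacks the field.

```java
/** Simplified stand-in for a sparse norms iterator; NOT Lucene's actual classes. */
final class SparseNormsSketch {
    /** Sentinel returned once the iterator is exhausted (hypothetical constant). */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Iterates only over the documents that actually have a norm for the field. */
    static final class SparseNorms {
        private final int[] docs;   // sorted doc IDs that have the field
        private final long[] norms; // norms[i] is the norm of docs[i]
        private int idx = -1;

        SparseNorms(int[] docs, long[] norms) {
            if (docs.length != norms.length) {
                throw new IllegalArgumentException("docs and norms must have the same length");
            }
            this.docs = docs;
            this.norms = norms;
        }

        /** Moves to the next document that has the field, or returns NO_MORE_DOCS. */
        int nextDoc() {
            if (idx < docs.length) {
                idx++;
            }
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        /** Norm of the current document; only valid after nextDoc() returned a real doc ID. */
        long longValue() {
            return norms[idx];
        }
    }

    /** Returns {sum of norms, number of documents visited}. */
    static long[] sumAndCount(SparseNorms it) {
        long sum = 0;
        long visited = 0;
        for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            sum += it.longValue();
            visited++;
        }
        return new long[] {sum, visited};
    }

    public static void main(String[] args) {
        // Assume a segment with a million documents where only three have the field.
        SparseNorms norms = new SparseNorms(new int[] {2, 40, 999_999}, new long[] {7, 3, 12});
        long[] result = sumAndCount(norms);
        System.out.println("sum=" + result[0] + " visited=" + result[1]); // prints "sum=22 visited=3"
    }
}
```

The loop does work proportional to the three documents that have the
field, not to the size of the segment, and a consumer never sees a
placeholder 0 for a document without the field.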

I saw some minor things:

  * In {{Lucene70NormsProducer}} you can use
    {{DocValues.emptyNumeric}} instead of making your own?

  * You can let {{longValue}} directly throw {{IOException}} now, in
    {{Lucene70NormsProducer}} (it's still re-throwing as
    {{RuntimeException}} in a few places).
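
The second point can be sketched in isolation.  This is not the code
from the patch: {{Slice}}, {{Values}}, {{wrapping}} and {{propagating}}
are hypothetical names used only to contrast re-throwing as
{{RuntimeException}} with letting a {{longValue}} that is declared to
throw {{IOException}} propagate it directly.

```java
import java.io.IOException;

/** Illustrative only; these types are hypothetical, not Lucene's. */
final class CheckedLongValueSketch {
    /** Hypothetical random-access source whose reads can fail with IOException. */
    interface Slice {
        long readLong(int index) throws IOException;
    }

    /** Hypothetical values API where longValue() is allowed to throw IOException. */
    interface Values {
        long longValue() throws IOException;
    }

    /** Older style: the checked exception is caught and re-thrown unchecked. */
    static Values wrapping(Slice slice, int index) {
        return () -> {
            try {
                return slice.readLong(index);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        };
    }

    /** Simpler style: longValue() may throw IOException, so no wrapping is needed. */
    static Values propagating(Slice slice, int index) {
        return () -> slice.readLong(index);
    }

    public static void main(String[] args) throws IOException {
        Slice ok = index -> 42L + index;
        System.out.println(propagating(ok, 0).longValue()); // prints 42
        System.out.println(wrapping(ok, 1).longValue());    // prints 43
    }
}
```

With the propagating form, callers handle a single exception type and
the original stack trace is not buried inside a wrapper.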

The test improvements are wonderful.

+1 to push!


> Sparse norms
> ------------
>
>                 Key: LUCENE-7475
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7475
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>             Fix For: master (7.0)
>
>         Attachments: LUCENE-7475.patch, LUCENE-7475.patch
>
>
> Even though norms now have an iterator API, they are still always dense in practice since
> documents that do not have a value get assigned 0 as a norm value.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


