lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5122) DiskDV probably shouldnt use BlockPackedReader for SortedDV doc-to-ord
Date Fri, 19 Jul 2013 13:54:49 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713670#comment-13713670 ]

Adrien Grand commented on LUCENE-5122:
--------------------------------------

For SortingMP, we only provide the ability to sort by a NumericDocValues field out of the
box, because numbers feel like a more natural way to define a static rank.

Maybe another case where BlockPackedReader could help is if almost all documents have the
same value. In that case BlockPackedReader will be able to require 0 bits per value for all
blocks that contain a single unique value.

But I agree PackedInts would likely be better in general and would remove one level of indirection.
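
To make the trade-off concrete, here is a minimal Java sketch that compares the payload size of a single-width packed encoding with a block-wise encoding that stores each value as a delta from its block's minimum. It is an illustration only, not Lucene's actual implementation: the class and method names (BlockBitsSketch, plainBits, blockedBits) are hypothetical, and it ignores the per-block header that a real block-wise format would also have to store.

```java
import java.util.Random;

public class BlockBitsSketch {

    // Bits needed to represent any value in [0, maxValue]; 0 when maxValue == 0.
    public static int bitsRequired(long maxValue) {
        return 64 - Long.numberOfLeadingZeros(maxValue);
    }

    // Single-width packing: one bit width for the whole array,
    // driven by the largest ordinal.
    public static long plainBits(long[] ords) {
        long max = 0;
        for (long ord : ords) {
            max = Math.max(max, ord);
        }
        return (long) ords.length * bitsRequired(max);
    }

    // Block-wise packing (assumed scheme): each block encodes deltas from its
    // own minimum, so its width is bitsRequired(max - min) for that block.
    // Per-block header overhead is deliberately not counted here.
    public static long blockedBits(long[] ords, int blockSize) {
        long total = 0;
        for (int start = 0; start < ords.length; start += blockSize) {
            int end = Math.min(start + blockSize, ords.length);
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (int i = start; i < end; i++) {
                min = Math.min(min, ords[i]);
                max = Math.max(max, ords[i]);
            }
            total += (long) (end - start) * bitsRequired(max - min);
        }
        return total;
    }

    public static void main(String[] args) {
        int numDocs = 1 << 16;
        int blockSize = 64;

        // Case 1: essentially random ordinals, as assumed in the issue.
        long[] random = new long[numDocs];
        Random rnd = new Random(42);
        for (int i = 0; i < numDocs; i++) {
            random[i] = rnd.nextInt(1000);
        }

        // Case 2: every document has the same ordinal.
        long[] constant = new long[numDocs];
        for (int i = 0; i < numDocs; i++) {
            constant[i] = 5;
        }

        System.out.println("random:   plain=" + plainBits(random)
            + " blocked=" + blockedBits(random, blockSize));
        System.out.println("constant: plain=" + plainBits(constant)
            + " blocked=" + blockedBits(constant, blockSize));
    }
}
```

With random ordinals, nearly every block spans most of the value range, so blocking saves little and still pays for the extra indirection. When a block holds a single unique value, its delta range is zero and the sketch charges 0 bits per value for it.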
                
> DiskDV probably shouldnt use BlockPackedReader for SortedDV doc-to-ord
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-5122
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5122
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>
> I don't think "blocking" provides any benefit here in general. We can assume the ordinals
> are essentially random, and since SortedDV is single-valued, it's probably better to just use
> the simpler PackedInts directly?
> I guess the only case where it would help is if you sorted your segments by that DV field.
> But it seems kind of weird/esoteric to sort your index by a deref'ed string value; e.g., I
> don't think it's even supported by SortingMP.
> For the SortedSet "ord stream", this can exceed 2B values, so for now I think it should
> stay as BlockPackedReader. But it could use a large block size...


