lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4764) Faster but more RAM/Disk consuming DocValuesFormat for facets
Date Sun, 10 Feb 2013 13:27:12 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575436#comment-13575436 ]

Shai Erera commented on LUCENE-4764:
------------------------------------

I think 30% more RAM is OK: either you will have enough RAM on the machine, or that 30%
won't make a big difference (for really large indexes). What bothers me is that there's no
way to do this out of the box, not with how facets are indexed today. E.g., if facets were
in core, we could modify IWC (IndexWriterConfig) to detect when facets are used (e.g.
isEnableFacets) and then create the optimized Codec for them...

And the problem is that, unlike a yes/no caching decision, facets are loaded into RAM by
default here; we just offer a better way to load them. I think that if we can find a
justification for a FacetsCodec in general, then we could put such optimizations into it
and tell users that if they want to index facets, they should work with that Codec...

Or we can just leave it as-is and document somewhere that you might want to consider that
DV format, which gives faster search at the expense of more RAM.
                
> Faster but more RAM/Disk consuming DocValuesFormat for facets
> -------------------------------------------------------------
>
>                 Key: LUCENE-4764
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4764
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.2, 5.0
>
>         Attachments: LUCENE-4764.patch
>
>
> The new default DV format for binary fields has much more
> RAM-efficient encoding of the address for each document ... but it's
> also a bit slower at decode time, which affects facets because we
> decode for every collected docID.
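
As a rough illustration of the trade-off in the issue summary above, here is a toy Java
sketch. It is not Lucene's actual encoding, and every class and method name in it is
hypothetical. It stores per-document end offsets into a shared byte blob in two ways: one
plain int per document (more RAM, a single array read to decode) and bit-packed values
(less RAM, extra shifts and masks on every lookup).

```java
public class AddressEncodingSketch {

    /** Fast path: one int per document, decoded with a plain array read. */
    static final class PlainAddresses {
        private final int[] ends;

        PlainAddresses(int[] ends) {
            this.ends = ends.clone();
        }

        int get(int doc) {
            return ends[doc];
        }

        long ramBytes() {
            return 4L * ends.length;
        }
    }

    /** Compact path: each address uses only as many bits as the largest one needs. */
    static final class PackedAddresses {
        private final long[] blocks;
        private final int bitsPerValue;
        private final long mask;

        /** Assumes all addresses are non-negative. */
        PackedAddresses(int[] ends) {
            int max = 0;
            for (int e : ends) {
                max = Math.max(max, e);
            }
            bitsPerValue = Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
            mask = (1L << bitsPerValue) - 1;
            blocks = new long[(int) (((long) ends.length * bitsPerValue + 63) / 64)];
            for (int i = 0; i < ends.length; i++) {
                long bitPos = (long) i * bitsPerValue;
                int block = (int) (bitPos >>> 6);
                int shift = (int) (bitPos & 63);
                blocks[block] |= ((long) ends[i]) << shift;
                // A value may straddle two 64-bit blocks; spill the high bits over.
                if (shift + bitsPerValue > 64) {
                    blocks[block + 1] |= ((long) ends[i]) >>> (64 - shift);
                }
            }
        }

        int get(int doc) {
            long bitPos = (long) doc * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long value = blocks[block] >>> shift;
            if (shift + bitsPerValue > 64) {
                value |= blocks[block + 1] << (64 - shift);
            }
            return (int) (value & mask);
        }

        long ramBytes() {
            return 8L * blocks.length;
        }
    }

    public static void main(String[] args) {
        int[] ends = {3, 7, 7, 20, 1000};
        PlainAddresses plain = new PlainAddresses(ends);
        PackedAddresses packed = new PackedAddresses(ends);
        for (int doc = 0; doc < ends.length; doc++) {
            System.out.println(doc + ": plain=" + plain.get(doc) + " packed=" + packed.get(doc));
        }
        System.out.println("plain bytes=" + plain.ramBytes() + ", packed bytes=" + packed.ramBytes());
    }
}
```

Both stores return the same addresses; the packed one pays a few extra operations on each
lookup, and that cost adds up when a lookup happens for every collected docID.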


