lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] [Commented] (LUCENE-5425) Make creation of FixedBitSet in FacetsCollector overridable
Date Sat, 01 Feb 2014 04:08:09 GMT


Shai Erera commented on LUCENE-5425:

If we need to change the API at all, it has to expose a DocIdSet and not an iterator, because
an iterator takes away one layer that could be useful (for example, a specialized implementation
that uses an instanceof FixedBitSet check, which is what I think Rob suggests).
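
To illustrate the point about the lost layer, here is a minimal sketch. It does not use Lucene classes: DocSet, BitDocSet, RangeDocSet and Aggregator are hypothetical stand-ins, and java.util.BitSet plays the role of FixedBitSet. The consumer can only take a specialized path when it is handed the set itself; if it were handed only an iterator, the instanceof check would not be possible.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Hypothetical stand-in for a set-level abstraction (not the Lucene DocIdSet API).
interface DocSet {
    PrimitiveIterator.OfInt iterator();
}

// Bit-set-backed implementation; java.util.BitSet stands in for FixedBitSet.
final class BitDocSet implements DocSet {
    final BitSet bits;

    BitDocSet(BitSet bits) {
        this.bits = bits;
    }

    @Override
    public PrimitiveIterator.OfInt iterator() {
        return bits.stream().iterator();
    }
}

// Some other implementation that only offers iteration.
final class RangeDocSet implements DocSet {
    private final int from;
    private final int to;

    RangeDocSet(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public PrimitiveIterator.OfInt iterator() {
        return IntStream.range(from, to).iterator();
    }
}

final class Aggregator {
    // Counts matching docs, specializing when the concrete type is known.
    static int countMatches(DocSet set) {
        if (set instanceof BitDocSet) {
            // Fast path: only available because the set, not just its iterator, was passed in.
            return ((BitDocSet) set).bits.cardinality();
        }
        // Generic path: walk the iterator one doc at a time.
        int count = 0;
        PrimitiveIterator.OfInt it = set.iterator();
        while (it.hasNext()) {
            it.nextInt();
            count++;
        }
        return count;
    }
}
```

Whether such a fast path pays off in practice is exactly what the benchmarking mentioned below would need to show.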

But, John, didn't we say we should explore the move to a DocIdSet-based API in a separate
issue where we also benchmark the implications of using all of those abstractions (both at
collection and aggregation phases)? This issue was supposed to be about letting you cache
the FBS instance.

I don't think we should commit this patch. This issue should allow you to reuse a FixedBitSet.
A separate issue should benchmark the move to a more general API. I want to be sure that whatever
abstractions we add do not hurt faceted search, or at least that we note by how much and why
they are worth it. For instance, if we move to a DocIdSet API where none of the Lucene sets
improves faceted search over FixedBitSet, I don't think we should do it...

So John, please revert to the simple patch w/ the protected method on FacetsCollector
(and on trunk-based code) so we can do a final review and commit it. And please open a separate
issue to explore using a DocIdSet in MatchingDocs instead of FixedBitSet. Thanks!
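
For readers following along, the "protected method" approach could look roughly like the sketch below. This is not the attached patch and not the real FacetsCollector API: SketchCollector, ReusingCollector, createBits and collect are hypothetical names, and java.util.BitSet stands in for FixedBitSet. The idea is only that the collector allocates its bits through an overridable hook, so a subclass can hand back a cached instance.

```java
import java.util.BitSet;

// Hypothetical stand-in for a collector; not the Lucene FacetsCollector API.
class SketchCollector {
    // Allocation hook: the default allocates a fresh bit set on every call.
    protected BitSet createBits(int maxDoc) {
        return new BitSet(maxDoc);
    }

    // Records the given doc ids in a bit set obtained from the hook.
    final BitSet collect(int maxDoc, int... docs) {
        BitSet bits = createBits(maxDoc);
        for (int doc : docs) {
            bits.set(doc);
        }
        return bits;
    }
}

// Subclass that reuses one cached bit set instead of allocating per query.
class ReusingCollector extends SketchCollector {
    private BitSet cached;
    private int cachedCapacity;

    @Override
    protected BitSet createBits(int maxDoc) {
        if (cached == null || cachedCapacity < maxDoc) {
            // First use, or the cached set is too small: allocate once.
            cached = new BitSet(maxDoc);
            cachedCapacity = maxDoc;
        } else {
            // Reuse: wipe the previous query's bits.
            cached.clear();
        }
        return cached;
    }
}
```

Note that in this sketch reuse is only safe when the previous result is no longer being read, since the same instance is cleared and returned again.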

> Make creation of FixedBitSet in FacetsCollector overridable
> -----------------------------------------------------------
>                 Key: LUCENE-5425
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 4.6
>            Reporter: John Wang
>         Attachments: facetscollector.patch, facetscollector.patch
> In FacetsCollector, the bits in MatchingDocs are allocated per query. For large
> indexes where maxDoc is large, creating a bitset of maxDoc bits will be expensive and would
> create a lot of garbage.
> The attached patch makes this allocation customizable while maintaining current
