lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4769) Add a CountingFacetsAggregator which reads ordinals from a cache
Date Tue, 12 Feb 2013 13:29:13 GMT


Shai Erera commented on LUCENE-4769:

bq. Sometimes I wonder if you are having this argument with me to avoid a single type cast
in the facets codebase or for some other cosmetic reason

Not at all. I'm looking for the cleanest solution. I'm just *much* less familiar than you
when it comes to Codecs and Formats, and therefore I fail to think of ways to piggy-back (abuse?
:)) them for facets.

I certainly think your FacetsBDV idea is good. I don't mind casting if it will help the API.
Basically .getInts() is the API of CategoryListIterator today ... maybe we can nuke CLI in
favor of FacetsBDV? Definitely worth looking at. Let's explore that in a separate issue though.
So according to the results on LUCENE-4764, we don't need to do any specialization to get
the bytes of a certain document. And in the future, with FacetsBDV, we won't need CachedInts
as an aggregator but as another Codec, and again we won't need to specialize (I hope!).

And if we ever develop a custom IR for facets, we can add the .getInts API higher up the chain,
and not necessarily depend on DocValues.
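
To make the FacetsBDV idea above a bit more concrete, here is a rough sketch of a doc-values
source that exposes a .getInts() view next to the raw bytes, so the consumer pays one cast
and then never cares whether the ordinals were decoded or cached. Everything in it is an
assumption for illustration: BinaryValues, FacetsValues, ByteBacked and CacheBacked are
hypothetical stand-ins, not Lucene classes, and plain VInt encoding of the ordinals is
assumed only to keep the example short.

```java
import java.util.Arrays;

// Hypothetical sketch: none of these types are Lucene classes.
final class FacetsValuesSketch {

    /** Stand-in for a per-document binary doc-values source. */
    static abstract class BinaryValues {
        abstract byte[] get(int docID);
    }

    /** Stand-in for the proposed FacetsBDV: the bytes API plus an ints view. */
    static abstract class FacetsValues extends BinaryValues {
        /** Default path: decode the document's bytes as consecutive VInts (assumed encoding). */
        int[] getInts(int docID) {
            byte[] bytes = get(docID);
            int[] out = new int[bytes.length]; // at most one ordinal per byte
            int n = 0, value = 0, shift = 0;
            for (byte b : bytes) {
                value |= (b & 0x7F) << shift;
                if ((b & 0x80) == 0) { // high bit clear: last byte of this VInt
                    out[n++] = value;
                    value = 0;
                    shift = 0;
                } else {
                    shift += 7;
                }
            }
            return Arrays.copyOf(out, n);
        }
    }

    /** Bytes-backed source: relies on the default decoding path. */
    static final class ByteBacked extends FacetsValues {
        private final byte[][] perDoc;
        ByteBacked(byte[][] perDoc) { this.perDoc = perDoc; }
        @Override byte[] get(int docID) { return perDoc[docID]; }
    }

    /** Cache-backed source: hands back already-decoded ordinals. */
    static final class CacheBacked extends FacetsValues {
        private final int[][] perDoc;
        CacheBacked(int[][] perDoc) { this.perDoc = perDoc; }
        @Override byte[] get(int docID) { throw new UnsupportedOperationException("ints only"); }
        @Override int[] getInts(int docID) { return perDoc[docID]; }
    }

    /** The single cast a consumer would pay to reach the ints API. */
    static int[] ordinals(BinaryValues values, int docID) {
        if (!(values instanceof FacetsValues)) {
            throw new IllegalArgumentException("not a facets-aware source");
        }
        return ((FacetsValues) values).getInts(docID);
    }

    public static void main(String[] args) {
        BinaryValues raw = new ByteBacked(new byte[][] { {0x05, (byte) 0xAC, 0x02} });
        BinaryValues cached = new CacheBacked(new int[][] { {5, 300} });
        System.out.println(Arrays.toString(ordinals(raw, 0)));    // [5, 300]
        System.out.println(Arrays.toString(ordinals(cached, 0))); // [5, 300]
    }
}
```

In this sketch the aggregator only ever calls ordinals(...); swapping the bytes-backed source
for the cache-backed one requires no specialized aggregator, which is the property hoped for
above.
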
> Add a CountingFacetsAggregator which reads ordinals from a cache
> ----------------------------------------------------------------
>                 Key: LUCENE-4769
>                 URL:
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-4769.patch, LUCENE-4769.patch
> Mike wrote a prototype of a FacetsCollector which reads ordinals from a CachedInts structure
on LUCENE-4609. I ported it to the new facets API, as a FacetsAggregator. I think we should
offer users the means to use such a cache, even if it consumes more RAM. Mike's tests show that
this cache consumed 2x more RAM than if the DocValues were loaded into memory in their raw
form. Also, a PackedInts version of such a cache took almost the same amount of RAM as a straight
int[], but the gains were minor.
> I will post the patch shortly.
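
For readers unfamiliar with the approach described above, the following is a minimal sketch
of counting facet ordinals out of an in-memory int[] cache instead of decoding them from
DocValues bytes for every matching document. It is not the attached patch: CachedOrds and
count(...) are hypothetical names, and the flat offsets-plus-ordinals layout is an assumption
made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch: not Lucene's CachedInts or FacetsAggregator.
final class CachedOrdsCountingSketch {

    /** Flat cache: the ordinals of doc d are ords[offsets[d] .. offsets[d + 1]). */
    static final class CachedOrds {
        final int[] offsets; // length = numDocs + 1
        final int[] ords;    // all ordinals of all documents, concatenated

        CachedOrds(int[][] perDocOrds) {
            offsets = new int[perDocOrds.length + 1];
            int total = 0;
            for (int doc = 0; doc < perDocOrds.length; doc++) {
                offsets[doc] = total;
                total += perDocOrds[doc].length;
            }
            offsets[perDocOrds.length] = total;
            ords = new int[total];
            for (int doc = 0; doc < perDocOrds.length; doc++) {
                System.arraycopy(perDocOrds[doc], 0, ords, offsets[doc], perDocOrds[doc].length);
            }
        }
    }

    /** Counts every ordinal of every matching document; no byte decoding on the hot path. */
    static int[] count(CachedOrds cache, int[] matchingDocs, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : matchingDocs) {
            int end = cache.offsets[doc + 1];
            for (int i = cache.offsets[doc]; i < end; i++) {
                counts[cache.ords[i]]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] perDocOrds = { {1, 3}, {2}, {1, 2, 4} };
        CachedOrds cache = new CachedOrds(perDocOrds);
        int[] counts = count(cache, new int[] {0, 2}, 5); // docs 0 and 2 match
        System.out.println(Arrays.toString(counts)); // [0, 2, 1, 1, 1]
    }
}
```

The trade-off is the one noted in the description: each ordinal occupies a full int in RAM
rather than a compressed byte sequence, in exchange for a simpler inner loop.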

