lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (LUCENE-5735) Faceting for DateRangePrefixTree
Date Thu, 05 Feb 2015 05:33:34 GMT


David Smiley commented on LUCENE-5735:

The PrefixTreeFacetCounter utility is good; if it isn't committed to 5x as part of this
issue first, it will be committed with the heatmap one.

There's a bug in NumberRangePrefixTreeStrategy.calcFacets in which all cells above the parent
are counted as topLeaves, when really that can only be done if the leaf cell _contains_ the
facet range.  I have a fix in progress in which I detect this, and if the cell doesn't contain
the facet range, I walk the sub-cells and increment the counters on the parent facet cells.
 _There's a rare-ish bug I still need to debug._  But thus far there are a few changes pending
in my local check-out:
* Make TreeCellIterator public (lucene.internal, still) and allow the 'cell' to be a cell
other than the top world cell.  Probably add a reset() constructor-like method to re-use an
* NRCell has an optimization when getting subCells that seems to work fine in the normal code
paths thus far, but the in-progress faceting code has shown the optimization to be faulty,
so I just removed it, as I don't think it was worth trying to make it work.
* NRCell sometimes can't get subCells if it was initialized from a short length shape/bytes;
it should instead always initialize its array to maxLevels.  Again, this apparently never
happens in normal code paths, but in some toy test code I triggered it.
* Refactor the two main date range tests to share a random calendar utility (RandomCalHelper).
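
The containment rule behind the calcFacets fix described above can be sketched roughly as follows. This is a toy model, not the Lucene code: months are plain ints, a "coarse leaf cell" is an inclusive month range, and the class, field, and method names (FacetCountSketch, visitCoarseLeaf, bucketCounts) are hypothetical. Only the topLeaves name comes from the comment.

```java
/** Toy model of the containment check; an assumption-laden sketch, not Lucene's implementation. */
final class FacetCountSketch {
    final int facetLo;          // first month of the facet range (inclusive)
    final int facetHi;          // last month of the facet range (inclusive)
    final int[] bucketCounts;   // one counter per month in the facet range
    int topLeaves;              // leaf cells that cover the whole facet range

    FacetCountSketch(int facetLo, int facetHi) {
        this.facetLo = facetLo;
        this.facetHi = facetHi;
        this.bucketCounts = new int[facetHi - facetLo + 1];
    }

    /** Visit one indexed leaf cell that is coarser than the facet level. */
    void visitCoarseLeaf(int cellLo, int cellHi) {
        // Only a cell that fully contains the facet range may count as a top leaf.
        if (cellLo <= facetLo && cellHi >= facetHi) {
            topLeaves++;
            return;
        }
        // Otherwise walk the part of the cell that overlaps the facet range
        // and increment the counter of each bucket it touches.
        int from = Math.max(cellLo, facetLo);
        int to = Math.min(cellHi, facetHi);
        for (int month = from; month <= to; month++) {
            bucketCounts[month - facetLo]++;
        }
    }
}
```

With a facet range of months 6 to 14, a leaf cell covering months 0 to 11 only partly overlaps the range, so it increments the six buckets for months 6 to 11 and leaves topLeaves untouched; a cell covering months 0 to 23 contains the range and increments topLeaves once.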

> Faceting for DateRangePrefixTree
> --------------------------------
>                 Key: LUCENE-5735
>                 URL:
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 5.x
>         Attachments: LUCENE-5735.patch, LUCENE-5735.patch, LUCENE-5735__PrefixTreeFacetCounter.patch
> The newly added DateRangePrefixTree (DRPT) encodes terms in a fashion amenable to faceting
> by meaningful time buckets. The motivation for this feature is to efficiently populate a calendar
> bar chart or [heat-map|]. It's not hard if you have date
> instances, like many do, but it's challenging for date ranges.
> Internally this is going to iterate over the terms using seek/next with TermsEnum as
> appropriate.  It should be quite efficient; it won't need any special caches. I should be
> able to re-use SPT traversal code in AbstractVisitingPrefixTreeFilter.  If this goes especially
> well, the underlying implementation will be re-usable for geospatial heat-map faceting.
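
The seek/next iteration mentioned in the description can be illustrated with a small stand-in. This sketch does not use Lucene: a sorted java.util.NavigableMap of term to per-term count plays the role of the term dictionary, ceilingEntry stands in for a seek, and higherEntry stands in for next. The class and method names (SeekNextSketch, countPrefix) and the string term format are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.NavigableMap;

/** Toy stand-in for seek/next over sorted terms; not Lucene's TermsEnum. */
final class SeekNextSketch {
    /** Sums the per-term counts of every term that starts with the given bucket prefix. */
    static int countPrefix(NavigableMap<String, Integer> terms, String prefix) {
        int total = 0;
        // "seek": jump straight to the first term >= prefix, skipping everything before it.
        Map.Entry<String, Integer> entry = terms.ceilingEntry(prefix);
        // "next": advance in sorted order until the terms leave the bucket.
        while (entry != null && entry.getKey().startsWith(prefix)) {
            total += entry.getValue();
            entry = terms.higherEntry(entry.getKey());
        }
        return total;
    }
}
```

Because the terms are sorted, every term in a bucket is contiguous, so one seek plus a short run of next calls covers a bucket without any extra cache. Note that summing per-term counts would over-count a document that contributes several terms to the same bucket; the sketch ignores that.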
