lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-5339) Simplify the facet module APIs
Date Wed, 13 Nov 2013 16:19:21 GMT


Robert Muir commented on LUCENE-5339:

I took a look at the patch (actually just the tests!) and had these random thoughts:

Can we rename CategoryPath to FacetLabel or something more intuitive?
Can there be sugar like addFields(doc, FacetLabel) so a user doesn't have to build lists, etc.?
Maybe it could be varargs like FacetLabel..., so just some sugar:
public void addFields(Document doc, FacetLabel... labels) {
  addFields(doc, Arrays.asList(labels));
}
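
A self-contained sketch of that varargs sugar follows. FacetLabel here is a hypothetical stand-in for the proposed rename of CategoryPath, and a plain List<String> stands in for Document, so none of this is the patch's actual API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AddFieldsSugarSketch {
    /** Hypothetical stand-in for the proposed FacetLabel; not a real Lucene class here. */
    static final class FacetLabel {
        final String[] components;

        FacetLabel(String... components) {
            this.components = components;
        }

        @Override
        public String toString() {
            return String.join("/", components);
        }
    }

    /** List-based method: records each label on the stand-in document. */
    static void addFields(List<String> doc, List<FacetLabel> labels) {
        for (FacetLabel label : labels) {
            doc.add(label.toString());
        }
    }

    /** Varargs sugar: wraps the array and delegates to the list-based method. */
    static void addFields(List<String> doc, FacetLabel... labels) {
        addFields(doc, Arrays.asList(labels));
    }

    public static void main(String[] args) {
        List<String> doc = new ArrayList<>();
        // No list construction needed at the call site.
        addFields(doc, new FacetLabel("Author", "Bob"),
                       new FacetLabel("Publish Date", "2013", "11"));
        System.out.println(doc); // prints [Author/Bob, Publish Date/2013/11]
    }
}
```

Callers that already hold a List still hit the list-based overload directly, so the sugar costs nothing for them.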

new LongRange("less than 10", 0L, true, 10L, false) <-- can we make it so this is less

I like that RangeFacetCounts takes varargs though!

For the Taxo case, I think the "document build" process is still too complicated.
What if it worked like this:
IndexWriter iw = new FacetIndexWriter(dir1, dir2);
Document doc = new Document();
doc.add(new TextField(body));
doc.add(new FacetField("foo", "bar"));

This FacetIW would then call super.addDoc with an iterable that filters out the FacetFields,
and would separately do whatever it needs to do with those FacetFields on the taxo index.
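
The filtering step might look roughly like the sketch below. It uses hypothetical stand-in types (Field, TextField, FacetField, Split) rather than the real Lucene classes, and only shows how one document's fields could be separated into those passed to the main index and those handled against the taxo index:

```java
import java.util.ArrayList;
import java.util.List;

public class FacetFieldSplitSketch {
    /** Hypothetical stand-in for a Lucene field; not the real IndexableField. */
    interface Field {
        String name();
    }

    /** Stand-in for a regular indexed field. */
    static final class TextField implements Field {
        final String name;
        final String value;

        TextField(String name, String value) {
            this.name = name;
            this.value = value;
        }

        public String name() {
            return name;
        }
    }

    /** Stand-in for the proposed FacetField(dim, label). */
    static final class FacetField implements Field {
        final String dim;
        final String label;

        FacetField(String dim, String label) {
            this.dim = dim;
            this.label = label;
        }

        public String name() {
            return dim;
        }
    }

    /** Holds the two halves of a document after splitting. */
    static final class Split {
        final List<Field> indexFields = new ArrayList<>();
        final List<FacetField> facetFields = new ArrayList<>();
    }

    /** Separates FacetFields from all other fields, preserving order. */
    static Split split(Iterable<? extends Field> doc) {
        Split out = new Split();
        for (Field f : doc) {
            if (f instanceof FacetField) {
                out.facetFields.add((FacetField) f); // would go to the taxo index
            } else {
                out.indexFields.add(f); // would be passed to the main index
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Field> doc = new ArrayList<>();
        doc.add(new TextField("body", "some text"));
        doc.add(new FacetField("foo", "bar"));
        Split s = split(doc);
        System.out.println(s.indexFields.size() + " regular, "
                + s.facetFields.size() + " facet"); // prints 1 regular, 1 facet
    }
}
```

A real FacetIW would presumably also need to turn each FacetField into ordinals and drill-down terms before indexing; that part is left out here.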

> Simplify the facet module APIs
> ------------------------------
>                 Key: LUCENE-5339
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-5339.patch
> I'd like to explore simplifications to the facet module's APIs: I
> think the current APIs are complex, and the addition of a new feature
> (sparse faceting, LUCENE-5333) threatens to add even more classes
> (e.g., FacetRequestBuilder).  I think we can do better.
> So, I've been prototyping some drastic changes; this is very
> early/exploratory and I'm not sure where it'll wind up but I think the
> new approach shows promise.
> The big changes are:
>   * Instead of *FacetRequest/Params/Result, you directly instantiate
>     the classes that do facet counting (currently TaxonomyFacetCounts,
>     RangeFacetCounts or SortedSetDVFacetCounts), passing in the
>     SimpleFacetsCollector, and then you interact with those classes to
>     pull labels + values (topN under a path, sparse, specific labels).
>   * At index time, no more FacetIndexingParams/CategoryListParams;
>     instead, you make a new SimpleFacetFields and pass it the field it
>     should store facets + drill downs under.  If you want more than
>     one CLI you create more than one instance of SimpleFacetFields.
>   * I added a simple schema, where you state which dimensions are
>     hierarchical or multi-valued.  From this we decide how to index
>     the ordinals (no more OrdinalPolicy).
> Sparse faceting is just another method (getAllDims), on both taxonomy
> & ssdv facet classes.
> I haven't created a common base class / interface for all of the
> search-time facet classes, but I think this may be possible/clean, and
> perhaps useful for drill sideways.
> All the new classes are under oal.facet.simple.*.
> Lots of things that don't work yet: drill sideways, complements,
> associations, sampling, partitions, etc.  This is just a start ...

This message was sent by Atlassian JIRA
