lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-5339) Simplify the facet module APIs
Date Sun, 29 Dec 2013 07:49:50 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858276#comment-13858276 ]

Shai Erera edited comment on LUCENE-5339 at 12/29/13 7:48 AM:
--------------------------------------------------------------

It's just that we have a *.facet.taxonomy package, yet many taxonomy-related classes live
outside it. I prefer a more organized package hierarchy that groups related classes together
over having them in an arbitrary *.facet package. For instance, the *.facet package alone
contains 40 classes, while the "suggest" package contains a total of 28 classes, divided into
logical packages (*.analyzing, *.fst, *.tst, *.jaspell and *.suggest itself). What's the
benefit of dumping all the classes in one package when they don't share any common code?
If we have *.taxonomy, *.sortedset and *.range packages, you at least know where to look
if you want to, e.g., facet by taxonomy or sortedset. I don't know why you think packages are
intimidating; they are meant to organize the code and help users find related stuff. I did
a quick count to compare:

* Suggest module's packages contain 6 classes each under *.analyzing and *.fst, 2 classes
under *.jaspell and 3 classes under *.tst.
* Facet module could contain 6 classes under *.range, 3 classes under *.sortedset and 9
classes under *.taxonomy (besides the ones that are already there).

The two modules are similar IMO: just as you have several methods for "suggesting", you
have several methods for "faceting"...
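
The comment does not say how the per-package count was made. One way to reproduce such
a count is sketched below, assuming a source checkout on disk: it walks a directory and
counts *.java files per sub-directory. The class name PackageClassCounter is made up for
illustration, and counting files is only an approximation of counting classes (nested
classes are not counted, and package-info.java is skipped).

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene: counts source files per package directory.
final class PackageClassCounter {

  /** Maps each directory under root (dot-separated, relative to root) to its *.java file count. */
  static Map<String, Long> countPerPackage(Path root) throws IOException {
    // Normalize first so that relativize() behaves predictably for roots such as ".".
    Path base = root.toAbsolutePath().normalize();
    try (Stream<Path> files = Files.walk(base)) {
      return files
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".java"))
          // package-info.java documents a package; it does not declare a class.
          .filter(p -> !p.getFileName().toString().equals("package-info.java"))
          .collect(Collectors.groupingBy(
              p -> base.relativize(p.getParent()).toString().replace(File.separatorChar, '.'),
              TreeMap::new,
              Collectors.counting()));
    }
  }

  public static void main(String[] args) throws IOException {
    // Assumption: the first argument is the source root to scan; defaults to the current directory.
    Path root = Paths.get(args.length > 0 ? args[0] : ".");
    countPerPackage(root).forEach(
        (pkg, n) -> System.out.println(n + "\t" + (pkg.isEmpty() ? "(root)" : pkg)));
  }
}
```

Pointing it at the source root of the facet or suggest module would print one line per
package, which makes the comparison above easy to repeat after classes are moved.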


> Simplify the facet module APIs
> ------------------------------
>
>                 Key: LUCENE-5339
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5339
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-5339.patch, LUCENE-5339.patch, LUCENE-5339.patch
>
>
> I'd like to explore simplifications to the facet module's APIs: I
> think the current APIs are complex, and the addition of a new feature
> (sparse faceting, LUCENE-5333) threatens to add even more classes
> (e.g., FacetRequestBuilder).  I think we can do better.
> So, I've been prototyping some drastic changes; this is very
> early/exploratory and I'm not sure where it'll wind up but I think the
> new approach shows promise.
> The big changes are:
>   * Instead of *FacetRequest/Params/Result, you directly instantiate
>     the classes that do facet counting (currently TaxonomyFacetCounts,
>     RangeFacetCounts or SortedSetDVFacetCounts), passing in the
>     SimpleFacetsCollector, and then you interact with those classes to
>     pull labels + values (topN under a path, sparse, specific labels).
>   * At index time, no more FacetIndexingParams/CategoryListParams;
>     instead, you make a new SimpleFacetFields and pass it the field it
>     should store facets + drill downs under.  If you want more than
>     one CLI you create more than one instance of SimpleFacetFields.
>   * I added a simple schema, where you state which dimensions are
>     hierarchical or multi-valued.  From this we decide how to index
>     the ordinals (no more OrdinalPolicy).
> Sparse faceting is just another method (getAllDims), on both taxonomy
> & ssdv facet classes.
> I haven't created a common base class / interface for all of the
> search-time facet classes, but I think this may be possible/clean, and
> perhaps useful for drill sideways.
> All the new classes are under oal.facet.simple.*.
> Lots of things that don't work yet: drill sideways, complements,
> associations, sampling, partitions, etc.  This is just a start ...
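
To make the search-time idea in the description concrete, here is a toy sketch of the
"instantiate the counting class directly, pass in the collector, then pull labels + values"
pattern. It does not use Lucene at all: ToyCollector, ToyFacetCounts and getTopLabels are
hypothetical stand-ins invented for this example, and the real prototype classes
(TaxonomyFacetCounts, SimpleFacetsCollector, ...) may look quite different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only; none of these types are Lucene classes.
final class FacetCountingSketch {

  /** Hypothetical stand-in for a collector: it just remembers the matching doc IDs. */
  static final class ToyCollector {
    final List<Integer> hits = new ArrayList<>();

    void collect(int docId) {
      hits.add(docId);
    }
  }

  /** Hypothetical counting class: built from the collector, then queried for top labels. */
  static final class ToyFacetCounts {
    private final Map<String, Integer> counts = new HashMap<>();

    ToyFacetCounts(String[] labelPerDoc, ToyCollector collector) {
      // Count one label per matching document.
      for (int docId : collector.hits) {
        counts.merge(labelPerDoc[docId], 1, Integer::sum);
      }
    }

    /** Returns up to topN "label=count" entries, highest count first, ties broken by label. */
    List<String> getTopLabels(int topN) {
      List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
      entries.sort((a, b) -> {
        int byCount = Integer.compare(b.getValue(), a.getValue());
        return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
      });
      List<String> top = new ArrayList<>();
      for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(topN, entries.size()))) {
        top.add(e.getKey() + "=" + e.getValue());
      }
      return top;
    }
  }

  public static void main(String[] args) {
    // Assumed sample data: one author label per document, indexed by doc ID.
    String[] authorPerDoc = {"Bob", "Lisa", "Lisa", "Susan", "Lisa", "Bob"};
    ToyCollector collector = new ToyCollector();
    for (int docId : new int[] {0, 1, 2, 4}) {
      collector.collect(docId); // pretend these docs matched the query
    }
    ToyFacetCounts facets = new ToyFacetCounts(authorPerDoc, collector);
    System.out.println(facets.getTopLabels(2)); // prints [Lisa=3, Bob=1]
  }
}
```

The point of the shape is that there is no request/params/result layer in between: the
caller holds the counting object and asks it directly for what it needs.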



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

