lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-6348) Collection of improvements to StatsComponent & FacetComponent for loose coupling
Date Sat, 09 Aug 2014 01:26:12 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091547#comment-14091547 ]

Hoss Man commented on SOLR-6348:
--------------------------------


Since I don't normally mass-open a bunch of Jiras like this all at once, I wanted to share
some back story...

The origins of this idea came from discussions I had several months back with various
folks about the "AnalyticsComponent" (SOLR-5302 & SOLR-5963).  As noted in the comments
of those Jiras, the overall design of that component, and the expressiveness of its user
syntax/API, make it pretty much impossible to implement distributed support (ie: SolrCloud)
for all possible options the user could specify.

Around that time, I was talking with some customers who were excited by the prospect of better
"faceted statistics" in Solr (but really needed distributed support as well) about what types
of real world use cases they had.  Then I talked to some folks at ApacheCon in Denver
about some "pie in the sky" ideas for completely new implementations of stats/analytics in
Solr that would need to be built from scratch.  Finally, on the plane trip home (where most
of my brainstorming time comes from), I started drafting some notes on potential "small" improvements
that could be made incrementally to the StatsComponent and FacetComponent to give us more
powerful features w/o needing to start over from the ground up.

I discussed some of these ideas with coworkers, but never really talked about them publicly
much, because how useful most of my ideas could be depends heavily on getting
distributed pivot faceting to work -- and I wasn't sure how feasible that was.

Which is why I spent the past 3 months heads-down on SOLR-2894.

Now that SOLR-2894 is starting to look very viable, I'm more confident in the basic premise
of the various ideas in this issue (and its sub-tasks), and figured I should start opening
some Jiras based on the notes I already had typed up.


> Collection of improvements to StatsComponent & FacetComponent for loose coupling
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-6348
>                 URL: https://issues.apache.org/jira/browse/SOLR-6348
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>
> This is a parent wrapper issue for a collection of small-ish improvements that can be
> made to the StatsComponent and the FacetComponent to allow them to play nicer together and
> -- in combination with each other -- provide more powerful combinations of features that will
> still work well in a SolrCloud setup.
> The end goal, once all tasks are completed, is that it should be possible to specify
> some query params like...
> {noformat}
> stats.field={!tag=price_stats min=true max=true}product_price
> stats.field={!tag=avg_rating mean=true}user_rating
> facet.range={!tag=price_ranges stats=avg_rating facet.range.start=...}price
> facet.pivot={!range=price_ranges stats=price_stats}store,category
> {noformat}
> And in the results you would get:
> * the min & max {{product_price}} for all matching documents
> * the mean {{user_rating}} for all matching documents
> * range facet counts over the {{price}} field
> ** for each range bucket, in addition to the normal constraint count there would also be:
> *** the average {{user_rating}} for all documents in that range bucket.
> * pivot facets drilling down on all matching documents, first by {{store}} then by {{category}}
> ** for each value/count node in the pivot tree, there would also be:
> *** the min & max {{product_price}} for all documents matching this pivot constraint
> *** range facet counts over the {{price}} field for documents matching this pivot constraint
> **** for each range bucket, in addition to the normal constraint count there would also be:
> ***** the average {{user_rating}} for all documents in that range bucket.
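
To make the tag-based coupling in the proposed params a bit more concrete, here is a rough
Python sketch (not anything from Solr itself) that builds a simplified version of the request
and checks that every stats=/range= reference points at a tag declared somewhere in the same
request.  Assumptions: the stats/range local params are only the syntax proposed in this issue;
the range facet is left out because its start/end/gap values are elided above; q=*:* is a
placeholder query; and the helper names (split_local_params, unresolved_refs) are made up for
illustration.

```python
import re
from urllib.parse import urlencode

# Matches a leading Solr-style local-params prefix: {!key=val key=val}rest
LOCAL_PARAMS = re.compile(r"^\{!([^}]*)\}(.*)$")


def split_local_params(value):
    """Return (local_params_dict, remainder) for a single param value."""
    match = LOCAL_PARAMS.match(value)
    if not match:
        return {}, value
    tokens = match.group(1).split()
    pairs = dict(tok.split("=", 1) for tok in tokens if "=" in tok)
    return pairs, match.group(2)


def unresolved_refs(params):
    """Names used via stats=/range= that no param in the request declares via tag=."""
    declared, referenced = set(), set()
    for _name, value in params:
        local, _rest = split_local_params(value)
        if "tag" in local:
            declared.add(local["tag"])
        # "stats" and "range" as local params are the syntax proposed in this issue.
        for key in ("stats", "range"):
            if key in local:
                referenced.add(local[key])
    return referenced - declared


# Simplified request: two tagged stats fields, one pivot that hangs stats off each node.
# A list of tuples (not a dict) is needed because stats.field repeats.
PARAMS = [
    ("q", "*:*"),  # placeholder query, not from the issue
    ("stats", "true"),
    ("stats.field", "{!tag=price_stats min=true max=true}product_price"),
    ("stats.field", "{!tag=avg_rating mean=true}user_rating"),
    ("facet", "true"),
    ("facet.pivot", "{!stats=price_stats}store,category"),
]

QUERY_STRING = urlencode(PARAMS)

if __name__ == "__main__":
    print(QUERY_STRING)
    print("unresolved references:", sorted(unresolved_refs(PARAMS)))
```

The point of the check is just that the components only know each other by tag name, so a
mistyped or missing tag is the one thing a client can verify before sending the request.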



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

