lucene-dev mailing list archives

From "Cassandra Targett (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-13047) Add facet2D Streaming Expression
Date Thu, 13 Jun 2019 00:35:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862592#comment-16862592 ]

Cassandra Targett edited comment on SOLR-13047 at 6/13/19 12:34 AM:
--------------------------------------------------------------------

There were 2 pull requests created for this change - #659, which is closed, and #660 which
is still open. I presume since this is resolved that #660 can be closed?

There is also another PR #669 which is still open and has the title "Facet2D" - is that also
related to this change? 


was (Author: ctargett):
There were 2 pull requests created for this change - #659, which is closed, and #660 which
is still open. I presume since this is resolved that #660 can be closed?

> Add facet2D Streaming Expression
> --------------------------------
>
>                 Key: SOLR-13047
>                 URL: https://issues.apache.org/jira/browse/SOLR-13047
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Major
>             Fix For: 8.2
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> The current facet expression is a generic tool for creating multi-dimensional aggregations.
> The *facet2D* Streaming Expression has semantics specific to 2-dimensional facets, which are
> designed to be *pivoted* into a matrix and operated on by *Math Expressions*.
> facet2D will use the JSON Facet API under the covers.
> Proposed syntax:
> {code:java}
> facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", count(*)){code}
> The example above will return tuples containing the top 300 diseases and the top ten
> symptoms for each disease.
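
As a rough illustration of what "under the covers" could mean, the sketch below builds a nested terms-facet request body of the kind Solr's JSON Facet API accepts. The function name, the bucket names "x" and "y", and the mapping of dimensions="300, 10" onto the two limit values are assumptions made for this example; this is not the Facet2DStream implementation.

```python
import json


def build_facet2d_request(x, y, x_limit, y_limit, query="*:*"):
    """Build a nested terms-facet body (hypothetical helper, for illustration)."""
    return {
        "query": query,
        "limit": 0,  # only the facet buckets are wanted, not the matching documents
        "facet": {
            "x": {
                "type": "terms",
                "field": x,
                "limit": x_limit,  # assumed: first value of dimensions
                "facet": {
                    # assumed: second value of dimensions, applied per x bucket
                    "y": {"type": "terms", "field": y, "limit": y_limit},
                },
            },
        },
    }


request = build_facet2d_request("diseases", "symptoms", 300, 10)
print(json.dumps(request, indent=2))
```

Each outer bucket would then carry up to ten inner buckets, which matches the "top ten symptoms for each disease" reading of the example.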
> Using math expressions, the tuples can be *pivoted* into a matrix where the rows of the
> matrix are the diseases, the columns of the matrix are the symptoms, and the cells in the matrix
> contain the counts. This matrix can then be *clustered* to find clusters of *diseases* that
> are correlated by *symptoms*.
> {code:java}
> let(a=facet2D(medrecords, q=*:*, x=diseases, y=symptoms, dimensions="300, 10", count(*)),
>     b=pivot(a, diseases, symptoms, count(*)),
>     c=kmeans(b, 10)){code}
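
To make the pivot step concrete, here is a minimal Python sketch of turning (disease, symptom, count) tuples into a matrix. The tuple shape (dicts keyed by field name and "count(*)"), the sample values, the alphabetical ordering of labels, and the zero fill for missing pairs are all assumptions for illustration; Solr's own pivot function may behave differently.

```python
def pivot(tuples, x_field, y_field, value_field):
    """Pivot flat tuples into (row_labels, col_labels, matrix)."""
    row_labels = sorted({t[x_field] for t in tuples})
    col_labels = sorted({t[y_field] for t in tuples})
    row_index = {label: i for i, label in enumerate(row_labels)}
    col_index = {label: j for j, label in enumerate(col_labels)}

    # Pairs that never occur in the tuples stay at 0.0.
    matrix = [[0.0] * len(col_labels) for _ in row_labels]
    for t in tuples:
        matrix[row_index[t[x_field]]][col_index[t[y_field]]] = float(t[value_field])
    return row_labels, col_labels, matrix


# Made-up sample tuples, not real output from facet2D.
tuples = [
    {"diseases": "flu", "symptoms": "fever", "count(*)": 40},
    {"diseases": "flu", "symptoms": "cough", "count(*)": 25},
    {"diseases": "cold", "symptoms": "cough", "count(*)": 30},
]

rows, cols, matrix = pivot(tuples, "diseases", "symptoms", "count(*)")
for label, row in zip(rows, matrix):
    print(label, row)
```

Each row is then a vector of symptom counts for one disease, which is the form a clustering step such as kmeans would operate on.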
>  
> *Implementation Note:*
> The implementation plan for this ticket is to create a new stream called Facet2DStream.
> The FacetStream code is a good starting point for the new implementation and can be adapted
> for the Facet2D parameters. Tests similar to those for FacetStream can be added to StreamExpressionTest.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


