lucene-dev mailing list archives

From "Dennis Gove (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-8185) Add operations support to streaming metrics
Date Fri, 13 Nov 2015 01:42:11 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968357#comment-14968357
] 

Dennis Gove edited comment on SOLR-8185 at 11/13/15 1:41 AM:
-------------------------------------------------------------

Full patch. 


was (Author: dpgove):
Full patch. All tests pass.

> Add operations support to streaming metrics
> -------------------------------------------
>
>                 Key: SOLR-8185
>                 URL: https://issues.apache.org/jira/browse/SOLR-8185
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Assignee: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-8185.patch
>
>
> Adds support for operations on stream metrics.
> With this feature one can modify tuple values before they are applied to the computed metric.
There are many use cases for this; I'll describe one here.
> Imagine you have a RollupStream which computes the average over some field, but you
cannot be sure that all documents have a value for that field, i.e., the value may be null. When
the value is null you want to treat it as 0. With this feature you can accomplish that like
this:
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   count(*)
> )
> {code}
> The operations are applied to the tuple separately for each metric in the stream, which means you
can perform different operations on different metrics without one metric being affected by the
operations on another.
> Extending the previous example, imagine you also want the min of a field, but without
treating null values as 0:
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0)),
>   min(a_i),
>   count(*)
> )
> {code}
> Also, the tuple is not modified for streams that might wrap this one. That is, the only thing
that sees the applied operation is that particular metric. If you want to apply operations
for wrapping streams you can still achieve that with the SelectStream (SOLR-7669).
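The isolation described above can be illustrated with a small, self-contained Java sketch. It does not use the Solr or SolrJ API and is not taken from the patch: the class `MetricOperationSketch`, its methods, and the sample data are all hypothetical, and skipping nulls in `avg` and `min` is an assumption made for the illustration. The idea shown is only that each metric applies its operation to a private copy of the tuple, so other metrics and wrapping streams still see the original values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; these names are not part of Solr or the patch.
class MetricOperationSketch {

    // Builds an operation that substitutes withValue when the field is null or missing.
    static UnaryOperator<Map<String, Object>> replaceNull(String field, Object withValue) {
        return tuple -> {
            Map<String, Object> copy = new HashMap<>(tuple); // work on a copy, never the original
            if (copy.get(field) == null) {
                copy.put(field, withValue);
            }
            return copy;
        };
    }

    // Average of a field, computed on the per-metric copy produced by the operation.
    // Assumption: values that are still null after the operation are skipped.
    static double avg(List<Map<String, Object>> tuples, String field,
                      UnaryOperator<Map<String, Object>> operation) {
        double sum = 0;
        long count = 0;
        for (Map<String, Object> tuple : tuples) {
            Object value = operation.apply(tuple).get(field);
            if (value != null) {
                sum += ((Number) value).doubleValue();
                count++;
            }
        }
        return count == 0 ? 0 : sum / count;
    }

    // Minimum of a field with no operation attached; null values are skipped.
    static Double min(List<Map<String, Object>> tuples, String field) {
        Double smallest = null;
        for (Map<String, Object> tuple : tuples) {
            Object value = tuple.get(field);
            if (value != null) {
                double d = ((Number) value).doubleValue();
                if (smallest == null || d < smallest) {
                    smallest = d;
                }
            }
        }
        return smallest;
    }

    // Three made-up tuples in one group; the middle one has no value for a_i.
    static List<Map<String, Object>> sampleTuples() {
        List<Map<String, Object>> tuples = new ArrayList<>();
        for (Integer value : new Integer[] {4, null, 8}) {
            Map<String, Object> tuple = new HashMap<>();
            tuple.put("a_s", "hello");
            tuple.put("a_i", value);
            tuples.add(tuple);
        }
        return tuples;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> tuples = sampleTuples();
        System.out.println(avg(tuples, "a_i", replaceNull("a_i", 0))); // null counted as 0
        System.out.println(min(tuples, "a_i"));                        // null ignored
        System.out.println(tuples.get(1).get("a_i"));                  // original tuple untouched
    }
}
```

With the sample data, the average counts the missing value as 0 while the minimum ignores it, and the second tuple still holds null afterwards.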
> One feature I'm investigating, but which this patch DOES NOT add, is the ability to assign names
to the resulting metric values. For example, to allow for something like this:
> {code}
> rollup(
>   search(collection1, q=*:*, fl="a_s,a_i,a_f", sort="a_s asc"),
>   over="a_s",
>   avg(a_i, replace(null, withValue=0), as="avg_a_i_null_as_0"),
>   avg(a_i),
>   count(*, as="totalCount")
> )
> {code}
> Right now that isn't possible because both avg metrics would have the same identifier,
"avg_a_i", and as such both couldn't be returned. It's relatively easy to add, but I have to
investigate its impact on the SQL and FacetStream areas.
> Depends on SOLR-7669 (SelectStream)
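
The identifier clash mentioned above can be sketched in plain Java. This is an assumption-laden illustration, not Solr code: the class `MetricNamingSketch`, its methods, and the numeric values are hypothetical, the `function_field` identifier format is taken from the "avg_a_i" example in the description, and the alias parameter models the proposed (not implemented) `as=` argument.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; these names are not part of Solr or the patch.
class MetricNamingSketch {

    // Uses the alias when one is given, otherwise falls back to function_field.
    static String identifier(String function, String field, String alias) {
        return alias != null ? alias : function + "_" + field;
    }

    // Stores each metric value under its identifier, as a result tuple keyed by name would.
    static Map<String, Double> collect(String[] identifiers, double[] values) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (int i = 0; i < identifiers.length; i++) {
            result.put(identifiers[i], values[i]); // a repeated identifier overwrites the earlier value
        }
        return result;
    }

    public static void main(String[] args) {
        double[] values = {4.0, 6.0}; // made-up results of the two avg metrics

        // Without aliases both metrics map to "avg_a_i", so only one value survives.
        String[] clashing = {identifier("avg", "a_i", null), identifier("avg", "a_i", null)};
        System.out.println(collect(clashing, values));

        // With an alias on one metric, both values can be returned.
        String[] distinct = {identifier("avg", "a_i", "avg_a_i_null_as_0"), identifier("avg", "a_i", null)};
        System.out.println(collect(distinct, values));
    }
}
```

Keying results by identifier is what makes the duplicate a problem in this sketch; whether the SQL and FacetStream code paths rely on the same identifiers is the open question the description raises.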



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

