spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-17997) Aggregation function for counting distinct values for multiple intervals
Date Wed, 19 Oct 2016 03:33:58 GMT

     [ https://issues.apache.org/jira/browse/SPARK-17997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-17997:
------------------------------------

    Assignee: Apache Spark

> Aggregation function for counting distinct values for multiple intervals
> ------------------------------------------------------------------------
>
>                 Key: SPARK-17997
>                 URL: https://issues.apache.org/jira/browse/SPARK-17997
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Zhenhua Wang
>            Assignee: Apache Spark
>
> This is for computing the number of distinct values (ndv) of each bin in an equi-height
> histogram. A bin consists of two endpoints, which form an interval of values, and the ndv
> in that interval. To compute histogram statistics, after getting the endpoints, we need an
> aggregation function that counts the distinct values in each interval.
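
To illustrate the request above, here is a minimal sketch in plain Python (not Spark code) of counting distinct values per interval once the endpoints are known. The function name count_distinct_per_interval is hypothetical, and the boundary convention is an assumption the issue does not specify: the first interval is closed on both sides, and every later interval excludes its left endpoint. The sketch uses exact sets. A real aggregate would more likely keep one approximate distinct-count sketch per interval to bound memory.

```python
from bisect import bisect_left


def count_distinct_per_interval(values, endpoints):
    """Count distinct values falling in each interval defined by sorted endpoints.

    k + 1 endpoints define k intervals. Assumed convention (not from the issue):
    interval 0 is [e0, e1]; interval i > 0 is (e_i, e_{i+1}].
    Values that are None or outside [e0, e_k] are ignored.
    """
    if len(endpoints) < 2:
        raise ValueError("need at least two endpoints to form an interval")

    # One set of seen values per interval (exact counting, for illustration only).
    seen = [set() for _ in range(len(endpoints) - 1)]
    lowest, highest = endpoints[0], endpoints[-1]

    for v in values:
        if v is None or v < lowest or v > highest:
            continue
        # bisect_left gives the first index i with endpoints[i] >= v,
        # so v belongs to the interval ending at endpoints[i].
        i = bisect_left(endpoints, v)
        seen[max(i - 1, 0)].add(v)

    return [len(s) for s in seen]


if __name__ == "__main__":
    data = [0, 1, 1, 10, 11, 20, 20, 25, -3]
    print(count_distinct_per_interval(data, [0, 10, 20]))
```

With endpoints 0, 10, 20, the values 0, 1 and 10 land in the first interval and 11 and 20 in the second. The values 25 and -3 fall outside the range and are skipped.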



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

