beam-commits mailing list archives

From "Julian Hyde (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-2478) Distinct Aggregates
Date Sun, 25 Jun 2017 22:45:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062450#comment-16062450 ]

Julian Hyde commented on BEAM-2478:
-----------------------------------

Your rewrite for hierarchical calculation is slightly wrong.

{code}
select a, count(distinct b) from t group by a

becomes

select a, count(distinct_b) from (
  select a, b as distinct_b
  from t
  group by a, b)
group by a
{code}

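This correctly ignores rows where b is null: the inner query still emits a group for the null value of b, but count(distinct_b) in the outer query does not count nulls.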
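
A quick way to check the rewrite is to run both forms against a small table. The sketch below uses Python's built-in sqlite3 module rather than Beam or Calcite, and the table contents are made up for illustration, so treat it as a sanity check of the SQL semantics only.

```python
import sqlite3

# In-memory table with illustrative data (not from the issue):
# group 'x' has a duplicate and a null, group 'y' has only a null.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (a text, b integer)")
conn.executemany(
    "insert into t values (?, ?)",
    [("x", 1), ("x", 1), ("x", 2), ("x", None), ("y", None), ("z", 3)],
)

# The original query with a distinct aggregate.
original = "select a, count(distinct b) from t group by a order by a"

# The hierarchical rewrite: deduplicate (a, b) first, then count per a.
# count(distinct_b) skips the null group produced by the inner query.
rewritten = """
select a, count(distinct_b) from (
  select a, b as distinct_b
  from t
  group by a, b)
group by a
order by a
"""

original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
print(original_rows)
print(rewritten_rows)
```

Both queries should return the same rows. Group 'y' is the interesting case: it contains only a null b, so its distinct count is 0, which count(distinct_b) gives and count(*) in the outer query would not.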

Calcite's [AggregateExpandDistinctAggregatesRule|https://insight.io/github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/AggregateExpandDistinctAggregatesRule.java]
does this rewrite; it can also do a more complex rewrite using GROUPING SETS if there are
multiple distinct-counts in the same query. See also CALCITE-1588 for approximate distinct-count.
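
To illustrate the idea behind the multiple distinct-count case, here is a sketch that again uses Python's sqlite3 module. SQLite has no GROUPING SETS, so the inner query emulates GROUPING SETS ((a, b), (a, c)) with UNION ALL, and the column g is a hypothetical stand-in for a grouping id. This is an assumption-laden approximation of the technique, not the plan that Calcite's rule actually produces.

```python
import sqlite3

# In-memory table with illustrative data (not from the issue).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (a text, b integer, c integer)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("x", 1, 10),
        ("x", 1, 20),
        ("x", 2, 20),
        ("x", None, 20),
        ("y", None, None),
        ("y", 3, None),
    ],
)

# Two distinct counts in one query.
direct = """
select a, count(distinct b), count(distinct c)
from t
group by a
order by a
"""

# Emulation of GROUPING SETS ((a, b), (a, c)): each UNION ALL branch
# deduplicates one column and tags its rows with g (hypothetical grouping id).
# The outer query counts each column only over the rows of its own branch.
emulated = """
select a,
       count(case when g = 0 then b end),
       count(case when g = 1 then c end)
from (
  select a, b, null as c, 0 as g from t group by a, b
  union all
  select a, null as b, c, 1 as g from t group by a, c)
group by a
order by a
"""

direct_rows = conn.execute(direct).fetchall()
emulated_rows = conn.execute(emulated).fetchall()
print(direct_rows)
print(emulated_rows)
```

As in the single-column rewrite, nulls survive the inner deduplication as their own group and are then dropped by count in the outer query.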

> Distinct Aggregates
> -------------------
>
>                 Key: BEAM-2478
>                 URL: https://issues.apache.org/jira/browse/BEAM-2478
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Jingsong Lee
>            Assignee: Tarush Grover
>
> eg: COUNT(DISTINCT empno)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
