phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1772) Add CUBE/ROLLUP/GROUPING SET operators for advanced aggregations
Date Fri, 27 Mar 2015 00:06:55 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383006#comment-14383006
] 

James Taylor commented on PHOENIX-1772:
---------------------------------------

Welcome, [~jayapriya90]. Looking forward to your contributions. Just curious - what's the
most common use case for CUBE and ROLLUP?

> Add CUBE/ROLLUP/GROUPING SET operators for advanced aggregations
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-1772
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1772
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: jayapriya surendran
>            Assignee: jayapriya surendran
>              Labels: gsoc2015, java, sql
>
> I noticed from Phoenix language documentation ( http://phoenix.apache.org/language/index.html
) that Phoenix is missing CUBE/ROLLUP and GROUPING_SET operators which are already supported
by other similar projects like Apache Pig and Apache Hive. Here is brief overview of my proposal
(the syntax that is proposed below is same as PostgreSQL https://wiki.postgresql.org/wiki/Grouping_Sets)
> *Proposed syntax for CUBE:*
> SELECT name, place, SUM(count) FROM cars GROUP BY CUBE(name, place);
> For every row that we process we need to emit 2^n combinations of rows where n corresponds
to number of aggregate columns. For the above example query, for every row we need to emit
4 rows, one for each level of aggregations {(name, place), (name, *), (*, place), (*, *)}.
> *Proposed syntax for ROLLUP:*
> SELECT name, place, SUM(count) FROM cars GROUP BY ROLLUP(name, place);
> For every row that we process we need to emit n+1 combinations of rows where n corresponds
to number of aggregate columns. For the above example query, for every row we need to emit
3 rows, one for each hierarchical level of aggregations {(name, place), (name, *), (*, *)}.
> *Propose syntax for GROUPING_SETS:*
> SELECT name, place, SUM(count) FROM cars GROUP BY GROUPING SETS(name, ());
> For every row that we process we need to emit n combinations of rows where n corresponds
to size of grouping set. For the above example query, for every row we need to emit 2 rows,
one for each specified level of aggregations {(name, *), (*, *)}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message