hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ashutosh Chauhan (JIRA)" <>
Subject [jira] [Commented] (HIVE-13257) GroupBy with column alias does not support AVG
Date Fri, 11 Mar 2016 05:54:04 GMT


Ashutosh Chauhan commented on HIVE-13257:

I have done some digging into this and following is what I have found:


select col1, sum(col2) from t group by 1;

There are multiple ways on how above can be interpreted. When I read the standard, AFAICT
 constant literal is not allowed in group by expression, so this should throw an error. However,
testing this on different DBs I found different behavior.

Oracle & SQL server throws error for this.​ However, MySQL & Postgres assumes 1
in gby clause refers to position 1 in select list, so they translate this effectively into:
select col1, sum(col2) from t group by col1;

Another interpretation for this is assume its group by on constant, meaning there is one group
for full table.

Hive of course has a config variable using which you can get either behavior meaning different
result set with different value of config variable.

Other variant for this is:

select  sum(col2) from t group by 1;

MySQL & Postgres throws error for this consistent with their above interpretation. SQL
Server also throws error for this. However, Oracle magically interprets this as group by on

Yet another variant:

select 1 from t group by 1;

For this one MySQL, Postgres & Oracle can execute the query. SQL Server throws error.

So, it seems only SQL Server is compliant with standard and other DBs do different things
depending on context.
I am wondering whether Hive should change its default behavior being consistent with standard
and start throwing exception whenever there it encounters a constant literal in group by expression.
Of course, their will be a config to get back old behavior. Thoughts?

> GroupBy with column alias does not support AVG
> ----------------------------------------------
>                 Key: HIVE-13257
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.1.0
>            Reporter: Prasanth Jayachandran
> For the following query, with hive.groupby.orderby.position.alias set to true
> {code:title=Query}
> SELECT Avg(`t0`.`x_measure__0`) AS `avg_calculation_270497503505567749_ok` 
> FROM   (SELECT `store_sales`.`ss_ticket_number` AS `ss_ticket_number`, 
>                Sum(`store_sales`.`ss_net_paid`) AS `x_measure__0` 
>         FROM   `store_sales` `store_sales` 
>                JOIN `item` `item` 
>                  ON ( `store_sales`.`ss_item_sk` = `item`.`i_item_sk` ) 
>         GROUP  BY `store_sales`.`ss_ticket_number`) `t0` 
> GROUP  BY 1 
> HAVING ( Count(1) > 0 );
> {code}
> it throws the following exception
> {code:title=Exception}
> FAILED: SemanticException [Error 10128]: Line 2:7 Not yet supported place for UDAF 'Avg’
> {code}

This message was sent by Atlassian JIRA

View raw message