hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Furcy Pin <furcy....@flaminem.com>
Subject Re: Hive order by not working properly
Date Wed, 07 Jan 2015 10:37:54 GMT
Hi Vengatesh,

AFAIK, in Hive, the columns used in the ORDER BY clause needs to be present
in the SELECT clause.
I guess this is because it has to be passed to the final reducer.

This is why in your case the first query works and the second does not.
The following query, however should work too:

SELECT ACCOUNTID, *concat('test_',ACCOUNTID)* from table *order by*
 ACCOUNTID;

As you already found, the correct way to write this query is to alias your
column and order with it.
Note that this works too:

SELECT *concat('test_',ACCOUNTID) as **ACCOUNTID* from table *order by*
ACCOUNTID;
I am still using 0.12 though, maybe later versions changed some things.

Hope this helps,

Furcy Pin


PS:

On the other hand, Hive has the opposite behavior in GROUP BY clauses, eg
this will not work:

SELECT *concat('test_',ACCOUNTID) as **tempColumn, COUNT(1)* from table
GROUP BY *tempColumn* *order by* tempColumn;

while this will work:

SELECT *concat('test_',ACCOUNTID) as **tempColumn, COUNT(1)* from table
GROUP BY *concat('test_',ACCOUNTID)* *order by* tempColumn;

and this will either:

SELECT *concat('test_',ACCOUNTID) as **tempColumn, COUNT(1)* from table
GROUP BY *concat('test_',ACCOUNTID)* *order by* *concat('test_',ACCOUNTID) *
;

>From my understanding, the reason is that the parsing of the query
"compiles" the GROUP BY clause before the SELECT (as it will depend on it),
so it doesn't know yet that "*concat('test_',ACCOUNTID)*" will be named
tempColumn.
On the other hand, ORDER BY is "compiled" after the SELECT so it does not
recognise "*concat('test_',ACCOUNTID)*" which has been named tempColumn.





2015-01-07 11:11 GMT+01:00 vengatesh.babu <vengatesh.babu@zohocorp.com>:

> Hi,
>
> Hive *order by* not working properly with columns involved in *functions.*
>
> *Column without function:*
>
> SELECT *ACCOUNTID* from table *order by* ACCOUNTID;
>
> (Above query works fine in both HIVE & MYSQL)
>
> *Column with function:*
>
> SELECT *concat('test_',ACCOUNTID)* from table *order by* ACCOUNTID;
>
> In mysql , Above query working fine.
>
> In hive, below exception thrown
>
> *FAILED: SemanticException [Error 10004]: Line 1:59 Invalid table alias or
> column reference 'ACCOUNTID': (possible column names are: _c0)*
>
> Temporary Solution :
>
> SELECT *concat('test_',ACCOUNTID) as tempColumn* from table *order by*
>  tempcolumn;
>
> Is there any solution available without assigning column as tempcolumn ?
>
> Thanks
>
> Vengatesh Babu K M
>
>
>
>

Mime
View raw message