hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mich Talebzadeh" <m...@peridale.co.uk>
Subject RE: ORDER BY clause in Hive
Date Tue, 31 Mar 2015 09:20:27 GMT
Thanks Lefty for the information provided.

 

My version of hive is 014.0

 

hive --version

Hive 0.14.0

 

Which should support the configuration parameter at the session level à set hive.groupby.orderby.position.alias=true

 

 

set hive.groupby.orderby.position.alias=true;

SELECT

          rs.Customer_ID

        , rs.Number_of_orders

        , rs.Total_customer_amount

        , rs.Average_order

        , rs.Standard_deviation

FROM

(

        SELECT cust_id AS Customer_ID,

        COUNT(amount_sold) AS Number_of_orders,

        SUM(amount_sold) AS Total_customer_amount,

        AVG(amount_sold) AS Average_order,

        stddev_samp(amount_sold) AS Standard_deviation

        FROM sales

        GROUP BY cust_id

        HAVING SUM(amount_sold) > 94000

        AND AVG(amount_sold) < stddev_samp(amount_sold)

) rs

ORDER BY

          -- Total_customer_amount DESC

          3 DESC

;

 

+-----------------+----------------------+---------------------------+---------------------+------------------------+--+

| rs.customer_id  | rs.number_of_orders  | rs.total_customer_amount  |  rs.average_order 
 | rs.standard_deviation  |

+-----------------+----------------------+---------------------------+---------------------+------------------------+--+

| 11407.0         | 248                  | 103412.65999999995        | 416.9865322580643 
 | 623.4797510518939      |

| 10747.0         | 256                  | 99578.08999999997         | 388.9769140624999 
 | 601.9383117167412      |

| 42167.0         | 266                  | 98585.96000000002         | 370.6239097744362 
 | 592.0790992800527      |

| 4974.0          | 235                  | 98006.16000000002         | 417.0474893617022 
 | 625.670115050053       |

| 12783.0         | 240                  | 97573.54999999996         | 406.5564583333332 
 | 591.6785002882084      |

| 6395.0          | 268                  | 97010.47999999998         | 361.97940298507456
 | 577.991447849281       |

| 2994.0          | 227                  | 94862.61000000006         | 417.89696035242315
 | 624.5379298449825      |

| 429.0           | 231                  | 94819.41000000006         | 410.4736363636366 
 | 615.0384039014772      |

| 1743.0          | 238                  | 94786.12999999993         | 398.2610504201678 
 | 582.2684502048478      |

+-----------------+----------------------+---------------------------+---------------------+------------------------+--+

9 rows selected (212.535 seconds)

 

Indeed this is the correct result ordering by column posirion

 

At the Hive server level I added the folowing properties to $HIVE_HOME/conf/ hive-site.xml

 

  <property>

    <name>hive.groupby.orderby.position.alias</name>

    <value>true</value>

    <description>Eenables using Column Position Alias in GROUP BY and ORDER BY clauses
of queries.</description>

  </property>

 

And ran the above query without session level setting and it worked

 

Regards,

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache

 

NOTE: The information in this email is proprietary and confidential. This message is for the
designated recipient only, if you are not the intended recipient, you should destroy it immediately.
Any information in this message shall not be understood as given or endorsed by Peridale Ltd,
its subsidiaries or their employees, unless expressly so stated. It is the responsibility
of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd,
its subsidiaries nor their employees accept any responsibility.

 

From: Lefty Leverenz [mailto:leftyleverenz@gmail.com] 
Sent: 31 March 2015 07:18
To: user@hive.apache.org
Subject: Re: ORDER BY clause in Hive

 

I've opened HIVE-10160 <https://issues.apache.org/jira/browse/HIVE-10160> :  Give a
warning when grouping or ordering by a constant column.

 

Thanks Gopal.




-- Lefty

 

On Tue, Mar 31, 2015 at 2:14 AM, Lefty Leverenz <leftyleverenz@gmail.com> wrote:

---------- Forwarded message ----------

From: Lefty Leverenz <leftyleverenz@gmail.com>
Date: Tue, Mar 31, 2015 at 1:47 AM
Subject: Re: ORDER BY clause in Hive

To: Mich Talebzadeh <mich@peridale.co.uk>



Hive as I see it does not support ORDER BY Column position. It only supports ORDER BY Column
name.

 

That's just in Hive release 0.10.0 and earlier.  In release 0.11.0+ you can set the configuration
parameter <https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-ConfiguringHive>
 hive.groupby.orderby.position.alias to true, and then you can use column positions in ORDER
BY.

 

Here's the new documentation:

*	 <https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.groupby.orderby.position.alias>
hive.groupby.orderby.position.alias
*	Order By <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy#LanguageManualSortBy-SyntaxofOrderBy>
 
*	Group By <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy#LanguageManualGroupBy-GroupBySyntax>





-- Lefty


Mime
View raw message