hive-issues mailing list archives

From "Gopal V (JIRA)" <>
Subject [jira] [Commented] (HIVE-13205) Job with last_value() function keep running forever.
Date Fri, 04 Mar 2016 07:07:40 GMT


Gopal V commented on HIVE-13205:

I thought the earlier bottleneck was fixed in HIVE-7344 (hive-1.0?), but the
shuffle still moves every row at least once.

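So the shuffle will end up being the bottleneck if id has a low nDV (number of distinct values): rows are partitioned by id, so a handful of distinct ids means a handful of reducers receive all the rows.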
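
One way to check whether id really has a low nDV is to count the distinct ids and look at the largest partitions. This is only a sketch: `prices` is a hypothetical stand-in for the table in the quoted query, and the column names are taken from that query.

```sql
-- `prices` is a hypothetical table name; substitute the real table.

-- Number of distinct ids (the nDV of the partition key).
SELECT COUNT(*) AS distinct_ids
FROM (SELECT id FROM prices GROUP BY id) ids;

-- The 20 largest partitions: rows that share one id all land on the same reducer.
SELECT id, COUNT(*) AS row_count
FROM prices
GROUP BY id
ORDER BY row_count DESC
LIMIT 20;
```

If a few ids account for most of the 2 Bn rows, the window function has to process each of those partitions on a single reducer, whatever the total reducer count is.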

Best to print out the summary on a lower run and see if it is running 1 reducer forever or
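
A smaller run with the summary enabled might look like the sketch below. It assumes the job runs on Tez and that the summary meant here is the one printed by `hive.tez.exec.print.summary`; `prices` is a hypothetical table name, and the 1-in-100 sampling of ids is an illustration, not something from the issue.

```sql
-- Assumption: Tez execution engine; prints a per-vertex summary when the query finishes.
SET hive.tez.exec.print.summary=true;

-- Keep roughly 1 in 100 ids, so every sampled partition is still complete.
-- With a very low nDV this filter may match no ids at all; adjust the modulus if so.
SELECT price, `time`, id,
       last_value(price, true) OVER (PARTITION BY id ORDER BY `time`) AS LatestPrice
FROM prices
WHERE pmod(hash(id), 100) = 0;
```

Filtering on a hash of id rather than sampling rows keeps whole partitions together, so the per-reducer work in the sample resembles the full run.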

> Job with last_value() function keep running forever.
> ----------------------------------------------------
>                 Key: HIVE-13205
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rahul
> Hi,
> I am running following query to fill all null with the last known value in the column:
> Select price,time, id,last_value(price,true) over (partition by id order by time) as LatestPrice from table;
> For few records, the query is running successfully. But for large number of records (2 Bn), the query keep running forever.
