hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-13205) Job with last_value() function keep running forever.
Date Fri, 04 Mar 2016 07:07:40 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179481#comment-15179481 ]

Gopal V commented on HIVE-13205:
--------------------------------

I thought the earlier bottleneck was fixed in HIVE-7344 (hive-1.0?), but the
shuffle still moves all rows at least once.

So the shuffle will end up being the bottleneck if id has a low nDV (number of
distinct values), since every row for a given id goes to the same reducer.
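
A quick way to check whether id has few distinct values, or a few very heavy ones, is to count rows per key. This is only a sketch: `my_table` is a hypothetical stand-in for the table in the report, not a name taken from the issue.

```sql
-- Number of distinct partition keys (the nDV of id).
SELECT COUNT(DISTINCT id) AS ndv_id
FROM my_table;

-- The ten heaviest keys. All rows for one id are processed by a single
-- reducer, so a very large row_count here points at a skewed partition.
SELECT id, COUNT(*) AS row_count
FROM my_table
GROUP BY id
ORDER BY row_count DESC
LIMIT 10;
```

If ndv_id is small compared with the number of reducers, or one id holds a large share of the 2 Bn rows, most of the window work lands on very few tasks.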

Best to print out the summary on a smaller run and see whether it is running 1 reducer
forever or not.
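
A sketch of such a reduced run, under stated assumptions: the job runs on Tez (`hive.tez.exec.print.summary` only applies there), `my_table` is a hypothetical table name, and the 1-in-100 sampling filter is purely illustrative.

```sql
-- Print the per-vertex execution summary after the query finishes (Tez only).
SET hive.tez.exec.print.summary=true;

-- Same window query as in the report, restricted to roughly 1% of the ids.
-- Filtering on a hash of id keeps every row of each sampled id, so the
-- last_value() results for those ids are unchanged.
SELECT price, time, id,
       last_value(price, true) OVER (PARTITION BY id ORDER BY time) AS LatestPrice
FROM my_table
WHERE pmod(hash(id), 100) = 0;
```

In the summary, a reducer vertex with a single long-running task while the others finish quickly would suggest the skew described above.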

> Job with last_value() function keep running forever.
> ----------------------------------------------------
>
>                 Key: HIVE-13205
>                 URL: https://issues.apache.org/jira/browse/HIVE-13205
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rahul
>
> Hi,
> I am running the following query to fill all nulls with the last known value in the column:
> Select price,time, id,last_value(price,true) over (partition by id order by time) as
> LatestPrice from table;
> For a few records, the query runs successfully. But for a large number of records (2
> Bn), the query keeps running forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)