hive-issues mailing list archives

From "Alina Abramova (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12963) LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer
Date Sat, 30 Jan 2016 09:44:39 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124823#comment-15124823 ]

Alina Abramova commented on HIVE-12963:
---------------------------------------

But I see that if the line that calls genReduceSinkPlan in the method genLimitMapRedPlan is
commented out, the final result set is still sorted. That means we could skip creating the
extra job and do the sorting in the same MR job, couldn't we?
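
As a rough illustration of the plan shape under discussion, here is a toy model in Python. It is not Hive's code: the function names, the modulo partitioning, and the assumption that the first job already applies the limit inside each reducer are all made up for this sketch. The first job sorts and limits within each reducer, and the extra single-reducer job merges those partial results and applies the limit again.

```python
import heapq
from typing import Iterable, List


def stage_one(rows: Iterable[int], num_reducers: int, limit: int) -> List[List[int]]:
    """Job 1 (toy model): partition rows, sort each partition, keep `limit` rows."""
    partitions: List[List[int]] = [[] for _ in range(num_reducers)]
    for row in rows:
        # Hypothetical partitioner: a plain modulo stands in for the real one.
        partitions[row % num_reducers].append(row)
    return [sorted(p)[:limit] for p in partitions]


def stage_two(partials: List[List[int]], limit: int) -> List[int]:
    """Job 2 (toy model): one reducer merges the sorted partials and limits again."""
    return list(heapq.merge(*partials))[:limit]


ages = [42, 17, 65, 23, 31, 8, 54, 19, 77, 29, 36, 12]
partials = stage_one(ages, num_reducers=3, limit=4)
result = stage_two(partials, limit=4)
print(result)
```

In this model, when stage one runs with a single reducer its output is already the final answer; with several reducers, each partial list is sorted only within itself.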

> LIMIT statement with SORT BY creates additional MR job with hardcoded only one reducer
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-12963
>                 URL: https://issues.apache.org/jira/browse/HIVE-12963
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.0.0, 1.2.1, 0.13
>            Reporter: Alina Abramova
>            Assignee: Alina Abramova
>         Attachments: HIVE-12963.1.patch
>
>
> I execute the query:
> hive> select age from test1 sort by age.age  limit 10;                      
> Total jobs = 2
> Launching Job 1 out of 2
> Number of reduce tasks not specified. Estimated from input data size: 1
> Launching Job 2 out of 2
> Number of reduce tasks determined at compile time: 1
> When I have a large number of rows, the last stage of the job takes a long time.
> I think we could let the user choose the number of reducers for the last job, or skip the extra MR job.
> I observed the same behavior with queries such as:
> hive> create table new_test as select age from test1 group by age.age  limit 10;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
