hive-dev mailing list archives

From "Navis (JIRA)" <>
Subject [jira] [Commented] (HIVE-3972) Support using multiple reducer for fetching order by results
Date Wed, 13 Feb 2013 11:50:14 GMT


Navis commented on HIVE-3972:

[~ashutoshc] The query above has two RSs, which means it consists of two MR jobs (without HIVE-2340).
The second MR job can still be a target of the top-K optimization. But your comment made me realize
that this issue and HIVE-3562 are complementary and should be merged into a single issue. Thanks.

Also, the limit configuration on the fetch task is still active, which means an early exit in the
fetch task is still possible without HIVE-3562. It is a merge sort over already-sorted streams, so it
should not require much memory.
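
To illustrate the point about memory and early exit, here is a minimal sketch in Python rather than Hive's actual fetch task code. It assumes each reducer has already written its output sorted by the order-by column. The function name `fetch_ordered` and the sample rows are hypothetical and are not taken from the patch.

```python
import heapq
from itertools import islice


def fetch_ordered(sorted_streams, limit=None, key=None):
    """Lazily merge already-sorted streams into one sorted stream.

    heapq.merge keeps only one pending row per input stream, so memory
    grows with the number of streams, not with the number of rows.
    islice stops pulling rows once `limit` rows have been produced,
    which is the early exit; limit=None returns every row.
    """
    merged = heapq.merge(*sorted_streams, key=key)
    return list(islice(merged, limit))


# Hypothetical outputs of three reducers, each sorted by the `sum` column.
reducer_outputs = [
    [("val_a", 1), ("val_d", 7), ("val_g", 20)],
    [("val_b", 3), ("val_e", 8)],
    [("val_c", 5), ("val_f", 12)],
]

top4 = fetch_ordered(reducer_outputs, limit=4, key=lambda row: row[1])
print(top4)
# [('val_a', 1), ('val_b', 3), ('val_c', 5), ('val_d', 7)]
```

With a limit of 4, the merge reads only a few rows from each stream and never touches the tail of the larger files.
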
> Support using multiple reducer for fetching order by results
> ------------------------------------------------------------
>                 Key: HIVE-3972
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3972.D8349.1.patch, HIVE-3972.D8349.2.patch, HIVE-3972.D8349.3.patch
> Queries that fetch results and end with an "order by" clause make the final MR job run with
a single reducer, which can be too much work for one reducer. For example,
> {code}
> select value, sum(key) as sum from src group by value order by sum;
> {code}
> If the number of reducers is reasonable, the multiple result files could be merged into a single
sorted stream at the fetcher level.

