hive-dev mailing list archives

From "Phabricator (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-3972) Support using multiple reducer for fetching order by results
Date Fri, 01 Feb 2013 08:19:16 GMT

     [ https://issues.apache.org/jira/browse/HIVE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3972:
------------------------------

    Attachment: HIVE-3972.D8349.1.patch

navis requested code review of "HIVE-3972 [jira] Support using multiple reducer for fetching
order by results".

Reviewers: JIRA

DPAL-1976 Support using multiple reducer for fetching order by results

Queries that fetch results and end with an "order by" clause make the final MR job run with a
single reducer, which can be too much work for one reducer. For example,

select value, sum(key) as sum from src group by value order by sum;

If the number of reducers is reasonable, the multiple result files could be merged into a single
sorted stream at the fetcher level.
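
The merge described above can be illustrated with a standard k-way merge over a priority queue. This is only a minimal sketch and not the code in the attached patch: the class name MergeSortedSketch, the Cursor helper, and the mergeSorted method are hypothetical, and in-memory lists stand in for the per-reducer result files. It assumes each reducer has already written its rows sorted by the "order by" key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not the MergeSortingFetcher from the patch.
public class MergeSortedSketch {

    // One cursor per reducer output: the current head row plus the remaining rows.
    private static final class Cursor<T> {
        T head;
        final Iterator<T> rest;

        Cursor(Iterator<T> it) {
            this.rest = it;
            this.head = it.next(); // caller guarantees the run is non-empty
        }
    }

    // Merges runs that are each already sorted by cmp into one sorted list.
    static <T> List<T> mergeSorted(List<? extends Iterable<T>> runs,
                                   Comparator<? super T> cmp) {
        // The heap is ordered by each cursor's current head row.
        PriorityQueue<Cursor<T>> heap =
            new PriorityQueue<>((a, b) -> cmp.compare(a.head, b.head));
        for (Iterable<T> run : runs) {
            Iterator<T> it = run.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor<>(it)); // empty runs are skipped
            }
        }
        List<T> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor<T> c = heap.poll();   // cursor with the smallest head row
            out.add(c.head);
            if (c.rest.hasNext()) {
                c.head = c.rest.next();  // advance and re-insert the cursor
                heap.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Three stand-ins for reducer outputs, each sorted; one is empty.
        List<List<Integer>> runs = Arrays.asList(
            Arrays.asList(1, 4, 9),
            Arrays.asList(2, 3, 10),
            new ArrayList<Integer>());
        System.out.println(mergeSorted(runs, Comparator.<Integer>naturalOrder()));
        // prints [1, 2, 3, 4, 9, 10]
    }
}
```

With k result files and n rows in total, the merge costs O(n log k) comparisons and keeps only one row per file in memory at a time, which is why it stays cheap as long as the number of reducers is modest.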

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D8349

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MergeSortingFetcher.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/RowFetcher.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java
  ql/src/test/queries/clientpositive/orderby_query_bucketing.q
  ql/src/test/results/clientpositive/orderby_query_bucketing.q.out

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/20325/

To: JIRA, navis

                
> Support using multiple reducer for fetching order by results
> ------------------------------------------------------------
>
>                 Key: HIVE-3972
>                 URL: https://issues.apache.org/jira/browse/HIVE-3972
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3972.D8349.1.patch
>
>
> Queries that fetch results and end with an "order by" clause make the final MR job run with a
> single reducer, which can be too much work for one reducer. For example,
> {code}
> select value, sum(key) as sum from src group by value order by sum;
> {code}
> If the number of reducers is reasonable, the multiple result files could be merged into a single
> sorted stream at the fetcher level.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
