hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1239) Make the reducer limit-aware
Date Thu, 11 Mar 2010 01:09:27 GMT
Make the reducer limit-aware

                 Key: HIVE-1239
                 URL: https://issues.apache.org/jira/browse/HIVE-1239
             Project: Hadoop Hive
          Issue Type: Improvement
    Affects Versions: 0.6.0
            Reporter: Ning Zhang
             Fix For: 0.6.0

Currently, if a join is followed by a limit operator, the reducer still needs to do a lot of work
even after the limit is reached.

A plan could look like:

ExecReducer -> ExtractOperator -> Limit Operator -> ... 

In Hadoop 0.20, we can override the reduce API to stop taking rows from the underlying file,
but in pre-0.20 releases it cannot be overridden. What we can do instead is put the limit number in the
ExecReducer metadata during the Hive optimization phase.
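
A minimal sketch of the idea, in plain Java with no Hadoop or Hive dependency. The class LimitAwareReducerSketch and its methods are hypothetical stand-ins, not Hive's actual ExecReducer; the sketch only assumes that the optimizer hands the reducer a limit value at plan time. The framework may still call reduce() for every key, but once the limit has been reached each call returns without pulling or processing any further rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for ExecReducer; NOT Hive's real class.
class LimitAwareReducerSketch {
    // Limit taken from the plan metadata (assumption); negative means "no limit".
    private final int limit;
    private int forwarded = 0; // rows passed to the downstream operators so far
    private int consumed = 0;  // rows pulled from the input iterators so far

    LimitAwareReducerSketch(int limit) {
        this.limit = limit;
    }

    boolean limitReached() {
        return limit >= 0 && forwarded >= limit;
    }

    int rowsConsumed() {
        return consumed;
    }

    // Same general shape as a pre-0.20 reduce(key, values, ...) call.
    // The 'out' list stands in for Extract -> Limit -> ... downstream.
    void reduce(String key, Iterator<String> values, List<String> out) {
        // Check the limit before touching the iterator, so no row is
        // read once enough rows have been forwarded.
        while (!limitReached() && values.hasNext()) {
            String value = values.next();
            consumed++;
            out.add(key + "\t" + value);
            forwarded++;
        }
    }

    public static void main(String[] args) {
        LimitAwareReducerSketch reducer = new LimitAwareReducerSketch(3);
        List<String> out = new ArrayList<>();
        reducer.reduce("a", Arrays.asList("1", "2").iterator(), out);
        reducer.reduce("b", Arrays.asList("3", "4", "5").iterator(), out);
        reducer.reduce("c", Arrays.asList("6").iterator(), out);
        System.out.println("forwarded=" + out.size()
                + " consumed=" + reducer.rowsConsumed());
    }
}
```

With a limit of 3, the demo forwards three rows and leaves the remaining values of key "b" and all of key "c" unread. Keeping the check inside the reducer, rather than relying only on the downstream Limit Operator, is what avoids the wasted work described above.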

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
