hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1239) Make the reducer limit-aware
Date Thu, 11 Mar 2010 01:09:27 GMT
Make the reducer limit-aware
----------------------------

                 Key: HIVE-1239
                 URL: https://issues.apache.org/jira/browse/HIVE-1239
             Project: Hadoop Hive
          Issue Type: Improvement
    Affects Versions: 0.6.0
            Reporter: Ning Zhang
             Fix For: 0.6.0


Currently, if a join is followed by a limit operator, the reducer still needs to do a lot of work
even after the limit is reached.

A plan could look like:

ExecReducer -> ExtractOperator -> LimitOperator -> ...

In Hadoop 0.20, we can override the reduce API to stop taking rows from the underlying file,
but in pre-0.20 versions it cannot be overridden. What we can do instead is put the limit number
in the ExecReducer metadata during the Hive optimization phase, so that the reducer itself knows
when to stop.
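
A minimal sketch of the idea follows, in plain Java with no Hadoop or Hive dependencies. The class
LimitAwareReducerSketch, its fields, and its methods are hypothetical names used only for
illustration; this is not the actual ExecReducer code. It assumes the optimizer has already passed
the limit to the reducer, and that the framework keeps calling reduce() once per key, as in the
pre-0.20 API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a limit-aware reducer; not Hive's ExecReducer. */
public class LimitAwareReducerSketch {
    // Assumption: set from plan metadata by the optimizer; a negative value means "no limit".
    private final int limit;
    private int forwarded = 0;  // rows passed on to the downstream operators
    private int consumed = 0;   // rows actually pulled from the input iterator
    private final List<String> output = new ArrayList<>();

    public LimitAwareReducerSketch(int limit) {
        this.limit = limit;
    }

    private boolean limitReached() {
        return limit >= 0 && forwarded >= limit;
    }

    /** Mimics a per-key reduce callback; returns early once the limit is reached. */
    public void reduce(String key, Iterator<String> values) {
        while (!limitReached() && values.hasNext()) {
            String value = values.next();
            consumed++;
            output.add(key + "=" + value);  // stands in for forwarding a row downstream
            forwarded++;
        }
    }

    public List<String> getOutput() {
        return output;
    }

    public int getConsumed() {
        return consumed;
    }

    public static void main(String[] args) {
        LimitAwareReducerSketch reducer = new LimitAwareReducerSketch(3);
        reducer.reduce("a", Arrays.asList("1", "2").iterator());
        reducer.reduce("b", Arrays.asList("3", "4", "5").iterator());
        reducer.reduce("c", Arrays.asList("6").iterator());
        // prints: [a=1, a=2, b=3] consumed=3
        System.out.println(reducer.getOutput() + " consumed=" + reducer.getConsumed());
    }
}
```

In this sketch the framework still invokes reduce() for every remaining key, but each call returns
immediately without reading or processing any more rows, which is the saving the issue is after.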

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

