hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1276) optimize bucketing
Date Wed, 31 Mar 2010 17:15:27 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851969#action_12851969 ]

Namit Jain commented on HIVE-1276:
----------------------------------

Some nitpicks.

1. In the test, can you also run the select after the explain and verify the results?
2. You have java.util.ArrayList in many places in the code. You have already imported ArrayList,
    so the full path is not needed.
3. I think you should also compare the number of reducers in the parent and child reduce sinks,
    and only merge them if the counts are the same. I am not sure whether this is needed. Let us
    discuss this more.
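
To make point 3 concrete, here is a rough sketch of the check being proposed. It is not taken from the patch: ReduceSinkInfo, ReduceSinkMergeSketch, and canMerge are hypothetical names used only for illustration, and the real reduce sink operator in Hive carries more state than this. The sketch also uses the imported ArrayList name rather than java.util.ArrayList, as suggested in point 2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a reduce sink operator (not Hive's actual class).
// It records only the two properties the check below looks at.
final class ReduceSinkInfo {
    final List<String> keyColumns;
    final int numReducers;

    ReduceSinkInfo(List<String> keyColumns, int numReducers) {
        // ArrayList is imported above, so the java.util. prefix is not needed.
        this.keyColumns = new ArrayList<String>(keyColumns);
        this.numReducers = numReducers;
    }
}

// Hypothetical helper illustrating the proposed merge condition.
final class ReduceSinkMergeSketch {
    // Merge only when both sinks use the same key columns (assumed here to
    // be the existing condition) AND the same number of reducers (the
    // extra condition proposed in point 3).
    static boolean canMerge(ReduceSinkInfo parent, ReduceSinkInfo child) {
        return parent.keyColumns.equals(child.keyColumns)
            && parent.numReducers == child.numReducers;
    }
}
```

With this check, a parent and child that both cluster on the same column but use 4 and 8 reducers would be left as separate jobs. Whether that extra restriction is actually required is the open question.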

> optimize bucketing 
> -------------------
>
>                 Key: HIVE-1276
>                 URL: https://issues.apache.org/jira/browse/HIVE-1276
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>         Attachments: hive-1276.1.patch, hive-1276.2.patch, hive-1276.3.patch, hive-1276.4.patch
>
>
> If the query results are already clustered by the bucketing column, there is no need
> for another map-reduce job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

