hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-339) [Hive] problem in count distinct in 1mapreduce job with map side aggregation
Date Wed, 11 Mar 2009 19:26:50 GMT

    [ https://issues.apache.org/jira/browse/HIVE-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681002#action_12681002 ]

Zheng Shao commented on HIVE-339:
---------------------------------

Agree.

In short, for DISTINCT aggregations in a 1-map/reduce plan, we de-duplicate as much as
we can in the map phase, but the aggregation itself is done entirely in the reduce phase.
That is why the reduce phase also needs to call iterate().
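
To illustrate the point above, here is a minimal Python sketch of the idea. It is not Hive's implementation: the names map_side_dedup, CountDistinct, reduce_side and the max_entries limit are all hypothetical, and the bounded set only stands in for a map-side hash table that may be flushed under memory pressure. It shows why map-side de-duplication is best effort, so the reducer still has to run iterate() over the raw values rather than merge partial counts.

```python
from itertools import groupby


def map_side_dedup(rows, max_entries=2):
    """Best-effort de-duplication of (group_key, distinct_value) pairs.

    The bounded set is an assumption standing in for a map-side hash
    table; once it is flushed, a pair seen earlier can be emitted again.
    """
    seen = set()
    for key, value in rows:
        if (key, value) in seen:
            continue  # duplicate caught on the map side
        if len(seen) >= max_entries:
            seen.clear()  # simulate a flush of the hash table
        seen.add((key, value))
        yield key, value


class CountDistinct:
    """Hypothetical aggregator; expects values in sorted order."""

    def __init__(self):
        self.count = 0
        self.last = object()  # sentinel that equals no real value

    def iterate(self, value):
        # Count a value only when it differs from the previous one.
        if value != self.last:
            self.count += 1
            self.last = value

    def terminate(self):
        return self.count


def reduce_side(shuffled):
    """Sort by (key, value) as a stand-in for the shuffle, then aggregate."""
    result = {}
    for key, group in groupby(sorted(shuffled), key=lambda kv: kv[0]):
        agg = CountDistinct()
        for _, value in group:
            agg.iterate(value)  # raw values, so iterate() is required here
        result[key] = agg.terminate()
    return result


mapper1 = list(map_side_dedup([("a", 1), ("a", 1), ("a", 2), ("b", 1), ("a", 1)]))
mapper2 = list(map_side_dedup([("a", 2), ("a", 3), ("b", 1)]))
counts = reduce_side(mapper1 + mapper2)
```

In this sketch the pair ("a", 1) leaves the first mapper twice because the set was flushed in between, and ("a", 2) arrives from both mappers. Summing per-mapper distinct counts would therefore overcount; only the reducer, seeing the sorted raw values, can produce the correct result.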

> [Hive] problem in count distinct in 1mapreduce job with map side aggregation
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-339
>                 URL: https://issues.apache.org/jira/browse/HIVE-339
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.339.1.patch, hive.339.2.patch, hive.339.3.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

