hadoop-hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-339) [Hive] problem in count distinct in 1mapreduce job with map side aggregation
Date Wed, 11 Mar 2009 18:08:50 GMT

    [ https://issues.apache.org/jira/browse/HIVE-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680958#action_12680958 ]

Joydeep Sen Sarma commented on HIVE-339:

i don't follow this. regardless of whether map-side aggregation is turned off dynamically,
the output type of the map side will still be a partial aggregate. i don't see how we can
iterate over the results of a partial aggregate.

i think problems may be getting masked because the output type of the count partial aggregate
is the same as that of the count full aggregate. if we were doing 'avg(distinct column)', then
iterating over the partial aggregate output _should_ cause problems. this would make a good
test case as well.
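
To make the concern concrete, here is a minimal Python sketch of the idea. It is a toy model and not Hive's actual UDAF code: the class names, the method names (iterate, terminate_partial, merge, terminate) and the helpers map_side and reduce_with are assumptions made for illustration, and distinct handling is left out to keep it short. It only shows that feeding partial aggregates to iterate raises no error for count, because the partial is a plain number, while for avg the partial is a (sum, count) pair and iterate fails on it.

```python
class Count:
    """Toy count aggregate: partial and full results are both an int."""
    def __init__(self):
        self.n = 0
    def iterate(self, value):          # consume one raw input value
        self.n += 1
    def terminate_partial(self):       # map-side output
        return self.n
    def merge(self, partial):          # consume one partial aggregate
        self.n += partial
    def terminate(self):               # final result
        return self.n


class Avg:
    """Toy avg aggregate: the partial is a (sum, count) pair, the full result a float."""
    def __init__(self):
        self.total = 0.0
        self.n = 0
    def iterate(self, value):
        self.total += value
        self.n += 1
    def terminate_partial(self):
        return (self.total, self.n)
    def merge(self, partial):
        total, n = partial
        self.total += total
        self.n += n
    def terminate(self):
        return self.total / self.n if self.n else None


def map_side(agg_cls, rows):
    """Hypothetical helper: aggregate one input split into a partial."""
    agg = agg_cls()
    for row in rows:
        agg.iterate(row)
    return agg.terminate_partial()


def reduce_with(agg_cls, partials, method):
    """Hypothetical helper: combine partials using 'merge' or 'iterate'."""
    agg = agg_cls()
    for p in partials:
        getattr(agg, method)(p)
    return agg.terminate()


splits = [[1, 2, 3], [4, 5]]
count_partials = [map_side(Count, s) for s in splits]   # [3, 2]
avg_partials = [map_side(Avg, s) for s in splits]       # [(6.0, 3), (9.0, 2)]

count_merged = reduce_with(Count, count_partials, "merge")      # 5
count_iterated = reduce_with(Count, count_partials, "iterate")  # 2, no error raised
avg_merged = reduce_with(Avg, avg_partials, "merge")            # 3.0

try:
    reduce_with(Avg, avg_partials, "iterate")
    avg_iterate_error = None
except TypeError as exc:
    # iterate expects a number but receives a (sum, count) pair
    avg_iterate_error = exc
```

In this toy model the count case runs to completion without any type error, which is the sense in which a problem could stay hidden, while the avg case fails as soon as iterate is handed a partial.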

> [Hive] problem in count distinct in 1mapreduce job with map side aggregation
> ----------------------------------------------------------------------------
>                 Key: HIVE-339
>                 URL: https://issues.apache.org/jira/browse/HIVE-339
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.339.1.patch, hive.339.2.patch, hive.339.3.patch

