pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-4843) Turn off combiner in reducer vertex for Tez if bags are in combine plan
Date Tue, 22 Mar 2016 10:41:25 GMT

    [ https://issues.apache.org/jira/browse/PIG-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206140#comment-15206140 ]

Rohini Palaniswamy commented on PIG-4843:
-----------------------------------------

Committed to trunk. Thanks for the review, Daniel.

> Turn off combiner in reducer vertex for Tez if bags are in combine plan
> -----------------------------------------------------------------------
>
>                 Key: PIG-4843
>                 URL: https://issues.apache.org/jira/browse/PIG-4843
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>         Attachments: PIG-4843-1.patch
>
>
> {code}
> B = group A by key;
> C = foreach B {
>     key_value = A.key_value;
>     distinct_key_value = DISTINCT key_value;
>     generate group, MIN(A.key_value) as min_value, MAX(A.key_value) as max_value,
>         COUNT(distinct_key_value) as distinct_values;
> }
> {code}
> In the above example, the combine plan holds the DISTINCT bag, which causes an OOM when
> the MergeManager runs the combiner in the reducer. We did not hit this issue with MapReduce
> because the combiner does not yet run in the reducer under the new API (MAPREDUCE-5221).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)