hadoop-pig-dev mailing list archives

From "Utkarsh Srivastava (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-51) Combiner gives wrong result in the presence of flattening
Date Sat, 15 Dec 2007 02:33:43 GMT

    [ https://issues.apache.org/jira/browse/PIG-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552022 ]

Utkarsh Srivastava commented on PIG-51:
---------------------------------------

It seems there is an empty tuple in your data set, which is why id
cannot be resolved. That's what is throwing the exception. In fact,
the combiner doesn't trigger in your query.

Is there a reason why you do one filter outside the foreach and one
inside it? Couldn't both filters be done before grouping? It would be
more efficient that way, and the combiner would probably kick in.

Utkarsh





> Combiner gives wrong result in the presence of flattening
> ---------------------------------------------------------
>
>                 Key: PIG-51
>                 URL: https://issues.apache.org/jira/browse/PIG-51
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Utkarsh Srivastava
>            Priority: Critical
>         Attachments: combiner-flatten.patch
>
>
> If you do something like
> a = load ... as (f1,f2,f3);
> b = group a by (f1,f2);
> c = foreach b generate flatten(group), SUM(a.f3);
> The reduce side refers to fields by number, expecting that the data has not been flattened yet.
> But if the combiner kicks in, it has already flattened the group, so the column references
> are wrong.
>  
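
To make the failure mode in the description concrete, here is a rough Python simulation of it. This is only a sketch under assumptions, not Pig's actual code: the function names (group_by_f1_f2, foreach_flatten_sum, combiner_flattening, read_reduce_columns) and the sample rows are hypothetical, and the combiner is assumed to emit its partial result with the group already flattened.

```python
from collections import defaultdict

# Hypothetical sample data standing in for "a = load ... as (f1,f2,f3);"
rows = [("a", "x", 1), ("a", "x", 2), ("b", "y", 5)]


def group_by_f1_f2(data):
    """b = group a by (f1,f2): yields (group, bag) records."""
    groups = defaultdict(list)
    for row in data:
        groups[(row[0], row[1])].append(row)
    return list(groups.items())


def foreach_flatten_sum(record):
    """c = foreach b generate flatten(group), SUM(a.f3).

    Assumes the unflattened layout: position 0 is the group tuple,
    position 1 is the bag of original rows.
    """
    group, bag = record[0], record[1]
    return (*group, sum(row[2] for row in bag))


def combiner_flattening(record):
    """Assumed combiner behaviour: emit partial output with the group flattened."""
    group, bag = record
    return (*group, sum(row[2] for row in bag))


def read_reduce_columns(record):
    """What the reduce side reads by position: (position 0, position 1)."""
    return record[0], record[1]


grouped = group_by_f1_f2(rows)

# Without the combiner, positions 0 and 1 are the group and the bag.
without_combiner = [foreach_flatten_sum(rec) for rec in grouped]

# With the combiner, the record is already (f1, f2, partial_sum), so the
# same positional references now point at f1 and f2 instead.
combined = [combiner_flattening(rec) for rec in grouped]
misread_group, misread_bag = read_reduce_columns(combined[0])
```

In this sketch the reduce side, reading by position, picks up f1 where it expects the group tuple and f2 where it expects the bag, which is the kind of shifted column reference the description refers to.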

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

