hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-490) Combiner not used when group elements referred to in tuple notation instead of flatten.
Date Mon, 03 May 2010 17:49:56 GMT

    [ https://issues.apache.org/jira/browse/PIG-490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863427#action_12863427 ]

Olga Natkovich commented on PIG-490:
------------------------------------

Perhaps a good plan of action would be to try this out with the new optimizer framework. This
would give us a chance to experiment with using it at the MR layer without moving the existing
optimizations there. (Moving them is something we can do in 0.9.0 if the new framework proves to
be flexible and stable enough by that time.)

> Combiner not used when group elements referred to in tuple notation instead of flatten.
> ---------------------------------------------------------------------------------------
>
>                 Key: PIG-490
>                 URL: https://issues.apache.org/jira/browse/PIG-490
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: Alan Gates
>             Fix For: 0.8.0
>
>
> Given a query like:
> {code}
> A = load 'myfile';
> B = group A by ($0, $1);
> C = foreach B generate group.$0, group.$1, COUNT(A);
> {code}
> The combiner will not be invoked.  But if the last line is changed to:
> {code}
> C = foreach B generate flatten(group), COUNT(A);
> {code}
> it will be.  The discrepancy arises because the CombinerOptimizer checks that all of the
> projections are simple.  If they are not, it does not use the combiner.  group.$0 is not a
> simple projection, so the check fails.  However, this is a common enough case that the
> CombinerOptimizer should detect it and still use the combiner.
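
The check described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Expr`, `combinable_strict`, and `combinable_relaxed` are hypothetical names invented for illustration, and none of this is Pig's actual CombinerOptimizer code. It models each generated expression by its kind and contrasts the behaviour the issue reports with the behaviour it requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for one expression in a foreach ... generate."""
    kind: str         # "flatten_group", "group_field", or "algebraic"
    detail: str = ""  # e.g. "$0" for group.$0, "COUNT" for COUNT(A)


def combinable_strict(exprs):
    """Reported behaviour: group.$N does not count as a simple projection."""
    return all(e.kind in ("flatten_group", "algebraic") for e in exprs)


def combinable_relaxed(exprs):
    """Requested behaviour: projections out of the group key are accepted too."""
    allowed = ("flatten_group", "group_field", "algebraic")
    return all(e.kind in allowed for e in exprs)


# C = foreach B generate group.$0, group.$1, COUNT(A);
tuple_notation = [
    Expr("group_field", "$0"),
    Expr("group_field", "$1"),
    Expr("algebraic", "COUNT"),
]

# C = foreach B generate flatten(group), COUNT(A);
flattened = [Expr("flatten_group"), Expr("algebraic", "COUNT")]

print(combinable_strict(tuple_notation))   # False: combiner skipped
print(combinable_strict(flattened))        # True: combiner used
print(combinable_relaxed(tuple_notation))  # True: combiner used after the change
```

In this model the two queries differ only in how the group key is referenced, which is why treating `group.$N` like `flatten(group)` is enough to make both eligible.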

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

