hadoop-pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter
Date Wed, 14 Oct 2009 18:37:31 GMT

    [ https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765682#action_12765682 ]

Daniel Dai commented on PIG-1022:

Actually, we cannot even push the filter before f2. Because we do not keep track of the source
of data inside a tuple, gid should be treated as a generated field of f2. However, the projection
map of f2 wrongly reports gid as a directly mapped field of group (which is
a tuple (name, gid)), and this triggers all the subsequent problems. The fix for this problem is
to modify the projection map generation logic for mapped fields.
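
To illustrate the reasoning above, here is a minimal Python sketch of the push-up check. It is not Pig's optimizer code: the `ProjectionMap` type, the `can_push_filter_above` function, and the dictionary representation are all hypothetical, and treating `name` the same way as `gid` after the fix is an assumption.

```python
from typing import Dict, Optional, Set

# Hypothetical model (not Pig's real data structure): for each output column
# of a foreach, the input column it is copied from unchanged, or None if the
# foreach generates the column.
ProjectionMap = Dict[str, Optional[str]]


def can_push_filter_above(filter_cols: Set[str], pmap: ProjectionMap) -> bool:
    """Return True only if every column the filter reads is a direct copy
    of an input column of the foreach."""
    return all(pmap.get(col) is not None for col in filter_cols)


# f2 = foreach g generate group.name as name, group.gid as gid;
# Map as described in the comment: gid is reported as mapped to 'group'.
f2_reported: ProjectionMap = {"name": "group", "gid": "group"}

# Assumed map after the fix: a field pulled out of a tuple counts as generated.
f2_fixed: ProjectionMap = {"name": None, "gid": None}

print(can_push_filter_above({"gid"}, f2_reported))  # True: filter is moved
print(can_push_filter_above({"gid"}, f2_fixed))     # False: filter stays put
```

With the reported map the check allows the filter to move above f2, which matches the faulty plan; with the assumed fixed map the filter stays below f2.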

Santhosh, do you have any comment?

> optimizer pushes filter before the foreach that generates column used by filter
> -------------------------------------------------------------------------------
>                 Key: PIG-1022
>                 URL: https://issues.apache.org/jira/browse/PIG-1022
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Thejas M Nair
>            Assignee: Daniel Dai
> grunt> l = load 'students.txt' using PigStorage() as (name:chararray, gender:chararray, age:chararray, score:chararray);
> grunt> f = foreach l generate name, gender, age,score, '200'  as gid:chararray;
> grunt> g = group f by (name, gid);
> grunt> f2 = foreach g generate group.name as name: chararray, group.gid as gid: chararray;
> grunt> filt = filter f2 by gid == '200';
> grunt> explain filt;
> In the generated plan, filt is pushed up after the load and before the first foreach,
> even though the filter is on gid, which is generated in the first foreach.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
