pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-1916) Nested cross
Date Thu, 14 Jul 2011 06:13:00 GMT

     [ https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1916:
----------------------------

    Attachment: PIG-1916_5.patch

Change the patch slightly to fix test-patch warnings.

> Nested cross
> ------------
>
>                 Key: PIG-1916
>                 URL: https://issues.apache.org/jira/browse/PIG-1916
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Zhijie Shen
>              Labels: gsoc2011
>             Fix For: 0.10
>
>         Attachments: PIG-1916_1.patch, PIG-1916_2.patch, PIG-1916_3.patch, PIG-1916_4.patch, PIG-1916_5.patch
>
>
> It is useful to support cross inside a nested foreach statement. A typical use case for
> nested foreach arises after cogrouping two relations: we want to flatten the records that
> share the same key and then do some processing. Cross achieves this naturally. E.g.:
> {code}
> C = cogroup user by uid, session by uid;
> D = foreach C {
>     crossed = cross user, session; -- To flatten two input bags
>     filtered = filter crossed by user::region == session::region;
>     result = foreach filtered generate processSession(user::age, user::gender, session::ip); -- Nested foreach Jira: PIG-1631
>     generate result;
> }
> {code}
> Without cross, the user has to write a UDF that processes the bags user and session. That
> is much harder than writing a UDF that processes flattened tuples. This is especially true
> when we have nested foreach statements (PIG-1631).
> This is a candidate project for Google Summer of Code 2011. More information about the
> program can be found at http://wiki.apache.org/pig/GSoc2011
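
To make the intended semantics of the example in the description concrete, here is a minimal sketch in plain Python rather than Pig. It models cogroup by uid, then a per-key cross of the two bags, a filter on matching region, and a projection. The field names (uid, age, gender, region, ip) come from the example; the sample rows are made up, and the final tuple projection is only a hypothetical stand-in for the processSession UDF, whose behaviour is not described in the issue.

```python
from collections import defaultdict
from itertools import product


def cogroup(users, sessions):
    """Group both relations by uid, roughly like `cogroup user by uid, session by uid`."""
    groups = defaultdict(lambda: ([], []))
    for u in users:
        groups[u["uid"]][0].append(u)
    for s in sessions:
        groups[s["uid"]][1].append(s)
    return dict(groups)


def nested_cross(groups):
    """For each key, cross the two bags, keep region matches, and project fields."""
    result = {}
    for uid, (user_bag, session_bag) in groups.items():
        # crossed = cross user, session;
        crossed = product(user_bag, session_bag)
        # filtered = filter crossed by user::region == session::region;
        filtered = [(u, s) for u, s in crossed if u["region"] == s["region"]]
        # Stand-in for processSession(...): just return the three input fields.
        result[uid] = [(u["age"], u["gender"], s["ip"]) for u, s in filtered]
    return result


# Made-up sample data, for illustration only.
users = [
    {"uid": 1, "age": 30, "gender": "f", "region": "us"},
    {"uid": 2, "age": 25, "gender": "m", "region": "eu"},
]
sessions = [
    {"uid": 1, "region": "us", "ip": "10.0.0.1"},
    {"uid": 1, "region": "eu", "ip": "10.0.0.2"},
    {"uid": 2, "region": "eu", "ip": "10.0.0.3"},
]

out = nested_cross(cogroup(users, sessions))
print(out)
```

The point of the sketch is that, once the bags are crossed, the per-record logic only ever sees one flattened (user, session) pair at a time, which is what makes the processing step simpler than a UDF that receives both bags.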

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
