pig-dev mailing list archives

From "Bejoy KS (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2713) Pig query planner throwing parse error on Joins
Date Mon, 21 May 2012 05:28:40 GMT

    [ https://issues.apache.org/jira/browse/PIG-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279967#comment-13279967 ]

Bejoy KS commented on PIG-2713:
-------------------------------

The following script throws the error quoted in the issue description below:

{code}
data_1 = LOAD '/userdata/bejoy/samples/pigissue/input1'
    as (
    clmn_1:int,
    clmn_2:int,
    clmn_3:chararray,
    clmn_4:chararray,
    unique_id:chararray,
    clmn_6:chararray,
    clmn_7:chararray,
    clmn_8:chararray,
    clmn_9:chararray,
    clmn_10:chararray,
    clmn_11:chararray,
    clmn_12:int,
    num_sessions:int,
    clmn_14:int,
    clmn_15:int
    );

data_2 = LOAD '/userdata/bejoy/samples/pigissue/input2'
    as (
    unique_id
    );

good_use_data = join use_data by unique_id, good_users by unique_id USING 'merge';

top_grouping = group good_use_data all;
top_users = foreach top_grouping generate TOP($TOP_COUNT, 2, good_use_data);

user_lines = foreach top_users generate flatten($0);

top_data = foreach user_lines generate use_data.unique_id, num_sessions;

store top_data into '/userdata/bejoy/samples/pigissue/output/top_users';
{code}


If the same script is modified so that the join column has a different name in each relation (data_2_unique_id instead of unique_id in the second input), it runs without error.
Modified script:


{code}
.
.
.
data_2 = LOAD '/userdata/bejoy/samples/pigissue/input2'
    as (
    data_2_unique_id
    );

good_use_data = join use_data by unique_id, good_users by data_2_unique_id USING 'merge';

top_grouping = group good_use_data all;
top_users = foreach top_grouping generate TOP($TOP_COUNT, 2, good_use_data);

user_lines = foreach top_users generate flatten($0);

top_data = foreach user_lines generate unique_id, num_sessions;

store top_data into '/userdata/bejoy/samples/pigissue/output/top_users';
{code}
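
For reference, Pig Latin also provides the :: disambiguation operator for referring to same-named columns after a join, as an alternative to renaming them. The following is only a minimal sketch of that syntax, not a verified fix for this issue: the input paths and the cut-down two-column schemas are hypothetical, the aliases are taken from the join statement above, and a default join is used instead of a merge join. Whether this form avoids ERROR 1103 on the affected versions has not been checked here.

{code}
-- Hypothetical inputs with cut-down schemas, for illustration only.
use_data = LOAD 'left_input'
    as (unique_id:chararray, num_sessions:int);
good_users = LOAD 'right_input'
    as (unique_id:chararray);

-- Default join; both inputs carry a column named unique_id.
good_use_data = join use_data by unique_id, good_users by unique_id;

-- After the join, same-named fields are addressed as alias::field.
-- num_sessions exists on one side only, so it needs no prefix.
joined_ids = foreach good_use_data generate use_data::unique_id, num_sessions;

describe joined_ids;
{code}

Note that use_data::unique_id (field of the joined relation) is a different construct from use_data.unique_id as written in the first script.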
                
> Pig query planner throwing parse error on Joins 
> ------------------------------------------------
>
>                 Key: PIG-2713
>                 URL: https://issues.apache.org/jira/browse/PIG-2713
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8.1, 0.9.2
>         Environment: CentOS 6
>            Reporter: Bejoy KS
>
> The Pig parser throws an exception when two columns in the joined relations have the same name and are used in a projection after the join.
> Error message:
> ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, filter and Load as its predecessor. Found :
> Error would be thrown for common join as well.

