hadoop-pig-dev mailing list archives

From "Santhosh Srinivasan (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (PIG-349) discrepancy in bags representation
Date Fri, 08 Aug 2008 12:25:44 GMT

    [ https://issues.apache.org/jira/browse/PIG-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620922#action_12620922 ]

sms edited comment on PIG-349 at 8/8/08 5:25 AM:
-----------------------------------------------------------------

After thinking for a while, I realized that I had implemented this in the parser and the patch
is probably not required. The unit test case for the bag schema as it exists today is at the
end of this section.

The bag schema can be expressed in either of the forms shown below. The one idiosyncrasy is
that the tuple inside the bag is named, even though the name is not required: the tuple can
never be accessed directly.

{code}
c = load 'a' as (name, details: bag{mytuple: tuple(age: int, gpa)});

c = load 'a' as (name, details: bag{mytuple (age: int, gpa)});
{code}

Going back to the bag schema that failed due to the patch, it should work as mentioned in
comment 2:

{code}
A = load 'foo' as (B: bag{T: tuple(I: int)});
{code}

{code}
    @Test
    public void testQuery64() {
        buildPlan("a = load 'a' as (name: chararray, details: tuple(age, gpa), mymap: map[]);");
        buildPlan("c = load 'a' as (name, details: bag{mytuple: tuple(age: int, gpa)});");
        buildPlan("b = group a by details;");
        String query = "d = foreach b generate group.age;";
        buildPlan(query);
        buildPlan("e = foreach a generate name, details;");
        buildPlan("f = LOAD 'myfile' AS (garage: bag{tuple1: tuple(num_tools: int)}, "
                + "links: bag{tuple2: tuple(websites: chararray)}, "
                + "page: bag{something_stupid: tuple(yeah_double: double)}, "
                + "coordinates: bag{another_tuple: tuple(ok_float: float, "
                + "bite_the_array: bytearray, bag_of_unknown: bag{})});");
    }

    @Test
    public void testQueryFail64() {
        String query = "foreach (load 'myfile' as (col1, col2 : bag{age: int})) generate col1 ;";
        try {
            buildPlan(query);
        } catch (AssertionFailedError e) {
            assertTrue(e.getMessage().contains("Exception"));
        }
    }
{code}
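
The rule these two tests exercise (a bag is either empty or holds a single tuple, never a bare scalar such as an int) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `BagSchemaSketch`, `Kind`, `Field`, and `bagsHoldTuples` are hypothetical names invented for this sketch and are not Pig's parser or schema classes.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Illustrative model only. These classes are hypothetical and are NOT
 * Pig's actual Schema or parser classes.
 */
public class BagSchemaSketch {

    enum Kind { INT, CHARARRAY, BYTEARRAY, TUPLE, BAG }

    /** A schema field; tuples and bags carry child fields. */
    static final class Field {
        final String alias;          // may be null, e.g. an unnamed inner tuple
        final Kind kind;
        final List<Field> children;

        Field(String alias, Kind kind, Field... children) {
            this.alias = alias;
            this.kind = kind;
            this.children = Arrays.asList(children);
        }
    }

    /**
     * Returns true when every bag in the schema is either empty
     * (contents unknown, as in bag{}) or holds exactly one tuple.
     * Nested fields are checked recursively.
     */
    static boolean bagsHoldTuples(Field f) {
        if (f.kind == Kind.BAG && !f.children.isEmpty()) {
            boolean singleTuple = f.children.size() == 1
                    && f.children.get(0).kind == Kind.TUPLE;
            if (!singleTuple) {
                return false;
            }
        }
        for (Field child : f.children) {
            if (!bagsHoldTuples(child)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Models B: bag{T: tuple(I: int)}
        Field ok = new Field("B", Kind.BAG,
                new Field("T", Kind.TUPLE, new Field("I", Kind.INT)));
        // Models col2: bag{age: int}
        Field bad = new Field("col2", Kind.BAG, new Field("age", Kind.INT));

        System.out.println(bagsHoldTuples(ok));   // prints true
        System.out.println(bagsHoldTuples(bad));  // prints false
    }
}
```

The inner tuple's alias plays no part in the check, which mirrors the point above that the name is optional because the tuple is never referenced directly.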

  
> discrepancy in bags representation
> ----------------------------------
>
>                 Key: PIG-349
>                 URL: https://issues.apache.org/jira/browse/PIG-349
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>            Assignee: Santhosh Srinivasan
>            Priority: Critical
>             Fix For: types_branch
>
>         Attachments: bag_schema.patch
>
>
> Currently, when I describe a bag in the AS clause of the load statement, I can place a bag
> of integers there. However, when describing constant bags, I can only create a bag of tuples
> that contain integers.
> My understanding is that at this time we only support bags of tuples. If that's the case,
> the AS clause needs to match that.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

