hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tyson Condie (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-299) Filter operator not included in the main predecessor plan structure
Date Wed, 09 Jul 2008 23:33:31 GMT
Filter operator not included in the main predecessor plan structure
-------------------------------------------------------------------

                 Key: PIG-299
                 URL: https://issues.apache.org/jira/browse/PIG-299
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
         Environment: N/A
            Reporter: Tyson Condie
            Priority: Blocker


Take the following query, which can be found in TestLogicalPlanBuilder.java method testQuery80();
a = load 'input1' as (name, age, gpa);
b = filter a by age < '20';");
c = group b by (name,age);
d = foreach c {
            cf = filter b by gpa < '3.0';
            cp = cf.gpa;
            cd = distinct cp;
            co = order cd by gpa;
            generate group, flatten(co);
            };

The filter statement 'cf = filter b by gpa < '3.0'' is not accessible via the LogicalPlan::getPredecessor
method. Here is the explan plan print out of the inner foreach plan:

|---SORT Test-Plan-Builder-17 Schema: {gpa: bytearray} Type: bag
    |   |
    |   Project Test-Plan-Builder-16 Projections: [0] Overloaded: false FieldSchema: gpa:
bytearray cn: 2 Type: bytearray
    |   Input: Distinct Test-Plan-Builder-1
    |
    |---Distinct Test-Plan-Builder-15 Schema: {gpa: bytearray} Type: bag
        |
        |---Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa:
bytearray cn: 2 Type: bytearray
            Input: Project Test-Plan-Builder-13 Projections:  [*]  Overloaded: false|
            |---Project Test-Plan-Builder-13 Projections:  [*]  Overloaded: false FieldSchema:
cf: tuple({name: bytearray,age: bytearray,gpa: bytearray}) Type: tuple
                Input: Filter Test-Plan-Builder-12OPERATOR PROJECT SCHEMA {name: bytearray,age:
bytearray,gpa: bytearray}

As you can see the filter is only accessible via the LOProject::getExpression() method. It
is not showing up as an input operator. Focus on the projection immediately following the
filter. If I remove this projection then I get a correct plan. For example, let the inner
foreach plan be as follows:

d = foreach c {
            cf = filter b by gpa < '3.0';
            cd = distinct cf;
            co = order cd by gpa;
            generate group, flatten(co);
            };

Then I get the following (correct) explan plan output.


|---SORT Test-Plan-Builder-15 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type:
bag
    |   |
    |   Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa:
bytearray cn: 2 Type: bytearray
    |   Input: Distinct Test-Plan-Builder-1
    |
    |---Distinct Test-Plan-Builder-13 Schema: {name: bytearray,age: bytearray,gpa: bytearray}
Type: bag
        |
        |---Filter Test-Plan-Builder-12 Schema: {name: bytearray,age: bytearray,gpa: bytearray}
Type: bag
            |   |
            |   LesserThan Test-Plan-Builder-11 FieldSchema: null Type: Unknown
            |   |
            |   |---Project Test-Plan-Builder-9 Projections: [2] Overloaded: false FieldSchema:
 Type: Unknown
            |   |   Input: CoGroup Test-Plan-Builder-7
            |   |
            |   |---Const Test-Plan-Builder-10 FieldSchema: chararray Type: chararray
            |
            |---Project Test-Plan-Builder-8 Projections: [1] Overloaded: false FieldSchema:
b: bag({name: bytearray,age: bytearray,gpa: bytearray}) Type: bag
                Input: CoGroup Test-Plan-Builder-7OPERATOR PROJECT SCHEMA {name: bytearray,age:
bytearray,gpa: bytearray}


Alan said that the problem is we don't generate a foreach operator for the 'cp = cf.gpa' statement.
Please let me know if this can be resolved.

Thanks,
Tyson



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message