hadoop-pig-dev mailing list archives

From "Santhosh Srinivasan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-158) Rework logical plan
Date Wed, 30 Apr 2008 18:08:56 GMT

    [ https://issues.apache.org/jira/browse/PIG-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593393#action_12593393 ]

Santhosh Srinivasan commented on PIG-158:
-----------------------------------------

1. The existing parser allows you to specify an arbitrary number of grouping columns for each
cogroup input. Semantically, this has no meaning. For example, the script below, which groups
a on two columns and b on one, is accepted by the parser but has no practical meaning. As such,
it will be disallowed in the future.

{code}
a = load 'input1' as (name, age);
b = load 'input2' as (name, height);
c = cogroup a by (name, age), b by name;
{code}
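The check that would reject such a script could look roughly like the sketch below. It assumes the intended rule is that every input must be grouped on the same number of columns. The class CogroupArityCheck and its method are hypothetical, written only for illustration, and use plain strings in place of Pig's expression operators; this is not the parser's actual code.

```java
import java.util.List;

// Hypothetical helper, not part of Pig: checks that every cogroup input
// is grouped on the same number of columns.
final class CogroupArityCheck {
    private CogroupArityCheck() {}

    // Each inner list holds the grouping column names for one input,
    // e.g. [[name, age], [name]] for the script above.
    static boolean sameArity(List<List<String>> groupByCols) {
        if (groupByCols.isEmpty()) {
            return true;
        }
        int expected = groupByCols.get(0).size();
        for (List<String> cols : groupByCols) {
            if (cols.size() != expected) {
                return false; // mismatch, as in "a by (name, age), b by name"
            }
        }
        return true;
    }
}
```

Under this assumption, the script above would fail the check because a contributes two grouping columns and b only one.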

2. I will explain it briefly. This will change in my next patch, which has nested plans.

LOCogroup is modeled as (List<LogicalOperator> inputs,
MultiMap<LogicalOperator, ExpressionOperator> groupByCols, boolean[] inner). During parsing, the
list of expression operators is computed. Since each input has its own list of expression
operators, we need a list of lists of expression operators, which corresponds to:

{code}
ArrayList<ArrayList<ExpressionOperator>> specs = new ArrayList<ArrayList<ExpressionOperator>>();
{code}
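To illustrate how a list of lists collected during parsing can line up with the per-input mapping, here is a minimal sketch. It uses plain strings as stand-ins for LogicalOperator and ExpressionOperator and a java.util.Map of lists as a stand-in for MultiMap. The class GroupBySpecs and its method are hypothetical, and the sketch assumes that specs.get(i) belongs to inputs.get(i) and that input names are distinct; it is not the code in the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Pig code: pairs each input with the
// grouping expressions that were parsed for it.
final class GroupBySpecs {
    private GroupBySpecs() {}

    // Assumption: specs.get(i) holds the grouping expressions for inputs.get(i),
    // and no input name appears twice.
    static Map<String, List<String>> toGroupByCols(List<String> inputs,
                                                   List<? extends List<String>> specs) {
        if (inputs.size() != specs.size()) {
            throw new IllegalArgumentException("need one spec list per input");
        }
        // LinkedHashMap keeps the inputs in the order they appear in the script.
        Map<String, List<String>> groupByCols = new LinkedHashMap<>();
        for (int i = 0; i < inputs.size(); i++) {
            groupByCols.put(inputs.get(i), new ArrayList<>(specs.get(i)));
        }
        return groupByCols;
    }
}
```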

As for the second part, about how it is being used: I have commented it out. The diff shows it
as an addition, but it is not used.

3. Thanks for pointing it out. You have caught a bug there. I will fix that. It should be:

{code}
ExpressionOperator column = new LOProject(lp, new OperatorKey(scope, getNextId()),
        gis.get(i).op, (i+1));
{code}

4. The cogroup should be connected as the input for foreach. I will fix that.

5. The return type should be DataType.INTEGER. The LOUserFunc wraps the GFAny.getName() with
no arguments.

General observation: I need to remove the commented-out lines.

> Rework logical plan
> -------------------
>
>                 Key: PIG-158
>                 URL: https://issues.apache.org/jira/browse/PIG-158
>             Project: Pig
>          Issue Type: Sub-task
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: logical_operators.patch, logical_operators_rev_1.patch, logical_operators_rev_2.patch,
>                      logical_operators_rev_3.patch, parser_changes.patch, parser_changes_v1.patch, parser_changes_v2.patch,
>                      ParserErrors.txt, visitorWalker.patch
>
>
> Rework the logical plan in line with http://wiki.apache.org/pig/PigExecutionModel

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

