hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
Date Thu, 14 Jan 2010 00:52:54 GMT

    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800054#action_12800054 ]

Alan Gates commented on PIG-1178:

bq. in ProjectExpression, is it better to change the object variable "input" from "int" to
"LogicalRelationalOperator" to point to the operator that the project expression operates
on directly?
I want to avoid references to the actual relational operators in the expressions because that
makes patching up the plans after a transformation much easier.  If each project keeps a reference
to the relational operator, then whenever the plan is transformed we have to visit every project
and change its reference.  By storing only the input number in each project, we don't have to
make any changes in the projects after a transformation in the plan.
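
To illustrate the point, here is a minimal sketch rather than the code from the patch. The class
names RelOp and ProjectByIndex, and their fields and methods, are hypothetical stand-ins and not
Pig's actual classes. The project stores only an input number and resolves the operator through
its enclosing operator, so rewiring the plan requires no change to the project itself.

```java
import java.util.ArrayList;
import java.util.List;

// Small demo: the same project object stays valid before and after a plan rewrite.
class ProjectIndexSketch {
    public static void main(String[] args) {
        RelOp load = new RelOp("load");
        RelOp filter = new RelOp("filter");
        filter.inputs.add(load);

        // The project refers to "input 0, column 2" of whatever operator encloses it.
        ProjectByIndex project = new ProjectByIndex(0, 2);
        System.out.println(project.resolve(filter).name);

        // A transformation inserts a new operator between load and filter.
        RelOp inserted = new RelOp("inserted");
        inserted.inputs.add(load);
        filter.inputs.set(0, inserted);

        // The project was not touched, yet it now resolves to the new input.
        System.out.println(project.resolve(filter).name);
    }
}

// Hypothetical stand-in for a relational operator with an ordered list of inputs.
class RelOp {
    final String name;
    final List<RelOp> inputs = new ArrayList<>();

    RelOp(String name) {
        this.name = name;
    }
}

// Hypothetical project expression that holds an input number instead of an operator reference.
class ProjectByIndex {
    final int input;   // position of the input in the enclosing operator's input list
    final int column;  // column within that input

    ProjectByIndex(int input, int column) {
        this.input = input;
        this.column = column;
    }

    // Look up the referenced operator lazily, through the enclosing operator.
    RelOp resolve(RelOp enclosing) {
        return enclosing.inputs.get(input);
    }
}
```

Had ProjectByIndex held a RelOp field instead, the transformation step would also have needed to
find every project under the filter and reassign that field.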

bq. And I don't understand why this operator needs the alias it references. But if we change input
to an operator object, the alias can be obtained from the operator.
You're right, we shouldn't double store aliases here.  We should just use the uid and the
project reference.  I'll make the change.

bq. I don't know the purpose of ColumnExpression. Is it to capture operands? It doesn't seem
to have any special features. So I am not sure if it is necessary
It's a super class for all expressions that represent a single value:  projection, constants,
and eventually, tuple and map dereferences.  I think it's useful for understanding the categorization
of expressions.  I'm not sure it adds any functionality.
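
As a rough sketch of that categorization (the class names below are hypothetical and simplified,
not the ones in the patch), a marker superclass can group the single-value expressions without
adding any behavior of its own:

```java
import java.util.Arrays;
import java.util.List;

// Small demo: the marker superclass only answers "is this a single-value expression?".
class ColumnCategorySketch {
    static boolean isColumn(Expr e) {
        return e instanceof ColumnExpr;
    }

    public static void main(String[] args) {
        List<Expr> exprs = Arrays.asList(
                new ConstExpr(5),
                new ProjExpr(0, 1),
                new BinaryExpr(new ConstExpr(1), new ProjExpr(0, 0)));
        for (Expr e : exprs) {
            System.out.println(e.getClass().getSimpleName() + " column: " + isColumn(e));
        }
    }
}

// Hypothetical root of the expression hierarchy.
abstract class Expr {
}

// Marker superclass: declares nothing, exists purely to categorize its subclasses.
abstract class ColumnExpr extends Expr {
}

// A constant is a single value.
class ConstExpr extends ColumnExpr {
    final Object value;

    ConstExpr(Object value) {
        this.value = value;
    }
}

// A projection is a single value, identified here by input number and column.
class ProjExpr extends ColumnExpr {
    final int input;
    final int column;

    ProjExpr(int input, int column) {
        this.input = input;
        this.column = column;
    }
}

// An operator over two operands; in this sketch it is not categorized as a column expression.
class BinaryExpr extends Expr {
    final Expr lhs;
    final Expr rhs;

    BinaryExpr(Expr lhs, Expr rhs) {
        this.lhs = lhs;
        this.rhs = rhs;
    }
}
```

Tuple and map dereferences would, under the same assumption, become further subclasses of the
marker class.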

> LogicalPlan and Optimizer are too complex and hard to work with
> ---------------------------------------------------------------
>                 Key: PIG-1178
>                 URL: https://issues.apache.org/jira/browse/PIG-1178
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Alan Gates
>            Assignee: Ying He
>         Attachments: expressions.patch, lp.patch, PIG_1178.patch
> The current implementation of the logical plan and the logical optimizer in Pig has proven
> not to be easily extensible. Developer feedback has indicated that adding new rules to the
> optimizer is quite burdensome. In addition, the logical plan has been an area of numerous
> bugs, many of which have been difficult to fix. Developers also feel that the logical plan
> is difficult to understand and maintain. The root cause of these issues is that a number
> of design decisions made as part of the 0.2 rewrite of the front end have now proven
> to be sub-optimal. The heart of this proposal is to revisit a number of those decisions and
> rebuild the logical plan with a simpler design that will make it much easier to maintain the
> logical plan as well as extend the logical optimizer.
> See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full details.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
