hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
Date Mon, 11 Jan 2010 21:39:54 GMT

    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798872#action_12798872 ]

Alan Gates commented on PIG-1178:
---------------------------------

bq. 1. Why do we need a pos argument in PlanEdge? What's the use case for that?
One issue we saw frequently with the current implementation of OperatorPlan is that, for
nodes with multiple inputs (or outputs), a transformation that disconnected one of those
inputs and connected a new one often changed the order of the inputs (that is, what had
been plan.getPredecessors(op).get(0) became plan.getPredecessors(op).get(1)).  The ability
to connect a PlanEdge as a particular input or output is meant to address this.
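
To make the ordering problem concrete, here is a minimal sketch. It is not the
PlanEdge or OperatorPlan code from lp.patch; the class PositionalPlan, its
string operator names, and both connect() overloads are hypothetical stand-ins.
It only shows how an append-only connect reorders inputs and how a positional
connect avoids that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a plan; operators are plain strings for brevity.
class PositionalPlan {
    // Ordered list of inputs (predecessors) for each operator.
    private final Map<String, List<String>> preds = new HashMap<>();

    // Append `from` as the last input of `to` (append-only behaviour).
    void connect(String from, String to) {
        preds.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    // Insert `from` as input number `pos` of `to` (positional behaviour).
    void connect(String from, String to, int pos) {
        preds.computeIfAbsent(to, k -> new ArrayList<>()).add(pos, from);
    }

    // Remove the edge and return the position it occupied, or -1 if absent.
    int disconnect(String from, String to) {
        List<String> inputs = preds.get(to);
        if (inputs == null) {
            return -1;
        }
        int pos = inputs.indexOf(from);
        if (pos >= 0) {
            inputs.remove(pos);
        }
        return pos;
    }

    List<String> getPredecessors(String op) {
        return preds.getOrDefault(op, new ArrayList<>());
    }

    public static void main(String[] args) {
        // Append-only replacement: the surviving input changes position.
        PositionalPlan a = new PositionalPlan();
        a.connect("loadA", "join");
        a.connect("loadB", "join");
        a.disconnect("loadA", "join");
        a.connect("filterA", "join");
        System.out.println(a.getPredecessors("join")); // [loadB, filterA]

        // Positional replacement: the new input takes the old slot.
        PositionalPlan b = new PositionalPlan();
        b.connect("loadA", "join");
        b.connect("loadB", "join");
        int pos = b.disconnect("loadA", "join");
        b.connect("filterA", "join", pos);
        System.out.println(b.getPredecessors("join")); // [filterA, loadB]
    }
}
```

For an operator such as a join, where the first and second inputs are not
interchangeable, the second form keeps the plan meaningful after the swap.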

bq. 2. Where will the relational operator methods go, such as getRequiredFields, getProjectionMap,
getRelevantInputs, and pruneColumns? Are we going to solve them using uid?
They should go away.  Patching up a plan after the transform will be the responsibility of
the PlanTransformListeners.  The hypothesis is that schema plus uid will be sufficient for
these to do their jobs.

pruneColumns is a special case, but again I think that schema plus uid will be sufficient.
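
As a rough illustration of why schema plus uid might be enough, the sketch
below resolves fields by uid instead of by column position. Everything in it
(UidSchemaSketch, Field, columnOf, columnsToKeep) is a hypothetical name
invented for this example; it is not taken from the patch, and it assumes only
that each field carries a uid that stays the same across transforms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are Pig APIs.
class UidSchemaSketch {
    // A field keeps its uid across transforms; alias and position may change.
    static final class Field {
        final long uid;
        final String alias;

        Field(long uid, String alias) {
            this.uid = uid;
            this.alias = alias;
        }
    }

    // Current column index of the field with this uid, or -1 if it is gone.
    static int columnOf(List<Field> schema, long uid) {
        for (int i = 0; i < schema.size(); i++) {
            if (schema.get(i).uid == uid) {
                return i;
            }
        }
        return -1;
    }

    // Column indexes to keep, given the uids that are required downstream.
    static List<Integer> columnsToKeep(List<Field> schema, Set<Long> requiredUids) {
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < schema.size(); i++) {
            if (requiredUids.contains(schema.get(i).uid)) {
                keep.add(i);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        List<Field> before = Arrays.asList(
                new Field(1, "name"), new Field(2, "age"), new Field(3, "city"));
        // Assume some transform reordered the columns; the uids are unchanged.
        List<Field> after = Arrays.asList(
                new Field(3, "city"), new Field(1, "name"), new Field(2, "age"));

        System.out.println(columnOf(before, 2)); // 1
        System.out.println(columnOf(after, 2));  // 2

        Set<Long> required = new HashSet<>(Arrays.asList(1L, 3L));
        System.out.println(columnsToKeep(after, required)); // [0, 1]
    }
}
```

A listener written this way would not need per-operator methods to tell it how
columns moved; it would look each field up again by uid after the transform.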

bq. 3. What is the functional division between Rule.match() and PatternMatchOperatorPlan.check()?
Can we wrap both pieces of logic in one class (Rule) rather than two? Keeping PatternMatchOperatorPlan
simple seems clearer for rule writers.
I'll leave this one for Ying.

> LogicalPlan and Optimizer are too complex and hard to work with
> ---------------------------------------------------------------
>
>                 Key: PIG-1178
>                 URL: https://issues.apache.org/jira/browse/PIG-1178
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Alan Gates
>         Attachments: lp.patch
>
>
> The current implementation of the logical plan and the logical optimizer in Pig has proven
> difficult to extend. Developer feedback has indicated that adding new rules to the
> optimizer is quite burdensome. In addition, the logical plan has been an area of numerous
> bugs, many of which have been difficult to fix. Developers also feel that the logical plan
> is difficult to understand and maintain. The root cause of these issues is that a number
> of design decisions made as part of the 0.2 rewrite of the front end have now proven
> to be sub-optimal. The heart of this proposal is to revisit a number of those decisions and
> rebuild the logical plan with a simpler design that will make it much easier to maintain the
> logical plan as well as extend the logical optimizer.
> See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full details.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

