pig-dev mailing list archives

From "Viraj Bhat (JIRA)" <j...@apache.org>
Subject [jira] Reopened: (PIG-859) Optimizer throw error on self-joins
Date Tue, 09 Nov 2010 01:50:13 GMT

     [ https://issues.apache.org/jira/browse/PIG-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat reopened PIG-859:
----------------------------


Hi Olga,
 Based on the dfs.pig use case, we need to support this syntax. Requiring the user to write
two load statements for a self-join is non-intuitive.
 If you believe this support is not required, we need to document that a self-join
requires two load statements.
Regards,
Viraj

> Optimizer throw error on self-joins
> -----------------------------------
>
>                 Key: PIG-859
>                 URL: https://issues.apache.org/jira/browse/PIG-859
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.3.0
>            Reporter: Ashutosh Chauhan
>             Fix For: 0.9.0
>
>
> A self-join causes the optimizer to throw an exception. Consider the following query:
> {code}
> grunt> A = load 'a';
> grunt> B = Join A by $0, A by $0;
> grunt> explain B;
> 2009-06-20 15:51:38,303 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1094: Attempt to insert between two nodes that were not connected.
> Details at logfile: pig_1245538027026.log
> {code}
> Relevant stack-trace from log-file:
> {code}
> Caused by: org.apache.pig.impl.plan.optimizer.OptimizerException: ERROR 2047: Internal error. Unable to introduce split operators.
>         at org.apache.pig.impl.logicalLayer.optimizer.ImplicitSplitInserter.transform(ImplicitSplitInserter.java:163)
>         at org.apache.pig.impl.logicalLayer.optimizer.LogicalOptimizer.optimize(LogicalOptimizer.java:163)
>         at org.apache.pig.PigServer.compileLp(PigServer.java:844)
>         at org.apache.pig.PigServer.compileLp(PigServer.java:781)
>         at org.apache.pig.PigServer.getStorePlan(PigServer.java:723)
>         at org.apache.pig.PigServer.explain(PigServer.java:566)
>         ... 8 more
> Caused by: org.apache.pig.impl.plan.PlanException: ERROR 1094: Attempt to insert between two nodes that were not connected.
>         at org.apache.pig.impl.plan.OperatorPlan.doInsertBetween(OperatorPlan.java:500)
>         at org.apache.pig.impl.plan.OperatorPlan.insertBetween(OperatorPlan.java:480)
>         at org.apache.pig.impl.logicalLayer.optimizer.ImplicitSplitInserter.transform(ImplicitSplitInserter.java:139)
>         ... 13 more
> {code}
> A possible workaround is:
> {code}
> grunt> A = load 'a';
> grunt> B = load 'a';
> grunt> C = join A by $0, B by $0;
> grunt> explain C;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

