hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shravan Matthur Narayanamurthy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-370) dump with split does not work
Date Fri, 22 Aug 2008 13:24:44 GMT

     [ https://issues.apache.org/jira/browse/PIG-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Shravan Matthur Narayanamurthy updated PIG-370:
-----------------------------------------------

    Attachment: 370-1.patch

Fixed the issue of dump and store not working. Two parts:
1) Fixed the openIterator method to use the store() codepath instead of its own to execute.
2) Because of some debug calls using getSchema method in LOSplitoutput which used an object
without checking for null, there was a NPE. Fixed it to return a null schema if there were
no predecessors of LOSplitOutput.

Also included is a test case to test the various split + dump + registerQuery(Store) + PigServer.store
combinations 

> dump with split does not work
> -----------------------------
>
>                 Key: PIG-370
>                 URL: https://issues.apache.org/jira/browse/PIG-370
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Shravan Matthur Narayanamurthy
>            Assignee: Shravan Matthur Narayanamurthy
>             Fix For: types_branch
>
>         Attachments: 370-1.patch, 370.patch
>
>
> The following script doesn't work:
> A = load 'file:/etc/passwd' using PigStorage(':');
> split A into A1 if $2<25, A2 if $2>25;
> dump A1;
> The error dump is:
> 2008-08-08 17:03:06,888 [main] WARN  org.apache.pig.PigServer - bytearray is implicitly
casted to integer under LOGreaterThan Operator
> 2008-08-08 17:03:06,889 [main] WARN  org.apache.pig.PigServer - bytearray is implicitly
casted to integer under LOLesserThan Operator
> 2008-08-08 17:03:06,970 [main] ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt
to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore
multiple inputs.  This operator does not support multiple inputs.
> 2008-08-08 17:03:06,973 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException:
Unable to open iterator for alias: A1 [Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore
multiple inputs.  This operator does not support multiple inputs.]
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:158)
> 	at org.apache.pig.PigServer.execute(PigServer.java:519)
> 	at org.apache.pig.PigServer.openIterator(PigServer.java:307)
> 	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:258)
> 	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:175)
> 	at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:92)
> 	at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
> 	at org.apache.pig.Main.main(Main.java:278)
> Caused by: org.apache.pig.backend.executionengine.ExecException: Attempt to give operator
of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore
multiple inputs.  This operator does not support multiple inputs.
> 	... 8 more
> Caused by: org.apache.pig.impl.plan.PlanException: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore
multiple inputs.  This operator does not support multiple inputs.
> 	at org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:169)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:81)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:1)
> 	at org.apache.pig.impl.plan.OperatorPlan.addAsLeaf(OperatorPlan.java:395)
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:142)
> 	... 7 more
> 2008-08-08 17:03:06,973 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Unable
to open iterator for alias: A1 [Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore
multiple inputs.  This operator does not support multiple inputs.]
> 2008-08-08 17:03:06,973 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException:
Unable to open iterator for alias: A1 [Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore
multiple inputs.  This operator does not support multiple inputs.]
> The issue is that PigServer.OpenIterator doesn't send the logical plan corresponding
to A1 but sends the plan with both splitOutputs. I have a fix for this and will attach it.
We have two parallel branches which do nearly the same thing. OpenIterator & Store. However
they have individual code paths and we see the same script with a store not raise a plan exception.
For now I will just copy some code. But we need to fix this in a better way. Otherwise we
will leave some bugs that we fix for store in dump.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message