hadoop-pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1145) [zebra] merge join on large table ( 100,000.000 rows zebra table) failed
Date Fri, 11 Dec 2009 16:37:20 GMT

    [ https://issues.apache.org/jira/browse/PIG-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789376#action_12789376
] 

Hadoop QA commented on PIG-1145:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12427696/PIG-1145.patch
  against trunk revision 889346.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 2 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit
warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/116/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/116/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/116/console

This message is automatically generated.

> [zebra] merge join on large table ( 100,000.000 rows zebra table) failed
> ------------------------------------------------------------------------
>
>                 Key: PIG-1145
>                 URL: https://issues.apache.org/jira/browse/PIG-1145
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0, 0.7.0
>            Reporter: Jing Huang
>            Assignee: Yan Zhou
>             Fix For: 0.6.0, 0.7.0
>
>         Attachments: PIG-1145.patch, PIG-1145.patch, PIG-1145.patch
>
>
> Pig script :
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/unsorted1' USING org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/unsorted2' USING org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/sorted11' using org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/sorted21' using org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/sorted11' using org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/sorted21' using org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> --E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as str2;
> store joina into '$outputDir/join1' using org.apache.hadoop.zebra.pig.TableStorer('');
> ~                                                                                   
                                                                           
> ~                                                                                   
                                                                           
> ~                                                  
> ======
> stacktrace:
> org.apache.pig.backend.executionengine.ExecException: ERROR 2176: Error processing right
input during merge join at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.throwProcessingException(POMergeJoin.java:453)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNextRightInp(POMergeJoin.java:443)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:337)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at org.apache.hadoop.mapred.Child.main(Child.java:159)
Caused by: java.io.EOFException: No key-value to read at org.apache.hadoop.zebra.tfile.TFile$Reader$Scanner.checkKey(TFile.java:1590)
at org.apache.hadoop.zebra.tfile.TFile$Reader$Scanner.entry(TFile.java:1611) at org.apache.hadoop.zebra.io.ColumnGroup$Reader$TFileScanner.getKey(ColumnGroup.java:854)
at org.apache.hadoop.zebra.io.ColumnGroup$Reader$CGScanner.getCGKey(ColumnGroup.java:1035)
at org.apache.hadoop.zebra.io.BasicTable$Reader$BTScanner.getKey(BasicTable.java:1082) at
org.apache.hadoop.zebra.mapred.TableRecordReader.next(TableRecordReader.java:105) at org.apache.hadoop.zebra.pig.TableLoader.getNext(TableLoader.java:414)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNextRightInp(POMergeJoin.java:415)
... 7 more 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message