pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Olga Natkovich (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2391) Bzip_2 test is broken
Date Tue, 06 Dec 2011 22:04:40 GMT

    [ https://issues.apache.org/jira/browse/PIG-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163897#comment-13163897
] 

Olga Natkovich commented on PIG-2391:
-------------------------------------

Hi Xuting,

Could you explain what is causing this regression? It is not obvious to me what the fix is
doing. Also, what would happen if another store function like BinStorage is used?

Thanks
                
> Bzip_2 test is broken
> ---------------------
>
>                 Key: PIG-2391
>                 URL: https://issues.apache.org/jira/browse/PIG-2391
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.10
>            Reporter: Olga Natkovich
>            Assignee: xuting zhao
>             Fix For: 0.10, 0.11
>
>         Attachments: PIG-2391.patch
>
>
> This test is currently commented out but if you uncomment it it fails with Pig 10 but
runs successfully with Pig 9.
> Script:
> a = load '/homes/olgan/studenttab10k' using PigStorage() as (name, age, gpa);
> store a into 'intermediate.bz';
> b = load 'intermediate.bz';
> store b into 'final.bz';
> A couple of observations:
> (1) Identical script (represented by Bzip_1 test) that has bz2 instead of bz extension
in the script succeeds in Pig 10
> (2) The problem occurs while reading intermediate.bz which has different size with Pig
9 and Pig 10
> (3) Problem can be reproduced in local mode with small subset of data in the file
> (4) The following stack trace is observed:
> 2011-12-01 13:53:12,280 [Thread-22] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0002
> java.lang.RuntimeException: java.io.IOException: compressedStream EOF
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:237)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.<init>(PigRecordReader.java:109)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:119)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: java.io.IOException: compressedStream EOF
>         at org.apache.tools.bzip2r.CBZip2InputStream.cadvise(CBZip2InputStream.java:92)
>         at org.apache.tools.bzip2r.CBZip2InputStream.compressedStreamEOF(CBZip2InputStream.java:96)
>         at org.apache.tools.bzip2r.CBZip2InputStream.bsR(CBZip2InputStream.java:451)
>         at org.apache.tools.bzip2r.CBZip2InputStream.initBlock(CBZip2InputStream.java:348)
>         at org.apache.tools.bzip2r.CBZip2InputStream.<init>(CBZip2InputStream.java:220)
>         at org.apache.pig.bzip2r.Bzip2TextInputFormat$BZip2LineRecordReader.<init>(Bzip2TextInputFormat.java:105)
>         at org.apache.pig.bzip2r.Bzip2TextInputFormat.createRecordReader(Bzip2TextInputFormat.java:244)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:227)
>         ... 5 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message