pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-3480) TFile-based tmpfile compression crashes in some cases
Date Tue, 24 Sep 2013 18:36:05 GMT

    [ https://issues.apache.org/jira/browse/PIG-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776602#comment-13776602 ]

Dmitriy V. Ryaboy commented on PIG-3480:

For most of the tasks that fail, no stack trace is available on Hadoop 1 (they just die with
"nonzero status 134").
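
A note on reading that status: if it follows the usual shell convention of 128 + signal number (an assumption; the thread does not confirm how Hadoop 1 derives it), 134 would correspond to signal 6, SIGABRT, which is what a JVM typically raises when it aborts. A minimal sketch of that decoding, where the helper name and the small signal table are illustrative only:

```python
# Common POSIX signal numbers on Linux (illustrative subset, not exhaustive).
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def decode_exit_status(status: int) -> str:
    """Interpret an exit status, assuming the shell's 128 + N convention."""
    if 128 < status < 256:
        signum = status - 128
        name = SIGNAL_NAMES.get(signum, "unknown signal")
        return f"terminated by signal {signum} ({name})"
    return f"exited with code {status}"


print(decode_exit_status(134))  # terminated by signal 6 (SIGABRT)
```

If that reading holds, the missing Java stack traces would be consistent with the process aborting outside normal Java exception handling, though that remains a guess.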

I did catch one task with a stack trace:
java.io.IOException: Error while reading compressed data
    at org.apache.hadoop.io.IOUtils.wrappedReadForCompressedData(IOUtils.java:205)
    at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:342)
    at org.apache.hadoop.mapred.IFile$Reader.rejigData(IFile.java:373)
    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:357)
    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:389)
    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:420)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
    at org.apache.hadoop.mapred.MapTask.run(Map

I don't know whether this trace is relevant to the other failures.

The failure is consistent: the script I use to reproduce it fails 100% of the time.
Anecdotally, about 1 in 10 of our production scripts hit it; I have not been able to
establish a pattern yet.
> TFile-based tmpfile compression crashes in some cases
> -----------------------------------------------------
>                 Key: PIG-3480
>                 URL: https://issues.apache.org/jira/browse/PIG-3480
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>             Fix For: 0.12
> When Pig tmpfile compression is on, some jobs fail inside core Hadoop internals.
> Suspect TFile is the problem, because an experiment in replacing TFile with SequenceFile

