pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-3480) TFile-based tmpfile compression crashes in some cases
Date Tue, 24 Sep 2013 18:36:05 GMT

    [ https://issues.apache.org/jira/browse/PIG-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776602#comment-13776602 ]

Dmitriy V. Ryaboy commented on PIG-3480:
----------------------------------------

For most of the tasks that fail, no stack trace is available on Hadoop 1 (they just die with
"nonzero status 134").

I did catch one task with a stack trace:
{code}
java.io.IOException: Error while reading compressed data
    at org.apache.hadoop.io.IOUtils.wrappedReadForCompressedData(IOUtils.java:205)
    at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:342)
    at org.apache.hadoop.mapred.IFile$Reader.rejigData(IFile.java:373)
    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:357)
    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:389)
    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:420)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
    at org.apache.hadoop.mapred.MapTask.run(Map
{code}

No idea if this is relevant.

The failure is consistent: the script that reproduces it fails 100% of the time.
Anecdotally, about 1/10 of our production scripts encounter it; I have not been able to
establish a pattern yet.
                
> TFile-based tmpfile compression crashes in some cases
> -----------------------------------------------------
>
>                 Key: PIG-3480
>                 URL: https://issues.apache.org/jira/browse/PIG-3480
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>             Fix For: 0.12
>
>
> When pig tmpfile compression is on, some jobs fail inside core hadoop internals.
> Suspect TFile is the problem, because an experiment in replacing TFile with SequenceFile
> succeeded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
