hadoop-hdfs-issues mailing list archives

From "Daniel Pol (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12860) TeraSort failed on erasure coding directory
Date Thu, 30 Nov 2017 14:54:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272756#comment-16272756 ]

Daniel Pol commented on HDFS-12860:
-----------------------------------

I've also seen that occasionally, though not lately. I see an issue with TeraValidate more often.
When it happens, it's usually at "map 0% reduce 0%" or "map 100% reduce 0%". Simply rerunning
the job sometimes succeeds. My overall failure rate is 30% (out of 30 runs). I have DEBUG log
level enabled, but there's nothing relevant in the logs before or after this shows up.
17/11/30 04:19:55 INFO mapreduce.Job:  map 0% reduce 0%
17/11/30 04:20:01 INFO mapreduce.Job: Task Id : attempt_1512036058655_0003_m_000002_0, Status : FAILED
Error: java.lang.NullPointerException
        at org.apache.hadoop.io.erasurecode.rawcoder.XORRawDecoder.doDecode(XORRawDecoder.java:83)
        at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:106)
        at org.apache.hadoop.io.erasurecode.rawcoder.RawErasureDecoder.decode(RawErasureDecoder.java:170)
        at org.apache.hadoop.hdfs.StripeReader.decodeAndFillBuffer(StripeReader.java:423)
        at org.apache.hadoop.hdfs.StatefulStripeReader.decode(StatefulStripeReader.java:94)
        at org.apache.hadoop.hdfs.StripeReader.readStripe(StripeReader.java:382)
        at org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:318)
        at org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:563)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:794)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)

> TeraSort failed on erasure coding directory
> -------------------------------------------
>
>                 Key: HDFS-12860
>                 URL: https://issues.apache.org/jira/browse/HDFS-12860
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Lei (Eddy) Xu
>
> Running terasort on a cluster with 8 datanodes, 256GB data, using RS-3-2-1024k.
> The test data was generated by {{teragen}} with 32 mappers.
> The terasort benchmark fails with the following stack trace:
> {code}
> 17/11/27 14:44:31 INFO mapreduce.Job:  map 45% reduce 0%
> 17/11/27 14:44:33 INFO mapreduce.Job: Task Id : attempt_1510080297865_0160_m_000008_0, Status : FAILED
> Error: java.lang.IllegalArgumentException
> 	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
> 	at org.apache.hadoop.hdfs.util.StripedBlockUtil$VerticalRange.<init>(StripedBlockUtil.java:701)
> 	at org.apache.hadoop.hdfs.util.StripedBlockUtil.getRangesForInternalBlocks(StripedBlockUtil.java:442)
> 	at org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(StripedBlockUtil.java:311)
> 	at org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:308)
> 	at org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:391)
> 	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:813)
> 	at java.io.DataInputStream.read(DataInputStream.java:149)
> 	at org.apache.hadoop.examples.terasort.TeraInputFormat$TeraRecordReader.nextKeyValue(TeraInputFormat.java:257)
> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:562)
> 	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> 	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}
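
For context on the policy named in the description: RS-3-2-1024k denotes Reed-Solomon coding with 3 data units, 2 parity units, and a 1024 KiB cell. The sketch below is only a rough illustration, assuming the usual round-robin striped layout, of how a logical offset within a block group maps to a stripe, a data unit, and an offset inside that unit's internal block. The class and method names are hypothetical and are not part of Hadoop; the real range computation in StripedBlockUtil covers partial stripes and other cases omitted here.

```java
/**
 * Hypothetical illustration (not Hadoop code) of the striping arithmetic
 * implied by the RS-3-2-1024k erasure coding policy.
 */
public final class StripeLayoutSketch {
    static final int DATA_UNITS = 3;               // the "3" in RS-3-2-1024k
    static final int PARITY_UNITS = 2;             // the "2" in RS-3-2-1024k
    static final long CELL_SIZE = 1024L * 1024L;   // the "1024k" cell, in bytes

    /** Bytes of user data carried by one full stripe (parity excluded). */
    static long stripeDataSize() {
        return DATA_UNITS * CELL_SIZE;
    }

    /** Index of the stripe that contains the given block-group offset. */
    static long stripeIndex(long offsetInBlockGroup) {
        return offsetInBlockGroup / stripeDataSize();
    }

    /** Index (0 .. DATA_UNITS-1) of the data unit holding the offset. */
    static int dataUnitIndex(long offsetInBlockGroup) {
        return (int) ((offsetInBlockGroup / CELL_SIZE) % DATA_UNITS);
    }

    /** Offset of the byte inside that data unit's internal block. */
    static long offsetInInternalBlock(long offsetInBlockGroup) {
        return stripeIndex(offsetInBlockGroup) * CELL_SIZE
                + offsetInBlockGroup % CELL_SIZE;
    }

    /** Raw bytes stored per byte of user data for full stripes. */
    static double storageOverhead() {
        return (double) (DATA_UNITS + PARITY_UNITS) / DATA_UNITS;
    }

    public static void main(String[] args) {
        // Example: 100 bytes into the eighth cell of a block group.
        long offset = 7L * CELL_SIZE + 100;
        System.out.println("stripe        = " + stripeIndex(offset));
        System.out.println("data unit     = " + dataUnitIndex(offset));
        System.out.println("offset in blk = " + offsetInInternalBlock(offset));
        System.out.println("overhead      = " + storageOverhead());
    }
}
```

Under these assumptions, a read that reaches a missing or unreadable data unit has to reconstruct it from the remaining units of the same stripe, which is the decode path that appears in the NullPointerException trace in the comment above.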



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

