hadoop-mapreduce-issues mailing list archives

From "Rushabh S Shah (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.
Date Wed, 06 Apr 2016 20:43:25 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229066#comment-15229066 ]

Rushabh S Shah commented on MAPREDUCE-6633:
-------------------------------------------

Ran the failed JUnit test on both jdk7 and jdk8.
It passed on both on my machine.
{noformat}
Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.54 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.tools.TestCLI
testGetJob(org.apache.hadoop.mapreduce.tools.TestCLI)  Time elapsed: 0.084 sec  <<< FAILURE!
java.lang.AssertionError: null
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertTrue(Assert.java:52)
	at org.apache.hadoop.mapreduce.tools.TestCLI.testGetJob(TestCLI.java:181)
{noformat}

> AM should retry map attempts if the reduce task encounters compression related errors.
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6633
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.7.2
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: MAPREDUCE-6633.patch
>
>
> When a reduce task encounters compression-related errors, the AM doesn't retry the corresponding map task.
> In one of the cases we encountered, here is the stack trace.
> {noformat}
> 2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#29
> 	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> 	at com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
> 	at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
> 	at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
> 	at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
> 	at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
> 	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
> 	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
> 	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
> {noformat}
> In this case, the node on which the map task ran had a bad drive.
> If the AM had retried running that map task somewhere else, the job definitely would have succeeded.
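
The behaviour requested above can be illustrated with a small, self-contained sketch. This is not the attached patch and does not use Hadoop classes: `MapOutputReader`, `FailureReporter`, `copyMapOutput` and the attempt ID string are hypothetical names invented for illustration. The idea shown is only that an unchecked exception thrown while decompressing a map output (such as the ArrayIndexOutOfBoundsException in the trace) is handled like a failed fetch and reported against the map attempt, rather than being allowed to fail the reduce task.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ShuffleRetrySketch {

    /** Hypothetical stand-in for reading and decompressing one map output. */
    public interface MapOutputReader {
        void read(String mapAttemptId) throws IOException;
    }

    /** Hypothetical stand-in for the channel that reports fetch failures to the AM. */
    public static class FailureReporter {
        public final List<String> failedMaps = new ArrayList<>();

        public void reportFetchFailure(String mapAttemptId) {
            failedMaps.add(mapAttemptId);
        }
    }

    /**
     * Copies one map output. Returns true on success. On an I/O error or an
     * unchecked exception from the reader, records a fetch failure for the
     * map attempt and returns false instead of propagating the exception.
     */
    public static boolean copyMapOutput(String mapAttemptId,
                                        MapOutputReader reader,
                                        FailureReporter reporter) {
        try {
            reader.read(mapAttemptId);
            return true;
        } catch (IOException | RuntimeException e) {
            // Assumption: corrupt map output may surface as an unchecked
            // exception from the codec, so it is treated like a failed fetch.
            reporter.reportFetchFailure(mapAttemptId);
            return false;
        }
    }

    public static void main(String[] args) {
        FailureReporter reporter = new FailureReporter();
        // Simulate a decompressor that fails on corrupt input.
        boolean copied = copyMapOutput("attempt_m_000001_0",
                id -> { throw new ArrayIndexOutOfBoundsException(); },
                reporter);
        System.out.println("copied=" + copied + " reported=" + reporter.failedMaps);
    }
}
```

In this sketch the reporter only collects attempt IDs. Whether and when the AM re-runs a map attempt after such reports is outside what the sketch models.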



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
