hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "rohithsharma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-403) Node Manager throws java.io.IOException: Verification of the hashReply failed
Date Tue, 30 Jul 2013 18:05:49 GMT

    [ https://issues.apache.org/jira/browse/YARN-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724186#comment-13724186
] 

rohithsharma commented on YARN-403:
-----------------------------------

Hash verification is one authentication process at Reducer/NM for Http request/response i.e
fetching map output from NodeManager's. 

One specific scenario I have observed above exception is 
     bq. When app master is killed after map phase is completed. 2nd app master attempt start
reducer phase which intern reducers request for fetching map output. 

In the above scenario, 
1. 1st attemp AM has SecretKey.This key has sent to Node Manager during start of container(Map
Task). Node Manager stores in memory i.e *ShuffleHandler.secretManager* which will be removed
only after application is finished. 

After Map Task is completed, app master is killed.

2. When 2nd attempt app master started, new SecretKey is generated. This Secretkey is sent
along with startContainer request to NodeManager and Reducer JVM is started by NM. Fetcher
at reducer encrypt request url using SecretKey and sends fetch request to Node Manager where
Map Task has run (Old SecretKey is present). 
At NodeManager , hash verification is verified with old 1st attempt app master SecretKey.
This verification fails since keys are different.
                
> Node Manager throws java.io.IOException: Verification of the hashReply failed
> -----------------------------------------------------------------------------
>
>                 Key: YARN-403
>                 URL: https://issues.apache.org/jira/browse/YARN-403
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.2-alpha, 0.23.6
>            Reporter: Devaraj K
>            Assignee: Omkar Vinit Joshi
>
> {code:xml}
> 2013-02-09 22:59:47,490 WARN org.apache.hadoop.mapred.ShuffleHandler: Shuffle failure

> java.io.IOException: Verification of the hashReply failed
> 	at org.apache.hadoop.mapreduce.security.SecureShuffleUtils.verifyReply(SecureShuffleUtils.java:98)
> 	at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.verifyRequest(ShuffleHandler.java:436)
> 	at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:383)
> 	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
> 	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
> 	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
> 	at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:148)
> 	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
> 	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
> 	at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116)
> 	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
> 	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
> 	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
> 	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
> 	at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:522)
> 	at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:506)
> 	at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443)
> 	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
> 	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
> 	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
> 	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
> 	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
> 	at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
> 	at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
> 	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
> 	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> 	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:662)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message