hadoop-common-user mailing list archives

From Mike Wenzel <mwen...@proheris.de>
Subject Mapreduce Job fails if one Node is offline?
Date Fri, 21 Oct 2016 09:28:20 GMT
I have a small cluster for testing and learning Hadoop:

Node1 - Namenode + ResourceManager + JobhistoryServer
Node2 - SecondaryNamenode
Node3 - Datanode + NodeManager
Node4 - Datanode + NodeManager
Node5 - Datanode + NodeManager

My dfs.replication is set to 2.

When I kill the DataNode and NodeManager processes on Node5, I expect Hadoop to keep running
and to finish my MapReduce jobs successfully.
In reality the job fails because it tries to transfer blocks to Node5, which is offline. Replication
is set to 2, so I expect Hadoop to notice that Node5 is offline and to work only with the other
two nodes (Node3 and Node4).
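
To make my expectation concrete, here is a minimal sketch in Python. It is only a toy model of my reasoning, not Hadoop's actual block placement logic: the helper name surviving_replicas is made up for illustration, and it assumes every block has exactly 2 replicas on distinct DataNodes.

```python
from itertools import combinations

# Assumption: the three DataNodes from my cluster and dfs.replication = 2.
DATANODES = ["Node3", "Node4", "Node5"]
REPLICATION = 2


def surviving_replicas(placement, offline):
    """Return the nodes of one block placement that are still online."""
    return [node for node in placement if node not in offline]


# Every possible way to place 2 replicas of a block on 3 DataNodes.
placements = list(combinations(DATANODES, REPLICATION))
offline = {"Node5"}

for placement in placements:
    print(placement, "->", surviving_replicas(placement, offline))

# With Node5 offline, does every possible placement keep at least one replica?
all_readable = all(surviving_replicas(p, offline) for p in placements)

# Are there still enough live DataNodes to hold 2 replicas of a new block?
live_nodes = [node for node in DATANODES if node not in offline]
can_place_new_blocks = len(live_nodes) >= REPLICATION

print("all blocks readable:", all_readable)
print("new blocks placeable:", can_place_new_blocks)
```

In this model every block keeps at least one replica on Node3 or Node4, and two live DataNodes remain for new writes, which is why I expect the job to succeed.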

Can someone please explain how Hadoop is supposed to behave in this case?
If my expectation is correct and someone is willing to help me, I can provide logs
and configuration.

Best Regards,
