hadoop-mapreduce-user mailing list archives

From Ja Sam <ptrstp...@gmail.com>
Subject Hadoop doesn't work after restart
Date Wed, 24 Jun 2015 09:24:54 GMT
I had a running Hadoop cluster (version from Hortonworks).
Yesterday a lot of things happened, and at some point we decided to
reboot all datanodes one by one. Unfortunately the operator did not watch
the namenode health monitor.

The result of the above operation is that all datanodes show as dead nodes,
all blocks are lost, ... .

We rebooted one datanode once again to see whether it would log anything
interesting. The log ended with these messages:

INFO  ipc.Server (Server.java:run(861)) - IPC Server Responder: starting
INFO  ipc.Server (Server.java:run(688)) - IPC Server listener on 8010: starting

and hangs there. At the same time, on the namenode I can see only two types of
messages:

INFO  hdfs.StateChange (FSNamesystem.java:completeFile(2805)) - DIR*
completeFile: [SOME PATH] is closed by

and a lot of:

WARN  blockmanagement.BlockManager
(PendingReplicationBlocks.java:pendingReplicationCheck(249)) -
PendingReplicationMonitor timed out blk_1074405820_668233
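
To back up the claim that only two kinds of messages appear, the namenode log could be tallied by level, logger and method. This is a minimal sketch, assuming each entry sits on one line in the format quoted above (the mail client wrapped the lines here); the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Matches entries such as:
#   INFO  hdfs.StateChange (FSNamesystem.java:completeFile(2805)) - DIR* ...
# Captures the level, the logger name, the source file and the method.
# re.search is used so that a leading timestamp, if present, is skipped.
ENTRY_RE = re.compile(
    r"\b(DEBUG|INFO|WARN|ERROR|FATAL)\s+(\S+)\s+\(([^:()]+):(\w+)\(\d+\)\)"
)


def summarize(lines):
    """Count log entries per (level, logger, method).

    Lines that do not look like the start of an entry (stack traces,
    wrapped continuations) are ignored.
    """
    counts = Counter()
    for line in lines:
        match = ENTRY_RE.search(line)
        if match:
            level, logger, _source_file, method = match.groups()
            counts[(level, logger, method)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (log path is whatever your installation uses):
    #   python summarize_log.py < namenode.log
    for key, n in summarize(sys.stdin).most_common():
        print(n, *key)
```

Grouping by method rather than by full message text keeps entries that differ only in block ID or path in the same bucket.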

Today we decided to restart the namenode and all datanodes. After the restart
the website http://[server]:50070/dfshealth.jsp answers VERY slowly. I don't see
any errors in the logs except 5 like the one below:

 ERROR datanode.DataNode (DataXceiver.java:run(225)) -
maelhd21:50010:DataXceiver error processing WRITE_BLOCK operation
src: /node1:33470 dest: /node3:50010

org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block
BP-1037132819- already
exists in state FINALIZED and thus cannot be created.

3 out of 5 nodes show as live, but refreshing the Hadoop status page takes
more than 10 minutes.
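
As an alternative to the status page, the live and dead counts can probably be read from the namenode's JMX servlet. This is a minimal sketch, assuming the /jmx endpoint is served on the same port 50070 and that the FSNamesystemState bean exposes NumLiveDataNodes and NumDeadDataNodes; neither has been verified on this cluster, and the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumed bean name; check the raw /jmx output if it is not found.
BEAN = "Hadoop:service=NameNode,name=FSNamesystemState"


def fetch_jmx(host, port=50070, timeout=60):
    """Download the JMX document for the assumed bean from the namenode."""
    url = f"http://{host}:{port}/jmx?qry={BEAN}"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def datanode_counts(jmx_payload):
    """Return (live, dead) datanode counts from a parsed /jmx document."""
    for bean in jmx_payload.get("beans", []):
        if bean.get("name") == BEAN:
            return bean["NumLiveDataNodes"], bean["NumDeadDataNodes"]
    raise KeyError("FSNamesystemState bean not found in JMX output")


if __name__ == "__main__":
    import sys

    # Usage: python dn_counts.py <namenode-host>
    live, dead = datanode_counts(fetch_jmx(sys.argv[1]))
    print(f"live={live} dead={dead}")
```

If the namenode itself is overloaded, this request may be just as slow as the web page; the generous timeout is there for that reason.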

The question of course is: what should I check or do now?

P.S. I asked the same question on StackOverflow:
