hadoop-common-user mailing list archives

From Kevin Eppinger <keppin...@adknowledge.com>
Subject Hadoop data nodes failing to start
Date Tue, 07 Apr 2009 18:04:41 GMT
Hello everyone-

So I have a 5-node cluster that I've been running for a few weeks with no problems.  Today
I decided to add nodes and double its size to 10.  After doing all the setup and starting
the cluster, I discovered that four of the 10 nodes had failed to start up.  Specifically,
the data nodes didn't start.  The task trackers seemed to start fine.  Thinking it was something
I did incorrectly with the expansion, I reverted to the 5-node configuration, but
I'm experiencing the same problem, with only 2 of 5 nodes starting correctly.  Here is what
I'm seeing in the hadoop-*-datanode*.log files:

2009-04-07 12:35:40,628 INFO org.apache.hadoop.dfs.DataNode: Starting Periodic block scanner.
2009-04-07 12:35:45,548 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 9269 blocks got processed in 1128 msecs
2009-04-07 12:35:45,584 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(,
1.244-50010-1238604807366, infoPort=50075, ipcPort=50020):DataXceiveServer: Exiting due to:java.nio.channels.ClosedSelectorException
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:66)
        at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:88)
        at sun.nio.ch.Util.releaseTemporarySelector(Util.java:135)
        at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:120)
        at org.apache.hadoop.dfs.DataNode$DataXceiveServer.run(DataNode.java:997)
        at java.lang.Thread.run(Thread.java:619)

After this the data node shuts down.  This same message is appearing on all the failed nodes.
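
For anyone wanting to run the same check across nodes, here is a minimal sketch (plain Python, not part of Hadoop) of how a datanode log could be scanned for this exit. The function name `find_xceiver_exits` and the `SAMPLE` text are hypothetical; the matched string "DataXceiveServer: Exiting due to:" is copied from the excerpt above, and the sketch assumes each log entry starts with a timestamp in the format shown there.

```python
import re

# Timestamp prefix as it appears in the log excerpt above.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")
# The exit message from the excerpt, capturing the exception class name.
EXIT = re.compile(r"DataXceiveServer: Exiting due to:\s*(\S+)")


def find_xceiver_exits(log_text):
    """Return (timestamp, exception) pairs for each DataXceiveServer exit.

    The timestamp is the most recent one seen at the start of a line,
    so an entry that wraps onto a second line is still attributed
    to the right log record.
    """
    exits = []
    last_timestamp = None
    for line in log_text.splitlines():
        ts = TIMESTAMP.match(line)
        if ts:
            last_timestamp = ts.group(1)
        hit = EXIT.search(line)
        if hit:
            exits.append((last_timestamp, hit.group(1)))
    return exits


# Hypothetical sample built from lines in the excerpt above.
SAMPLE = """\
2009-04-07 12:35:40,628 INFO org.apache.hadoop.dfs.DataNode: Starting Periodic block scanner.
2009-04-07 12:35:45,584 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(,
1.244-50010-1238604807366, infoPort=50075, ipcPort=50020):DataXceiveServer: Exiting due to:java.nio.channels.ClosedSelectorException
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:66)
"""

if __name__ == "__main__":
    for timestamp, exception in find_xceiver_exits(SAMPLE):
        print(timestamp, exception)
```

To use it on a real node, the contents of a hadoop-*-datanode*.log file would be read and passed in as `log_text`; comparing the timestamps across nodes may show whether the failures happen at the same point in startup.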

