hadoop-common-user mailing list archives

From "Alex Loddengaard" <a...@cloudera.com>
Subject Re: namenode failure
Date Tue, 28 Oct 2008 10:42:35 GMT
Manually killing a process can leave only part of your data written to disk,
while data still queued for writing is lost. That is most likely what
corrupted your name node. Start by reading about bin/hadoop fsck:

<http://hadoop.apache.org/core/docs/current/commands_manual.html#fsck>
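
As a rough sketch of what that first check might look like (not a guaranteed
fix), the script below asks the name node whether it is still in safe mode and
then runs fsck over the whole namespace. HADOOP_BIN and check_hdfs are names I
made up for illustration, and I am assuming you run it from the Hadoop install
directory so that bin/hadoop resolves.

```shell
#!/bin/sh
# Sketch only: HADOOP_BIN and check_hdfs are illustrative names, not part of Hadoop.
# Assumes the current directory is the Hadoop install directory unless HADOOP_BIN is set.
HADOOP_BIN="${HADOOP_BIN:-bin/hadoop}"

check_hdfs() {
    # Report whether the name node is currently in safe mode.
    "$HADOOP_BIN" dfsadmin -safemode get
    # Check the whole namespace, starting at the root path, for missing or corrupt blocks.
    "$HADOOP_BIN" fsck /
}
```

Call check_hdfs once the name node process is up. Both commands only report
state; neither one changes anything in the filesystem.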

Alex

On Mon, Oct 27, 2008 at 5:36 PM, Songting Chen <ken_cst1998@yahoo.com> wrote:

> Hi,
>  I modified the classpath in hadoop-env.sh on the namenode and datanodes
> before shutting down the cluster. Then a problem appeared: I cannot stop the
> hadoop cluster at all. stop-all.sh reports no datanode/namenode, while all
> the java processes are still running.
>  So I manually killed the java processes. Now the namenode seems to be
> corrupted and always stays in Safe mode, while the datanodes complain with
> the following weird error:
>
> 2008-10-27 17:28:44,141 FATAL org.apache.hadoop.dfs.DataNode: Incompatible
> build versions: namenode BV = ; datanode BV = 694836
> 2008-10-27 17:28:44,244 ERROR org.apache.hadoop.dfs.DataNode:
> java.io.IOException: Incompatible build versions: namenode BV = ; datanode
> BV = 694836
>        at org.apache.hadoop.dfs.DataNode.handshake(DataNode.java:403)
>        at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:250)
>        at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:190)
>        at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2987)
>        at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2942)
>        at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2950)
>        at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3072)
>
>  My question is how to recover from such a failure. I guess the correct
> practice for changing the CLASSPATH is to shut down the cluster, apply the
> change, and restart the cluster.
>
> Thanks,
> -Songting
>
