hadoop-hdfs-issues mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-107) Data-nodes should be formatted when the name-node is formatted.
Date Tue, 20 Oct 2009 22:27:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767980#action_12767980 ]

Ashutosh Chauhan commented on HDFS-107:
---------------------------------------

I saw this issue on our small 6-node cluster too. It took a while to identify the root cause
of the problem. The symptoms were the same as described here. In our case we have both 18 and 20
installed on our cluster, but we only run 20. A user saw the HDFS exception for their job, so they
stopped 20, tried to go back to 18 and start it, and then switched back to 20 again. In the
process, the version files of the datanodes and the namenode got messed up, and the DNs and the NN
ended up with different sets of information in their version files. Apart from this peculiar use
case, as things currently stand in HDFS, I think even one small misstep in upgrading the cluster
can result in this bug, as reported in previous comments. I think that at cluster startup the
namenode and datanodes should also exchange the information contained in their version files and,
in case of a mismatch, reconcile the differences, potentially asking for user input when the
choices are not safe to make automatically.
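
To make the startup check suggested above concrete, here is a minimal sketch in Python. It is
not how HDFS implements anything. It assumes that a version file is a plain list of key=value
lines and that namespaceID and cTime are the fields worth comparing. The function names and
the sample values are made up for illustration.

```python
def parse_version_text(text):
    """Parse key=value lines, skipping blank lines and '#' comment lines."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '=' at all
            fields[key.strip()] = value.strip()
    return fields


def find_mismatches(nn_fields, dn_fields, keys=("namespaceID", "cTime")):
    """Return {key: (namenode_value, datanode_value)} for every key that differs."""
    return {
        key: (nn_fields.get(key), dn_fields.get(key))
        for key in keys
        if nn_fields.get(key) != dn_fields.get(key)
    }


# Hypothetical file contents, for illustration only.
NN_VERSION = "namespaceID=1001\ncTime=0\n"
DN_VERSION = "namespaceID=2002\ncTime=0\n"

mismatches = find_mismatches(
    parse_version_text(NN_VERSION), parse_version_text(DN_VERSION)
)
for key, (nn_value, dn_value) in mismatches.items():
    print(f"{key}: name-node={nn_value} data-node={dn_value}")
```

The sketch only reports differences. Deciding which side is correct is left to the operator,
in line with the point above about asking for user input when a choice is not safe to make.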

There are a few workarounds suggested in previous comments. Which one of these is the
recommended one?


> Data-nodes should be formatted when the name-node is formatted.
> ---------------------------------------------------------------
>
>                 Key: HDFS-107
>                 URL: https://issues.apache.org/jira/browse/HDFS-107
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Konstantin Shvachko
>
> The upgrade feature HADOOP-702 requires data-nodes to store persistently the namespaceID
> in their version files and verify during startup that it matches the one stored on the
> name-node.
> When the name-node reformats it generates a new namespaceID.
> Now if the cluster starts with the reformatted name-node, and not reformatted data-nodes
> the data-nodes will fail with
> java.io.IOException: Incompatible namespaceIDs ...
> Data-nodes should be reformatted whenever the name-node is. I see 2 approaches here:
> 1) In order to reformat the cluster we call "start-dfs -format" or make a special script
> "format-dfs".
> This would format the cluster components all together. The question is whether it should
> start the cluster after formatting?
> 2) Format the name-node only. When data-nodes connect to the name-node it will tell them to
> format their storage directories if it sees that the namespace is empty and its cTime=0.
> The drawback of this approach is that we can lose blocks of a data-node from another cluster
> if it connects by mistake to the empty name-node.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

