hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-107) Data-nodes should be formatted when the name-node is formatted.
Date Mon, 13 Jun 2011 22:55:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048852#comment-13048852 ]

Todd Lipcon commented on HDFS-107:
----------------------------------

I think adding another config here is unnecessary. What's the downside of adding a "-format"
flag to the datanode, and having "start-dfs -format" pass it along?

> Data-nodes should be formatted when the name-node is formatted.
> ---------------------------------------------------------------
>
>                 Key: HDFS-107
>                 URL: https://issues.apache.org/jira/browse/HDFS-107
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.23.0
>            Reporter: Konstantin Shvachko
>         Attachments: HDFS-107-1.patch
>
>
> The upgrade feature HADOOP-702 requires data-nodes to store persistently the namespaceID
> in their version files and verify during startup that it matches the one stored on the
> name-node.
> When the name-node reformats it generates a new namespaceID.
> Now if the cluster starts with the reformatted name-node, and not reformatted data-nodes
> the data-nodes will fail with
> java.io.IOException: Incompatible namespaceIDs ...
> Data-nodes should be reformatted whenever the name-node is. I see 2 approaches here:
> 1) In order to reformat the cluster we call "start-dfs -format" or make a special script
> "format-dfs".
> This would format the cluster components all together. The question is whether it should
> start the cluster after formatting?
> 2) Format the name-node only. When data-nodes connect to the name-node it will tell them
> to format their storage directories if it sees that the namespace is empty and its cTime=0.
> The drawback of this approach is that we can lose blocks of a data-node from another
> cluster if it connects by mistake to the empty name-node.
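
For readers following the thread, the startup check and the decision rule in approach 2 can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual HDFS code: the class NamespaceCheckSketch, the type StoredVersion, and the methods verifyNamespace and shouldRequestFormat are hypothetical names, and the exception text after "Incompatible namespaceIDs" is made up for the sketch (the real message is elided in the description above).

```java
import java.io.IOException;

// Illustrative sketch only; none of these names come from the Hadoop code base.
final class NamespaceCheckSketch {

    /** Hypothetical stand-in for what a data-node keeps in its version file. */
    static final class StoredVersion {
        final int namespaceID;

        StoredVersion(int namespaceID) {
            this.namespaceID = namespaceID;
        }
    }

    /**
     * Startup check described in the issue: the data-node compares its
     * persisted namespaceID with the one reported by the name-node and
     * refuses to start on a mismatch.
     */
    static void verifyNamespace(StoredVersion stored, int nameNodeNamespaceID)
            throws IOException {
        if (stored.namespaceID != nameNodeNamespaceID) {
            // Text after the first two words is an assumption for this sketch.
            throw new IOException("Incompatible namespaceIDs (sketch): name-node = "
                    + nameNodeNamespaceID + ", data-node = " + stored.namespaceID);
        }
    }

    /**
     * Approach 2 from the description: the name-node asks a connecting
     * data-node to format only if its own namespace is empty and cTime is 0.
     */
    static boolean shouldRequestFormat(boolean namespaceEmpty, long cTime) {
        return namespaceEmpty && cTime == 0L;
    }
}
```

With this sketch, a data-node whose stored ID is 42 passes verifyNamespace against a name-node reporting 42 and fails against one reporting any other value, which is the situation after a name-node reformat. Note that shouldRequestFormat cannot tell a stale data-node of the same cluster from a data-node of another cluster that connected by mistake, which is exactly the drawback raised for approach 2.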

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
