hadoop-common-user mailing list archives

From Konstantin Shvachko <...@yahoo-inc.com>
Subject Re: Inconsistency in namenode's and datanode's namespaceID
Date Thu, 03 Jul 2008 17:57:29 GMT
Yes, this is a known bug.
You should manually remove the "current" directory from every data-node
after reformatting the name-node, and then start the cluster again.
I do not believe there is any other way.
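
The manual cleanup described above could be scripted roughly as follows. This is only a sketch, not an official Hadoop tool: the function name remove_current_dir is hypothetical, and it assumes you run it on each data-node with that node's "dfs.data.dir" path while the datanode daemon is stopped. Note that it deletes everything under "current", including any block data stored there.

```python
import shutil
from pathlib import Path


def remove_current_dir(data_dir):
    """Delete <data_dir>/current, if present.

    data_dir is assumed to be the directory configured as "dfs.data.dir"
    on this data-node. Returns True if a directory was removed and False
    if there was nothing to remove.
    """
    current = Path(data_dir) / "current"
    if current.is_dir():
        # Removes the VERSION file and everything else stored under "current".
        shutil.rmtree(current)
        return True
    return False
```

Returning a boolean makes it easy to log which data-nodes actually had a stale "current" directory.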

Taeho Kang wrote:
> No, I don't think it's a bug.
> Your datanodes' data partition/directory was probably used in another HDFS
> setup and thus had a different namespaceID.
> Or you could have used a different partition/directory for your new HDFS setup by
> setting a different value for "dfs.data.dir" on your datanode. But in that
> case, you can't access your old HDFS's data.
> On Thu, Jul 3, 2008 at 4:21 AM, Xuan Dzung Doan <doanxuandung@yahoo.com>
> wrote:
>> I was following the quickstart guide to run pseudo-distributed operations
>> with Hadoop 0.16.4. I got it to work successfully the first time. But I
>> failed to repeat the steps (I tried to redo everything, starting from reformatting
>> the HDFS). Looking at the daemons' log files, I found that the
>> datanode failed to start because its namespaceID didn't match the
>> namenode's. I then found that the namespaceID is stored in the text
>> file VERSION under dfs/data/current and dfs/name/current for the datanode
>> and the namenode, respectively. The reformatting step changes the
>> namespaceID of the namenode but not that of the datanode, and that is the cause
>> of the inconsistency. So after reformatting, if I manually update the
>> namespaceID for the datanode, things work fine again.
>> I guess there are probably others who had this same experience. Is it a bug
>> in Hadoop 0.16.4? If so, has it been taken care of in later versions?
>> Thanks,
>> David.
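
The mismatch described in the quoted message can be checked before starting the daemons. The sketch below assumes the VERSION file is a Java-properties-style text file containing a line of the form namespaceID=<number>; the helper names read_namespace_id and namespace_ids_match are hypothetical, and the directory layout (dfs/name/current and dfs/data/current) is taken from the message above.

```python
from pathlib import Path


def read_namespace_id(version_file):
    """Return the namespaceID value from a VERSION file, or None if absent."""
    for line in Path(version_file).read_text().splitlines():
        line = line.strip()
        # Skip comment lines and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "namespaceID":
            return value.strip()
    return None


def namespace_ids_match(name_dir, data_dir):
    """Compare the namespaceID stored under the name and data directories."""
    name_id = read_namespace_id(Path(name_dir) / "current" / "VERSION")
    data_id = read_namespace_id(Path(data_dir) / "current" / "VERSION")
    return name_id is not None and name_id == data_id
```

For a pseudo-distributed setup, calling namespace_ids_match with the dfs/name and dfs/data directories would return False after the name-node has been reformatted but the data directory has been left untouched.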
