hadoop-hdfs-user mailing list archives

From Zach Cox <zcox...@gmail.com>
Subject Re: How to restart an HDFS standby namenode dead for a very long time
Date Fri, 15 Jul 2016 11:58:51 GMT
Yes, it's definitely possible we are hitting that JIRA. Do we need to do
anything other than rsync dfs.name.dir from the active namenode before
starting the standby namenode again?

Thanks,
Zach
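
For reference, the rsync step mentioned above could be sketched roughly as
follows. This is only an illustration, not a verified recovery procedure:
the host name nn1.example.com, the path /data/dfs/nn, the hdfs user, and the
helper name build_rsync_command are all hypothetical, and it assumes
dfs.name.dir points at the same path on both namenodes. The script only
builds and prints the command, so nothing is copied until it is run by hand
with the standby namenode stopped.

```python
import shlex


def build_rsync_command(active_host, name_dir, user="hdfs"):
    """Build the rsync argv that pulls dfs.name.dir from the active namenode.

    All arguments are placeholders for this sketch; substitute the real
    active namenode host and the configured dfs.name.dir value.
    """
    base = name_dir.rstrip("/")
    # Trailing slashes make rsync copy the directory contents, not the directory.
    source = f"{user}@{active_host}:{base}/"
    destination = f"{base}/"
    return [
        "rsync",
        "-a",                        # archive mode: keep permissions, owners, times
        "--delete",                  # remove stale files left on the standby
        "--exclude", "in_use.lock",  # lock file held by the running active namenode
        source,
        destination,
    ]


if __name__ == "__main__":
    # Dry run: print the command for review instead of executing it.
    command = build_rsync_command("nn1.example.com", "/data/dfs/nn")
    print(shlex.join(command))
```

Whether this copy alone is sufficient before restarting the standby is
exactly the open question here.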


On Fri, Jul 15, 2016 at 2:21 AM Brahma Reddy Battula <
brahmareddy.battula@huawei.com> wrote:

> It seems you are hitting the following JIRA. Please refer to:
>
>
>
> https://issues.apache.org/jira/browse/HDFS-9917
>
> --Brahma Reddy Battula
>
>
>
> *From:* Zach Cox [mailto:zcox522@gmail.com]
> *Sent:* 14 July 2016 03:34
> *To:* user@hadoop.apache.org
> *Subject:* How to restart an HDFS standby namenode dead for a very long
> time
>
>
>
> Hi - we have an HDFS (version 2.0.0-cdh4.4.0) cluster set up in HA with 2
> namenodes and 5 journal nodes. This cluster has been somewhat neglected
> (long story) and the standby namenode process has been dead for several
> months.
>
>
>
> Recently we tried to just start the standby namenode process again, but
> several hours later the entire HDFS cluster (and HBase on top of it) was
> unavailable for several hours. As soon as we stopped the standby namenode
> process, HDFS (and HBase) started working fine again. I don't know for
> sure, but I'm guessing the standby namenode was trying to catch up on
> several months of edits from being down for so long, and just couldn't do
> it.
>
>
>
> We really need to get this standby namenode process started again, so I'm
> trying to find the right way to do it. I've tried starting it with the
> -bootstrapStandby option, but that appears broken in our HDFS version.
> Instead, we can manually rsync the files in the dfs.name.dir from the
> active namenode.
>
>
>
> I guess my question is: is there a recommended way to get this standby
> namenode resurrected successfully? And would we need to do anything other
> than rsync dfs.name.dir from the active namenode before starting the
> standby namenode again?
>
>
>
> Thanks,
>
> Zach
>
>
>
