hadoop-general mailing list archives

From li ping <li.j...@gmail.com>
Subject Re: namenode doesn't start after reboot
Date Thu, 23 Dec 2010 12:45:47 GMT
As far as I know, setting up a backup namenode directory is enough.

I haven't used Hadoop in a production environment, so I can't tell you
what the right way to reboot the server would be.

On Thu, Dec 23, 2010 at 6:50 PM, Bjoern Schiessle <bjoern@schiessle.org> wrote:

> Hi,
> On Thu, 23 Dec 2010 09:30:17 +0800 li ping wrote:
> > It seems the exception occurs while the NameNode loads the editlog.
> > Make sure the editlog file exists, or debug the application to
> > see what's wrong.
> Last night I tried to fix the problem and made a big mistake. Instead of
> copying /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/edits and
> edits.new to a backup, I moved them and later deleted the only version
> because I thought I had a copy.
> The good thing: The namenode starts again.
> The bad thing: My file system is now in an inconsistent state.
> Probably the only solution is to reformat the HDFS and start from
> scratch. Thankfully there wasn't much data stored in the HDFS yet,
> but I definitely have to make sure that this doesn't happen again:
> 1. I have set up a second dfs.name.dir which is stored on another
> computer (mounted by sshfs)
> 2. I will install a backup script similar to:
> http://blog.milford.io/2010/10/simple-hadoop-namenode-backup-script
> Do you think this should be enough to overcome such situations in the
> future? Any additional ideas on how to make it safer?
> I'm still a little afraid when I think about the next time I have
> to reboot the server. Shouldn't a reboot safely stop and restart all
> Hadoop services? Is there anything I can do to make sure that the next
> reboot will not cause the same problems?
> Thanks a lot!
> Björn
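
The quoted message describes losing the edits files by moving them instead of copying them. As a rough illustration of a copy-only backup, here is a minimal Python sketch. It is not the script from the linked blog post; the function names and the timestamped directory layout are assumptions made for this example, and it assumes the namenode is stopped (or otherwise not writing) while the copy runs, since files that change mid-copy would fail the verification step.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_name_dir(current_dir: Path, backup_root: Path) -> Path:
    """Copy everything in current_dir into a new timestamped directory.

    The source files are only read, never moved or deleted.
    Raises RuntimeError if a copied file does not match its source.
    (Hypothetical helper; the directory naming scheme is an assumption.)
    """
    target = backup_root / time.strftime("namenode-%Y%m%d-%H%M%S")
    # copytree refuses to overwrite an existing directory, so an
    # earlier backup can never be clobbered by a later run.
    shutil.copytree(current_dir, target)
    for source_file in current_dir.rglob("*"):
        if source_file.is_file():
            copied = target / source_file.relative_to(current_dir)
            if sha256_of(source_file) != sha256_of(copied):
                raise RuntimeError(f"backup mismatch for {source_file}")
    return target
```

With the path from the message, a call might look like `backup_name_dir(Path("/var/lib/hadoop-0.20/cache/hadoop/dfs/name/current"), Path("/mnt/backup"))`, where `/mnt/backup` is a placeholder for whatever backup location is actually available. Because the originals are left in place and each copy is verified, deleting either side by mistake still leaves one intact version.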

