hadoop-general mailing list archives

From Bjoern Schiessle <bjo...@schiessle.org>
Subject Re: namenode doesn't start after reboot
Date Thu, 23 Dec 2010 10:50:04 GMT

On Thu, 23 Dec 2010 09:30:17 +0800 li ping wrote:
> It seems the exception occurs while the NameNode loads the edit log.
> Make sure the edit log file exists, or debug the application to
> see what's wrong.

Last night I tried to fix the problem and made a big mistake. Instead of
copying /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/edits and
edits.new to a backup, I moved them, and later deleted the only copy
because I thought I still had another one.

The good thing: The namenode starts again.
The bad thing: My file system is now in an inconsistent state.

Probably the only solution is to reformat HDFS and start from
scratch. Thankfully there wasn't much data stored in HDFS yet,
but I definitely have to make sure that this doesn't happen again:

1. I have set up a second dfs.name.dir, which is stored on another
computer (mounted via sshfs).
2. I will install a backup script similar to:
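
The original script is not preserved here. As a rough illustration only
(not the author's script), a periodic backup along these lines might look
like the sketch below. The function name, the backup location
/backup/hadoop-name and the retention count are assumptions made up for
the example; only the name directory path comes from this message. Note
that it copies and never moves the files, and that a copy taken while the
NameNode is writing may not be fully consistent.

```python
import shutil
import time
from pathlib import Path


def backup_name_dir(name_dir, backup_root, keep=7):
    """Copy the NameNode metadata directory into a timestamped folder.

    Returns the path of the new backup and prunes all but the newest
    `keep` backups. Hypothetical helper, not part of Hadoop.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    src = Path(name_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"name dir not found: {src}")

    root = Path(backup_root)
    root.mkdir(parents=True, exist_ok=True)

    # Timestamped names sort chronologically as plain strings.
    dest = root / ("name-" + time.strftime("%Y%m%d-%H%M%S"))
    shutil.copytree(src, dest)  # copy, never move; fails if dest exists

    # Remove the oldest backups beyond the retention count.
    backups = sorted(
        p for p in root.iterdir()
        if p.is_dir() and p.name.startswith("name-")
    )
    for old in backups[:-keep]:
        shutil.rmtree(old)
    return dest


if __name__ == "__main__":
    # Source path taken from this message; the target is an assumption.
    print(backup_name_dir(
        "/var/lib/hadoop-0.20/cache/hadoop/dfs/name",
        "/backup/hadoop-name",
    ))
```

Run from cron (for example once an hour), this would keep the last seven
copies of the name directory, including edits and edits.new.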

Do you think this will be enough to recover from such situations in the
future? Any additional ideas for making it safer?

I'm still a little afraid when I think about the next time I have
to reboot the server. Shouldn't a reboot safely stop and restart all
Hadoop services? Is there anything I can do to make sure that the next
reboot will not cause the same problems?

Thanks a lot!
