hadoop-general mailing list archives

From "Aaron T. Myers" <...@cloudera.com>
Subject Re: namenode doesn't start after reboot
Date Thu, 23 Dec 2010 17:15:41 GMT
All this aside, you really shouldn't have to "safely" stop all the Hadoop
services when you reboot any of your servers. Hadoop should be able to
survive a crash of any of the daemons. Any circumstance in which Hadoop
currently corrupts the edits log or fsimage is a serious bug, and should be
reported via JIRA.

--
Aaron T. Myers
Software Engineer, Cloudera



On Thu, Dec 23, 2010 at 7:29 AM, rahul patodi <patodirahul@gmail.com> wrote:

> Hi,
> If you want to reboot the server:
> 1. stop mapred
> 2. stop dfs
> 3. then reboot
> When you want to start Hadoop again, first start dfs, then mapred (the
> commands are sketched below).
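>
> With the stock 0.20 scripts that looks roughly like this (a minimal
> sketch, assuming $HADOOP_HOME points at your Hadoop install and the
> commands are run on the master as the hadoop user):
>
>   $HADOOP_HOME/bin/stop-mapred.sh   # stops the JobTracker and TaskTrackers
>   $HADOOP_HOME/bin/stop-dfs.sh      # stops the NameNode, SecondaryNameNode and DataNodes
>   sudo reboot
>   # once the machine is back up:
>   $HADOOP_HOME/bin/start-dfs.sh     # start HDFS first
>   $HADOOP_HOME/bin/start-mapred.sh  # then MapReduce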
>
> --
> *Regards*,
> Rahul Patodi
> Software Engineer,
> Impetus Infotech (India) Pvt Ltd,
> www.impetus.com
> Mob:09907074413
>
>
> On Thu, Dec 23, 2010 at 6:15 PM, li ping <li.j2ee@gmail.com> wrote:
>
> > As far as I know, setting up a backup namenode dir is enough.
> >
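> > For example (a minimal sketch of an hdfs-site.xml entry; the second
> > directory is a made-up path for illustration), dfs.name.dir takes a
> > comma-separated list, and the NameNode writes its fsimage and edit log
> > to every directory in it:
> >
> >   <property>
> >     <name>dfs.name.dir</name>
> >     <!-- local dir plus a backup dir, e.g. a remote mount; losing one copy is then recoverable -->
> >     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name,/mnt/nn-backup/dfs/name</value>
> >   </property>
> >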
> > I haven't used Hadoop in a production environment, so I can't tell you
> > what the right way to reboot the server would be.
> >
> > On Thu, Dec 23, 2010 at 6:50 PM, Bjoern Schiessle
> > <bjoern@schiessle.org> wrote:
> >
> > > Hi,
> > >
> > > On Thu, 23 Dec 2010 09:30:17 +0800 li ping wrote:
> > > > It seems the exception occurs while the NameNode loads the edit log.
> > > > Make sure the edit log file exists, or debug the application to see
> > > > what's wrong.
> > >
> > > Last night I tried to fix the problem and made a big mistake. Instead
> > > of copying /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/edits and
> > > edits.new to a backup, I moved them and later deleted the only version
> > > because I thought I had a copy.
> > >
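> > > (In hindsight, a safer sequence would have been to stop the NameNode
> > > and copy instead of move; a sketch, with a made-up backup path:
> > >
> > >   bin/hadoop-daemon.sh stop namenode
> > >   # cp -a copies recursively and preserves attributes; never mv the only copy
> > >   cp -a /var/lib/hadoop-0.20/cache/hadoop/dfs/name /root/name-backup-$(date +%F)
> > > )
> > >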
> > > The good thing: The namenode starts again.
> > > The bad thing: My file system is now in an inconsistent state.
> > >
> > > Probably the only solution is to reformat the HDFS and start from
> > > scratch. Thankfully there wasn't much data stored in HDFS yet, but I
> > > definitely have to make sure that this doesn't happen again:
> > >
> > > 1. I have set up a second dfs.name.dir which is stored on another
> > > computer (mounted via sshfs)
> > > 2. I will install a backup script (sketched below) similar to:
> > > http://blog.milford.io/2010/10/simple-hadoop-namenode-backup-script
> > >
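> > > That kind of script pulls the current image over the NameNode's HTTP
> > > interface; roughly like this (a sketch for a default 0.20 setup; host,
> > > port and backup path are assumptions):
> > >
> > >   #!/bin/sh
> > >   # fetch fsimage and edits from the NameNode's getimage servlet
> > >   BACKUP_DIR=/backup/namenode/$(date +%Y%m%d%H%M)
> > >   mkdir -p "$BACKUP_DIR"
> > >   curl -sf "http://localhost:50070/getimage?getimage=1" -o "$BACKUP_DIR/fsimage"
> > >   curl -sf "http://localhost:50070/getimage?getedit=1" -o "$BACKUP_DIR/edits"
> > >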
> > > Do you think this should be enough to handle such situations in the
> > > future? Any additional ideas on how to make it safer?
> > >
> > > I'm still a little afraid when I think about the next time I will have
> > > to reboot the server. Shouldn't a reboot safely stop and restart all
> > > Hadoop services? Is there anything I can do to make sure that the next
> > > reboot will not cause the same problems?
> > >
> > > Thanks a lot!
> > > Björn
> > >
> > >
> > >
> >
> >
> > --
> > -----李平
> >
>
