hadoop-common-user mailing list archives

From Meng Mao <meng...@gmail.com>
Subject Re: desperate question about NameNode startup sequence
Date Sat, 17 Dec 2011 21:04:39 GMT
Brock, thanks for moving this over. I wasn't aware there were new lists for
CDH.

How should I diagnose if our 2NN is working right now?

On Sat, Dec 17, 2011 at 4:00 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> The problem with the checkpoint/2NN is that it happily "runs" and gives no
> outward indication that it is unable to connect.
>
> Because you have a large edits file, your startup will complete, but
> at that size it could take hours. It logs nothing while this is going on,
> but as long as the CPU is busy, it is progressing.
>
> We have a nagios check on the size of this directory so if the edit
> rolling stops we know about it.
>
>
> On Saturday, December 17, 2011, Brock Noland <brock@cloudera.com> wrote:
> > Hi,
> >
> > Since you're using CDH2, I am moving this to CDH-USER. You can subscribe
> > here:
> >
> > http://groups.google.com/a/cloudera.org/group/cdh-user
> >
> > BCC'd common-user
> >
> > On Sat, Dec 17, 2011 at 2:01 AM, Meng Mao <mengmao@gmail.com> wrote:
> >> Maybe this is a bad sign -- the edits.new was created before the master
> >> node crashed, and is huge:
> >>
> >> -bash-3.2$ ls -lh /hadoop/hadoop-metadata/cache/dfs/name/current
> >> total 41G
> >> -rw-r--r-- 1 hadoop hadoop 3.8K Jan 27  2011 edits
> >> -rw-r--r-- 1 hadoop hadoop  39G Dec 17 00:44 edits.new
> >> -rw-r--r-- 1 hadoop hadoop 2.5G Jan 27  2011 fsimage
> >> -rw-r--r-- 1 hadoop hadoop    8 Jan 27  2011 fstime
> >> -rw-r--r-- 1 hadoop hadoop  101 Jan 27  2011 VERSION
> >>
> >> Could this mean something was up with our SecondaryNameNode and rolling
> >> the edits file?
> >
> > Yes, it looks like a checkpoint never completed. It's a good idea to
> > monitor the mtime on fsimage to ensure it never gets too old.
> >
> > Has a checkpoint completed since you restarted?
> >
> > Brock
> >
>
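
The two monitoring suggestions quoted above (alerting on the size of the
edits files and on the mtime of fsimage) could be combined into one
Nagios-style check. What follows is only a minimal sketch, not the check
anyone in this thread actually runs: the function name check_checkpoint and
both thresholds are hypothetical, and it assumes the directory layout shown
in the listing above (fsimage, edits, edits.new in one directory). The exit
codes follow the usual Nagios plugin convention.

```python
import os
import time

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_checkpoint(name_dir, max_edits_bytes, max_image_age_secs, now=None):
    """Return (status, message) for a NameNode 'current' directory.

    Both thresholds are placeholders; pick values that match how often
    checkpoints are expected to run on your cluster.
    """
    now = time.time() if now is None else now

    fsimage = os.path.join(name_dir, "fsimage")
    if not os.path.isfile(fsimage):
        return UNKNOWN, "fsimage not found in %s" % name_dir

    # Add up edits and edits.new, whichever of them exist.
    edits_bytes = 0
    for name in ("edits", "edits.new"):
        path = os.path.join(name_dir, name)
        if os.path.isfile(path):
            edits_bytes += os.path.getsize(path)

    # Age of the last written fsimage, in seconds.
    image_age = now - os.path.getmtime(fsimage)

    if edits_bytes > max_edits_bytes:
        return CRITICAL, "edits files total %d bytes" % edits_bytes
    if image_age > max_image_age_secs:
        return WARNING, "fsimage is %d seconds old" % image_age
    return OK, "edits %d bytes, fsimage %d seconds old" % (edits_bytes, image_age)
```

To use it as a plugin, a small wrapper would call
check_checkpoint("/hadoop/hadoop-metadata/cache/dfs/name/current", ...),
print the message, and exit with the returned status. Against the listing
above (39G edits.new, fsimage dated Jan 27), any sensible thresholds would
have flagged the problem long before the crash.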
