hadoop-common-user mailing list archives

From Jason Venner <jason.had...@gmail.com>
Subject Re: Secondary NameNodes or NFS exports?
Date Thu, 24 Dec 2009 01:07:21 GMT
I have no current solution.
When I can block a few days, I am going to instrument the code a bit more to
verify my understanding.

I believe the issue is that the timestamp is being checked against the
active edit log (the new one created when the checkpoint started) rather
than against the timestamp of the rolled (old) edit log.
As long as no transactions have come in, the two timestamps are the same.
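
To make the suspected failure mode concrete, here is a toy model of it. This
is only a sketch of the hypothesis above, not Hadoop's actual code: the class
and method names (ToyNameNode, startCheckpoint, acceptComparingActive,
acceptComparingRolled) are invented for illustration, and the integer "clock"
stands in for real file modification times.

```java
// Toy model of the hypothesis only; none of these names exist in Hadoop.
final class CheckpointTimestampSketch {

    /** Toy edit log: nothing but a last-modified stamp. */
    static final class EditLog {
        long lastModified;
        EditLog(long stamp) { this.lastModified = stamp; }
    }

    /** Toy namenode holding a rolled (old) and an active (new) edit log. */
    static final class ToyNameNode {
        private long clock = 100;          // stand-in for wall-clock time
        EditLog rolled;                    // log being merged by the secondary
        EditLog active = new EditLog(clock);

        /** Roll the log and return the stamp handed to the secondary. */
        long startCheckpoint() {
            rolled = active;
            // Modelling assumption: the new log starts with the same stamp,
            // so the two agree until a transaction arrives.
            active = new EditLog(rolled.lastModified);
            return rolled.lastModified;
        }

        /** A metadata change touches only the active log. */
        void recordTransaction() {
            active.lastModified = ++clock;
        }

        /** Suspected behaviour: compare against the active log. */
        boolean acceptComparingActive(long stamp) {
            return stamp == active.lastModified;
        }

        /** Expected behaviour: compare against the rolled log. */
        boolean acceptComparingRolled(long stamp) {
            return stamp == rolled.lastModified;
        }
    }

    public static void main(String[] args) {
        ToyNameNode quiet = new ToyNameNode();
        long quietStamp = quiet.startCheckpoint();
        System.out.println("quiet, vs active: " + quiet.acceptComparingActive(quietStamp));

        ToyNameNode busy = new ToyNameNode();
        long busyStamp = busy.startCheckpoint();
        busy.recordTransaction();          // a file is created mid-checkpoint
        System.out.println("busy, vs active:  " + busy.acceptComparingActive(busyStamp));
        System.out.println("busy, vs rolled:  " + busy.acceptComparingRolled(busyStamp));
    }
}
```

In this model the merged image is accepted on a quiet cluster either way, but
on a busy cluster it is only accepted when the comparison is made against the
rolled log, which matches what we see in production.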


On Wed, Dec 23, 2009 at 11:23 AM, Stas Oskin <stas.oskin@gmail.com> wrote:

> Hi.
>
> What was your solution to this then?
>
> Regards.
>
> On Sat, Dec 5, 2009 at 7:43 AM, Jason Venner <jason.hadoop@gmail.com>
> wrote:
>
> > I have dug into this more; it turns out the problem is unrelated to NFS
> > or Solaris.
> > The issue is that if there is a metadata change while the secondary is
> > rebuilding the fsimage, the rebuilt image is rejected.
> > On our production cluster, there is almost never a moment when a file is
> > not being created or altered, and as such the secondary is never able to
> > make a fresh fsimage for the cluster.
> >
> > I have checked this with several Hadoop variants and with vanilla
> > distributions, with the namenode, secondary and a datanode all running
> > on the same machine.
> >
> > On Tue, Oct 27, 2009 at 8:03 PM, Jason Venner <jason.hadoop@gmail.com>
> > wrote:
> >
> > > The namenode would never accept the rebuilt fsimage from the secondary,
> > > so the edit logs grew without bounds.
> > >
> > >
> > > On Tue, Oct 27, 2009 at 10:51 AM, Stas Oskin <stas.oskin@gmail.com>
> > > wrote:
> > >
> > >> Hi.
> > >>
> > >> You mean, you couldn't recover the NameNode from checkpoints because
> > >> of timestamps?
> > >>
> > >> Regards.
> > >>
> > >> On Tue, Oct 27, 2009 at 4:49 PM, Jason Venner <jason.hadoop@gmail.com>
> > >> wrote:
> > >>
> > >> > We have been having some trouble with the secondary on a cluster
> > >> > that has one edit log partition on an NFS server, with the namenode
> > >> > rejecting the merged images due to timestamp mismatches.
> > >> >
> > >> >
> > >> > On Mon, Oct 26, 2009 at 10:14 AM, Stas Oskin <stas.oskin@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi.
> > >> > >
> > >> > > Thanks for the advice; it seems that the initial approach of having
> > >> > > a single SecNameNode writing to exports is the way to go.
> > >> > >
> > >> > > By the way, I asked this already, but wanted to clarify:
> > >> > >
> > >> > > * Is it possible to set how often the SecNameNode checkpoints the
> > >> > > data (and what is the setting, by the way)?
> > >> > >
> > >> > > * It's possible to let the NameNode write to exports as well,
> > >> > > together with the local disk, which ensures the latest possible
> > >> > > metadata in case of a disk crash (compared to periodic
> > >> > > checkpointing), but it's going to slow down operations due to
> > >> > > network reads/writes.
> > >> > >
> > >> > > Thanks again.
> > >> > >
> > >> > > On Thu, Oct 22, 2009 at 10:03 PM, Patrick Angeles
> > >> > > <patrickangeles@gmail.com> wrote:
> > >> > >
> > >> > > > From what I understand, it's rather tricky to set up multiple
> > >> > > > secondary namenodes. In either case, running multiple secondary
> > >> > > > NNs doesn't get you much.
> > >> > > > See this thread:
> > >> > > > http://www.mail-archive.com/core-user@hadoop.apache.org/msg06280.html
> > >> > > >
> > >> > > > On Wed, Oct 21, 2009 at 10:44 AM, Stas Oskin <stas.oskin@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > To clarify, it's either let a single SecNameNode write to
> > >> > > > > multiple NFS exports, or actually have multiple SecNameNodes.
> > >> > > > >
> > >> > > > > Thanks again.
> > >> > > > >
> > >> > > > > On Wed, Oct 21, 2009 at 4:43 PM, Stas Oskin <stas.oskin@gmail.com>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Hi.
> > >> > > > > >
> > >> > > > > > I want to keep checkpoint data on several separate machines
> > >> > > > > > for backup, and am deliberating between exporting these
> > >> > > > > > machines' disks via NFS, or actually running Secondary
> > >> > > > > > Name Nodes there.
> > >> > > > > >
> > >> > > > > > Can anyone advise what would be better in my case?
> > >> > > > > >
> > >> > > > > > Regards.
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> > >> > http://www.amazon.com/dp/1430219424?tag=jewlerymall
> > >> > www.prohadoopbook.com a community for Hadoop Professionals
> > >> >
> > >>
> > >
> > >
> >
> >
>



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
