hadoop-common-user mailing list archives

From "Alex Loddengaard" <a...@cloudera.com>
Subject Re: HDFS NameNode and HA: best strategy?
Date Fri, 14 Nov 2008 18:46:16 GMT
The image and edits files are copied to the secondary namenode only periodically,
so if you provision a new namenode from the secondary namenode, the new
namenode may be missing state that the original namenode had.  You should
restore from the namenode's NFS mount, not from the secondary namenode's
image and edits files.
As for a script to do this, I'm not aware of one.  However, it should be as
simple as an scp or rsync of the image and edits files, followed by a call to
start-all.sh, etc.
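To make that concrete, here is a dry-run sketch of the failover steps above; it
prints the commands rather than running them, and every path, hostname, and
install location in it is an assumption, not something from this thread:

```shell
# Dry-run sketch of a manual namenode failover (Hadoop 0.18 era).
# All paths and hostnames below are assumptions for illustration.
NFS_DIR=/mnt/namenode-nfs/dfs/name   # assumed NFS copy of dfs.name.dir
NEW_NN=new-namenode.example.com      # assumed replacement namenode host
HADOOP_HOME=/usr/local/hadoop        # assumed Hadoop install location

# 1. Copy the fsimage and edits log from the NFS mount to the new machine.
echo rsync -a "$NFS_DIR/" "$NEW_NN:$HADOOP_HOME/dfs/name/"

# 2. After pointing fs.default.name at the new host in hadoop-site.xml,
#    start the HDFS daemons on the new namenode.
echo ssh "$NEW_NN" "$HADOOP_HOME/bin/start-dfs.sh"
```

Drop the leading echos (and update client configs) to perform the actual
failover.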

Alex

On Fri, Nov 14, 2008 at 10:20 AM, Bill Au <bill.w.au@gmail.com> wrote:

> There is a "secondary" NameNode which performs periodic checkpoints:
>
> http://wiki.apache.org/hadoop/FAQ?highlight=(secondary)#7
>
> Are there any instructions out there on how to copy the FS image and edits
> log from the secondary NameNode to a new machine when the original NameNode
> fails?
>
> Bill
>
> On Fri, Nov 14, 2008 at 12:50 PM, Alex Loddengaard <alex@cloudera.com
> >wrote:
>
> > HDFS does have a single point of failure, and there is no way around this
> > in its current implementation.  The namenode keeps track of an FS image
> > and an edits log.  It's common for these to be stored both on the local
> > disk and on an NFS mount.  When the namenode fails, a new machine can be
> > provisioned as the namenode by loading the backed-up image and edits
> > files.
> >
> > Can you say more about how you'll use HDFS?  It's not a low-latency file
> > system, so it shouldn't be used to serve images, videos, etc. in a web
> > environment.  Its most common use is as the basis for batch Map/Reduce
> > jobs.
> >
> > Alex
> >
> > On Thu, Nov 13, 2008 at 5:18 PM, S. L. <slunati@gmail.com> wrote:
> >
> > > Hi list,
> > >
> > > I am kind of new to Hadoop but have some good background.  I am
> > > seriously considering adopting Hadoop, and especially HDFS, first to be
> > > able to store various files (in the low hundreds of thousands at first)
> > > on a few nodes in a manner where I don't need a RAID system or a SAN.
> > > HDFS seems a perfect fit for the job...
> > >
> > > BUT
> > >
> > > from what I've learned in the past couple of days, it seems that the
> > > single point of failure in HDFS is the NameNode.  So I was wondering,
> > > for anyone on the list who has deployed HDFS in a production
> > > environment, what is your strategy for High Availability?  Having the
> > > NameNode unavailable basically brings the whole HDFS system offline.
> > > So what scripts or other techniques are recommended to add HA to HDFS?
> > >
> > > Thanks!
> > >
> > > -- S.
> > >
> >
>
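For reference, the local-disk-plus-NFS layout described above is configured by
listing multiple directories in dfs.name.dir in hadoop-site.xml; the namenode
writes the image and edits to every listed directory.  The paths below are
assumptions for illustration:

```xml
<property>
  <name>dfs.name.dir</name>
  <value>/var/hadoop/dfs/name,/mnt/namenode-nfs/dfs/name</value>
</property>
```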
