hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: Adding hard-disks to an existing HDFS cluster
Date Mon, 01 Mar 2010 19:00:19 GMT
On Mon, Mar 1, 2010 at 8:19 AM, Marc Farnum Rendino <mvgfr1@gmail.com> wrote:
> On Sun, Feb 28, 2010 at 5:27 PM, Eli Collins <eli@cloudera.com> wrote:
>
>> dfs.name.dir (where the NN
>> stores its metadata) should have multiple directories on different
>> disks to guard against the failure of any single disk. Many people
>> also use RAIDed disks and include an NFS mount in dfs.name.dir to have
>> additional, reliable copies of this data.
>
>
> Cool; so just to make sure I understand:
>
> The OS presents a single directory to the NameNode; that single directory
> can have other measures under it, like RAID.
>
> Right?

Yes, it's good to have multiple directories as well as to make each, or at
least some, of the directories reliable. E.g., below,
/data/<N>/dfs/namenode are local disks and /mnt/filer-hdfs is a
reliable NFS filer.

<property>
  <name>dfs.name.dir</name>
  <value>/data/1/dfs/namenode,/data/2/dfs/namenode,/mnt/filer-hdfs/dfs/namenode</value>
</property>
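
As a rough sanity check on a setting like the one above, something along these lines could flag directories that live on the same filesystem. This is only a sketch: the helper names (parse_name_dirs, dirs_sharing_a_device) are made up for illustration and are not part of Hadoop. It also assumes that comparing st_dev from os.stat is a good enough proxy for "different disks", which won't hold if two partitions or volumes sit on the same physical disk.

```python
import os


def parse_name_dirs(value):
    """Split a comma-separated dfs.name.dir value into a list of paths."""
    return [p.strip() for p in value.split(",") if p.strip()]


def dirs_sharing_a_device(dirs, device_of=lambda p: os.stat(p).st_dev):
    """Return groups of directories that report the same device ID.

    device_of is injectable so the check can be exercised without
    touching the real filesystem; by default it uses os.stat().st_dev.
    """
    by_device = {}
    for d in dirs:
        by_device.setdefault(device_of(d), []).append(d)
    # Only groups with more than one directory are a potential problem.
    return [group for group in by_device.values() if len(group) > 1]
```

Run on the NN host, dirs_sharing_a_device(parse_name_dirs(value)) with the value string from the config returns an empty list when every directory reports its own device, and otherwise lists the directories that would be lost together if that one filesystem fails.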

You also of course want to run a secondary namenode (2NN), though
that's more to keep the edits log size (and restart time) manageable
rather than for reliability purposes. Having a 2NN on a separate host
does act as a backup of your NN metadata, though it will always be out
of date. There's work on trunk to add a backup name node which gets a
stream of edits from the NN so it has an up-to-date copy of the metadata.

Thanks,
Eli


>
> Thanks,
>
> Marc
>
