hadoop-hdfs-user mailing list archives

From Nitin Pawar <nitinpawar...@gmail.com>
Subject Re: Federated Namespaces - VM
Date Tue, 14 Jan 2014 18:57:15 GMT
This is my understanding, and I could be wrong:  :)

You do not really need a separate hardware instance for each namespace
unless that namespace is as busy as a single-namespace HDFS cluster would be.

You can set up multiple namenodes on a single machine, each with its own
config, namenode directories, and log directories.
But if that machine goes down, all of your namespaces go down with it,
which is not a good situation in a client-facing cluster.
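
To make the "different config and different directories" idea concrete, here is a rough sketch that lays out non-clashing settings for several namenodes sharing one host. The helper name `plan_namenodes`, the host name, the base ports, and all paths are made up for illustration. The keys mirror the usual HDFS property names and the `HADOOP_LOG_DIR` environment variable, but treat the exact layout as an assumption to check against your Hadoop version.

```python
# Hypothetical helper, not part of Hadoop: plans per-namespace settings
# for several namenodes co-hosted on one machine.
# Host name, base ports, and directory roots are illustrative assumptions.
def plan_namenodes(namespaces, host="nn-host.example.com",
                   rpc_base=8020, http_base=50070,
                   data_root="/data/hdfs", log_root="/var/log/hadoop"):
    plan = {}
    for i, ns in enumerate(namespaces):
        plan[ns] = {
            # Each namenode needs its own RPC and HTTP port on the shared host.
            "dfs.namenode.rpc-address." + ns: f"{host}:{rpc_base + i}",
            "dfs.namenode.http-address." + ns: f"{host}:{http_base + i}",
            # Separate metadata directory per namespace.
            "dfs.namenode.name.dir": f"{data_root}/{ns}/name",
            # Separate log directory per namenode process.
            "HADOOP_LOG_DIR": f"{log_root}/{ns}",
        }
    return plan


if __name__ == "__main__":
    for ns, settings in plan_namenodes(
            ["retail", "commercial", "government"]).items():
        print(ns)
        for key, value in settings.items():
            print(f"  {key} = {value}")
```

Simply counting ports upward from a base is the crudest possible scheme. With many namespaces you would want to check the resulting ports against everything else running on that host.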

In my experience (a couple of years back), a Hadoop cluster running on
virtual machines does not perform as well as one on real machines. This may
have changed in the last two years, as virtualization has been developed
extensively as well.

So in the end it is more a matter of day-to-day monitoring of how your
clusters are being utilized, and then deciding which namenodes can be
co-hosted and which need to be given a full hardware instance.
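
As a starting point for the co-hosting decision, a back-of-the-envelope heap estimate might look like the sketch below. It assumes the commonly quoted rule of thumb of roughly 150 bytes of namenode heap per namespace object (file, directory, or block). That figure, the function names, the 50% headroom, and the sample numbers are all assumptions, not measurements. Memory is also only one dimension, so RPC load still has to come from monitoring.

```python
# Rough rule-of-thumb assumption: ~150 bytes of namenode heap per
# namespace object (file, directory, or block). Not a measured value.
BYTES_PER_OBJECT = 150


def namenode_heap_gb(files, blocks_per_file=1.5, dirs=0):
    """Estimate namenode heap (GiB) needed to hold one namespace."""
    objects = files + dirs + files * blocks_per_file
    return objects * BYTES_PER_OBJECT / 1024 ** 3


def can_cohost(namespaces, host_ram_gb, headroom=0.5):
    """Check whether all namespaces fit in a fraction of the host's RAM.

    `headroom` is the share of RAM we allow namenode heaps to use;
    0.5 is an arbitrary, conservative assumption.
    """
    needed = sum(namenode_heap_gb(**ns) for ns in namespaces.values())
    return needed <= host_ram_gb * headroom, needed


if __name__ == "__main__":
    # Sample numbers are invented for illustration.
    tenants = {
        "retail": {"files": 10_000_000, "blocks_per_file": 1.0},
        "commercial": {"files": 2_000_000},
        "government": {"files": 500_000, "dirs": 50_000},
    }
    fits, needed = can_cohost(tenants, host_ram_gb=16)
    print(f"estimated heap: {needed:.2f} GiB, co-host on 16 GiB box: {fits}")
```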

On Wed, Jan 15, 2014 at 12:14 AM, Devin Suiter RDX <dsuiter@rdx.com> wrote:

> Hi,
> I just want to throw out a discussion topic on federation.
> Reading *The Definitive Guide* on HDFS, it sounds like when federating,
> every distinct namespace needs a distinct namenode machine instance.
> This means if a company wanted three namespaces, say retail, commercial,
> government, they would have to have a host machine (or machine pair for
> high-availability) for each one, so 3 (pair) namenode hosts?
> What if a company was hosting client data? Say they had 20 clients
> accessing a cluster. 20 namespaces minimum, would mean 20 servers just for
> namenodes?
> At what point in this situation would it become practical to begin
> virtualizing namenodes on a high-powered virtualization cluster? I think
> some calculation would be needed, weighing the expected size of the
> namespace partition vs. block density vs. memory... there would also
> be the obvious question of resource contention and the overall system
> drag it causes...
> What do other community members think?
> *Devin Suiter*
> Jr. Data Solutions Software Engineer
> 100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
> Google Voice: 412-256-8556 | www.rdx.com

Nitin Pawar
