hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Region Server Count
Date Thu, 04 Mar 2010 04:59:56 GMT
@Steve Severance

It seems your email bounced from almost everyone's account. I was
able to retrieve it, but you should contact infra@apache.org to see
whether anything is wrong.

So your email was:

From: "Severance, Steve" <sseverance@ebay.com>
To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
Date: Wed, 17 Feb 2010 14:46:28 -0700
Subject: Region Server Count
I am in the process of constructing a hadoop cluster with
approximately 400 nodes. We are going to be running HBase for storing
structured data and specifically making heavy use of row versioning.
HBase data will primarily be consumed by map reduce jobs. Should I
install a region server on every data node?


The answer is yes: you want a region server on every node. Otherwise
the nodes hosting region servers will fill up with data while the
other nodes stay far less used. The reason is that HDFS always writes
the first replica of a block to the local datanode (if one exists).
This gives better locality, but in the special case where writers
aren't present on all slave nodes you get a very unbalanced cluster.
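
To get a feel for the size of the effect, here is a rough simulation
sketch. It is not HDFS code and makes simplifying assumptions: the
first replica always lands on the writer's node, the remaining
replicas go to random other nodes, and rack awareness is ignored. The
function name, the node counts and the block counts are all made up
for illustration.

```python
import random


def simulate_placement(num_nodes, writer_nodes, num_blocks, replication=3, seed=0):
    """Return per-node replica counts under a simplified placement model.

    Assumption: the first replica goes to the writer's own node and the
    remaining replicas go to distinct random other nodes (no racks).
    """
    rng = random.Random(seed)
    counts = [0] * num_nodes
    for _ in range(num_blocks):
        writer = rng.choice(writer_nodes)
        counts[writer] += 1  # first replica stays on the local datanode
        others = [n for n in range(num_nodes) if n != writer]
        for node in rng.sample(others, replication - 1):
            counts[node] += 1  # remaining replicas are spread at random
    return counts


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    nodes = 40
    writers = list(range(10))  # hypothetical: region servers on 10 of 40 nodes
    partial = simulate_placement(nodes, writers, num_blocks=10_000)
    full = simulate_placement(nodes, list(range(nodes)), num_blocks=10_000)

    print("writers on 10 of 40 nodes:")
    print("  mean replicas on writer nodes:", mean(partial[:10]))
    print("  mean replicas on other nodes: ", mean(partial[10:]))
    print("writers on all 40 nodes:")
    print("  min / max replicas per node:  ", min(full), "/", max(full))
```

With writers on only a quarter of the nodes, the writer nodes in this
model end up holding several times as many replicas as the rest. With
a writer on every node the counts come out roughly even.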

