hbase-user mailing list archives

From Esteban Gutierrez <este...@cloudera.com>
Subject Re: HBase on HDFS: proper way to setup
Date Thu, 20 Aug 2015 15:45:55 GMT
Hello E,

You got it right :) HBase works much more efficiently when RegionServers
are co-located with the DataNodes, e.g. at a 1:1 ratio, and that's what
HBase ops do in most deployments. However, I've seen deployments where ops
choose to deploy fewer RegionServers than DataNodes or vice versa. There
are more caveats to having fewer RSs than DNs, especially due to
re-balancing of HDFS blocks or when an RS goes down, and that deployment
mode usually causes more problems. Deploying multiple RSs on top of a
single DN node is possible, but whether the effort to get it "right" is
worth it depends on your workload.
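
If you want a quick sanity check of the 1:1 layout, something like the
sketch below compares two host lists: the one HDFS uses for DataNodes and
the one HBase uses for RegionServers (conf/regionservers, one hostname per
line). The function names, the sample hostnames and the '#' comment
handling are my own assumptions for illustration, and the DataNode list
file differs between Hadoop versions, so treat this as a starting point
rather than a finished tool.

```python
def parse_hosts(text):
    """Parse a host list: one hostname per line, blank lines ignored.

    Stripping '#' comments is a convenience assumed here, not something
    the HBase or Hadoop scripts are claimed to do.
    """
    hosts = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.add(line.lower())
    return hosts


def colocation_report(datanodes_text, regionservers_text):
    """Compare DataNode hosts with RegionServer hosts.

    Returns sorted lists of hosts that run both roles, only a DataNode,
    or only a RegionServer.
    """
    dns = parse_hosts(datanodes_text)
    rss = parse_hosts(regionservers_text)
    return {
        "colocated": sorted(dns & rss),
        "datanode_only": sorted(dns - rss),
        "regionserver_only": sorted(rss - dns),
    }


if __name__ == "__main__":
    # Hypothetical host lists; in practice read them from the two files.
    datanodes = "node1\nnode2\nnode3\n"
    regionservers = "node1\nnode2\n"
    report = colocation_report(datanodes, regionservers)
    for role, hosts in report.items():
        print(f"{role}: {', '.join(hosts) or '-'}")
```

Any host that shows up under datanode_only or regionserver_only is a place
where the cluster departs from the 1:1 layout described above.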


Cloudera, Inc.

On Thu, Aug 20, 2015 at 8:32 AM, MrE <eleroy@msn.com> wrote:

> Hello,
> I'm new to HBase, so pardon the stupid question.
> HBase is meant to run on HDFS, I presume, although that is not the
> default in the 'single host' setup.
> My question is: assuming I have a HDFS cluster setup for storage (just
> What is the rule of thumb for deployment of HBase instances: should I have
> an HBase instance on each HDFS node?
> I assume the HBase instances should be close to the data to avoid network
> latencies, but do I need an HBase instance on each datanode?
> Is it at all useful to have more HBase nodes than HDFS nodes?
> All the basic tutorials explain setting up HBase on local fs, and then
> explain that to set it up as a cluster you 'just point to HDFS' for
> storage, but I haven't found a clear explanation of how all these nodes
> should be arranged together to be efficient.
> Thanks for the help.
> E
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HBase-on-HDFS-proper-way-to-setup-tp4074047.html
> Sent from the HBase User mailing list archive at Nabble.com.
