hbase-user mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Running HBase on Yarn … HoYa ?
Date Wed, 18 Sep 2013 20:21:28 GMT
Right now you are going to have to run HBase outside YARN, and on those
nodes with HBase, configure YARN to offer less capacity -CPU and RAM- leaving
out whatever your (static) HBase demands will be.
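
As a rough sketch of that arithmetic: subtract the static HBase reservation
(plus some headroom for the OS and datanode) from the node totals, and give
YARN what is left. The function name and all the numbers below are made up
for illustration; the two property names are the standard NodeManager
resource settings in yarn-site.xml, but check them against your Hadoop
version.

```python
def yarn_capacity(node_ram_mb, node_vcores,
                  hbase_ram_mb, hbase_vcores,
                  os_ram_mb=2048, os_vcores=1):
    """Return the capacity YARN should advertise on a node that also runs HBase.

    All inputs are assumptions supplied by the operator; nothing here is
    measured or taken from HBase itself.
    """
    ram = node_ram_mb - hbase_ram_mb - os_ram_mb
    vcores = node_vcores - hbase_vcores - os_vcores
    if ram <= 0 or vcores <= 0:
        raise ValueError("HBase plus OS headroom leaves nothing for YARN")
    # Keys are the yarn-site.xml properties the NodeManager reads.
    return {
        "yarn.nodemanager.resource.memory-mb": ram,
        "yarn.nodemanager.resource.cpu-vcores": vcores,
    }


# Hypothetical node: 64 GB RAM, 16 cores, region server given 16 GB and 4 cores.
capacity = yarn_capacity(65536, 16, hbase_ram_mb=16384, hbase_vcores=4)
for name, value in capacity.items():
    print(name, "=", value)
```

The point is only that the reservation is static: YARN never learns what
HBase is really using, so you have to size it for HBase's peak.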

The Hoya stuff is still immature -which currently offers the advantage that
I can make big changes to bits of the code and its persistent JSON
specifications without worrying about breaking things. It's also been
tracking the latest versions of Hadoop 2.1, which itself is in the final
stages of being ready for production.

What it is working towards is
 -being able to specify what you want in terms of an HBase cluster
(version, node resources to request off YARN), then have Hoya bring up the
cluster and keep that cluster up
 -having YARN explicitly support long-lived services (see
https://issues.apache.org/jira/browse/YARN-896 ; other applications like
Samza share these needs)

There's another problem, one that is common to HBase alongside YARN and to
HBase-on-YARN: IO contention. We can use YARN's cgroup scheduling to
restrict the CPU and RAM load that a YARN container can use -and so stop MR
jobs from causing HBase to swap out or be overloaded CPU-wise. What that
doesn't address is disk IO bandwidth, because that goes through the HDFS
datanodes -and those are shared across all local processes as well as remote
ones.
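
For the cgroup side, a minimal sketch of the yarn-site.xml settings involved,
rendered from Python. The two property names and class names are the ones I
recall from the Hadoop 2.x NodeManager cgroups documentation, so treat them
as assumptions and verify them against your release; the helper function is
hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed property names for enabling cgroup enforcement in the NodeManager.
CGROUP_PROPS = {
    "yarn.nodemanager.container-executor.class":
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
    "yarn.nodemanager.linux-container-executor.resources-handler.class":
        "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler",
}


def render_yarn_site(props):
    """Render a dict of properties as a Hadoop-style <configuration> fragment."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


fragment = render_yarn_site(CGROUP_PROPS)
print(fragment)
```

Nothing in this fragment limits disk IO: traffic through the datanodes stays
outside whatever limits the containers are given.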

Apart from that, some slides are up

and the code is online

I'd welcome more participation in this,


> From: Michael Segel <michael_segel@hotmail.com>
> Date: Mon, Sep 16, 2013 at 8:07 AM
> Subject: Running HBase on Yarn … HoYa ?
> To: "user@hbase.apache.org" <user@hbase.apache.org>
> Hi,
> I'm going to post this to the YARN Google Groups as well since this
> problem intersects across both efforts…
> So, running HBase on Yarn…
> While it's possible to bring up a Yarn cluster and manually start HBase
> outside of Yarn, it seems that this would end up causing some massive
> issues…
> It seems that Yarn needs to know about all of the resources on the Hadoop
> cluster so that it can allocate resources. Running anything outside of Yarn
> on the cluster may cause Yarn to oversubscribe resources.
> Is anyone seriously playing with Yarn and HBase?
> HOYA seems not fully baked and I am making a lot of assumptions about Yarn
> and HBase on Yarn.
> Thoughts?
> -Mike

