hbase-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4329) Use NextGen Hadoop to deploy HBase
Date Mon, 05 Sep 2011 02:41:09 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096986#comment-13096986 ]

Arun C Murthy commented on HBASE-4329:

bq. Why you say that? (I don't disagree but a list of why's would help figure what the fit
criteria for closing this issue are).

Stack, first up, I didn't mean to start to flame - I'm sure you know that. :)

FWIW, talking to folks around, isolation and support for prioritization to ensure a single
user/application cannot *hog* an HBase cluster (or parts thereof) is something I've heard as
a concern. This dovetails very well with our experience running both HDFS and MapReduce at scale,
as a shared resource. Again, this isn't to claim it's a solved problem in Hadoop core, just
something we've focussed on, for a while now.

Hence, my thinking was we could use YARN as an intermediate solution. I discussed this idea
with Andrew at the Summit and he didn't give me the impression that I was off my rocker; maybe
he was just being polite and has a great poker face!

Thanks for pointing me to HBASE-4120, that seems related - I wasn't aware. It's a lot to digest,
I'll try to spend some time on it. If the HBase community decides to focus on the multi-tenancy/isolation
problem (via HBASE-4120 etc.) - great! We can close this discussion. If not, I'd like to brainstorm
with you guys for an intermediate solution. 

It really depends where you guys want to focus your energies.

bq. Meantime, where I work, mapreduce is the problem (smile). We're messing with cgroup containing
mapreduce so it doesn't steal resources from hdfs (and hbase).

I'm sure - MR needs more work, I'm painfully aware of this! :)

We plan to go the cgroups route sometime right after we ship 0.23, we could share notes and

bq. You want us to get into the nextgen mr container because then there is one place to go
to do accounting?

The idea is that *iff* the HBase community wants to use this as an intermediate solution,
using the RM will ensure the resource usage of HBase is accounted for w.r.t. the applications/queues/organizations

> Use NextGen Hadoop to deploy HBase
> ----------------------------------
>                 Key: HBASE-4329
>                 URL: https://issues.apache.org/jira/browse/HBASE-4329
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Arun C Murthy
> Currently (circa 2011), with due respect, it's not practical to run shared, multi-tenant
HBase clusters on the largest Hadoop installs (of 4000+ nodes).
> As an interim, I'd like to brainstorm using NextGen Hadoop (MAPREDUCE-279) to deploy
HBase for focussed sets of applications/users/organizations. Thus, one could deploy a smaller
instance of HBase (100s of nodes) in a large Hadoop cluster and use it for a set of applications.
> The other advantage is that the resource usage of HBase (master, region-server etc.)
is accounted for in the overall utilization of the cluster and, conceivably, aid in resource
tracking, capacity planning etc.
> ----
> Thoughts?

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira