hbase-dev mailing list archives

From Mathias Herberts <mathias.herbe...@gmail.com>
Subject RE: Shared HDFS for HBase and MapReduce
Date Wed, 06 Jun 2012 07:19:02 GMT
We run M/R jobs that query HBase in a pool with a limited number of mapper
slots. It works like a charm for serving both RT and batch queries on HBase.
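
For readers who want to try a capped pool like this, here is a minimal sketch. It assumes the Hadoop 1.x Fair Scheduler, which the message above does not name, so treat the scheduler choice as a guess. The pool name `hbase_batch` and the slot counts are hypothetical placeholders, not values from this thread.

```xml
<?xml version="1.0"?>
<!-- Fair Scheduler allocations file (Hadoop 1.x format).
     Assumption: the JobTracker is configured to use the Fair Scheduler. -->
<allocations>
  <!-- Hypothetical pool for M/R jobs that scan HBase. -->
  <pool name="hbase_batch">
    <!-- Cap concurrent map tasks so scans cannot saturate HDFS and the
         region servers. The number is illustrative only. -->
    <maxMaps>20</maxMaps>
    <!-- Cap concurrent reduce tasks as well. Also illustrative. -->
    <maxReduces>10</maxReduces>
  </pool>
</allocations>
```

A job would then be submitted into that pool, for example by setting the `mapred.fairscheduler.pool` job property to `hbase_batch`. The right cap depends on the cluster and on the latency the real-time queries need, so it has to be found by measurement.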
On Jun 6, 2012 6:23 AM, "Vladimir Rodionov" <vrodionov@carrieriq.com> wrote:

> You can share HBase and MR if you run MR jobs only to process data off
> HBase and do not use HBase for real-time queries.
> It is generally not advisable to run MR jobs on a live (real-time) HBase
> cluster, since HDFS can easily get saturated by MR jobs, and you will see
> much worse HBase query latency and lower overall query throughput.
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
> ________________________________________
> From: saint.ack@gmail.com [saint.ack@gmail.com] On Behalf Of Stack [stack@duboce.net]
> Sent: Tuesday, June 05, 2012 9:07 PM
> To: dev@hbase.apache.org
> Cc: hbase-dev@hadoop.apache.org
> Subject: Re: Shared HDFS for HBase and MapReduce
> On Tue, Jun 5, 2012 at 8:29 PM, Atif Khan <atif_ijaz_khan@hotmail.com> wrote:
> > My first thoughts were to create a single HDFS cluster, and then point the
> > MapReduce and HBase servers to use the common HDFS installation.  However,
> > Cloudera's Dos and Don'ts page
> > (http://www.cloudera.com/blog/2011/04/hbase-dos-and-donts/) insists that
> > MapReduce and HBase should not share an HDFS cluster.  Rather they should
> > have their own individual clusters.  I don't understand this recommendation,
> > as it would result in moving data around from one HDFS cluster to another
> > when running MapReduce over HBase.
> >
> The post starts out "Be careful when running mixed workloads on an HBase
> cluster."  Does your use case fit the case described: "...SLAs on
> hbase access" and at the same time running heavy mapreduce jobs on the
> same cluster?  If so, you may want to do the suggested two clusters.
> I'd suggest you start w/ all on the one cluster and see how you do.
> That post is more than a year old.  HBase has gotten steadily better since.
> St.Ack
