hbase-user mailing list archives

From Eric Czech <e...@nextbigsound.com>
Subject Re: Managing MapReduce jobs with concurrent client reads
Date Fri, 07 Sep 2012 15:28:25 GMT
Neither right now -- I'm just assuming it would be a problem, since I
would definitely have to support both in a hypothetical HBase+Hadoop
installation that isn't actually built yet.

Did you ever try corralling those jobs by just reducing the number of
available map/reduce tasks, or did you find that that isn't a reliable
throttling mechanism?
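
(For concreteness, the knob I have in mind is the per-node MRv1 slot
limit -- the property names below are from Hadoop 1.x and the values
are just illustrative, nothing I've actually tested:)

# On each TaskTracker, cap the per-node slots in mapred-site.xml, e.g.:
#   mapred.tasktracker.map.tasks.maximum    = 2
#   mapred.tasktracker.reduce.tasks.maximum = 1
# ...then restart the daemon so the new limits take effect:
hadoop-daemon.sh stop tasktracker
hadoop-daemon.sh start tasktracker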

Also, is replication to that batch cluster done via HBase replication
or some other approach?
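
For context, the replication I'm imagining is just stock HBase
cross-cluster replication into the batch cluster, roughly along these
lines (the quorum, table, and family names are made up, and this
assumes hbase.replication is already set to true on both clusters):

# Run on the source (OLTP) cluster; peer '1' points at the batch cluster.
hbase shell <<'EOF'
add_peer '1', 'batch-zk1,batch-zk2,batch-zk3:2181:/hbase'
disable 'mytable'
alter 'mytable', {NAME => 'cf', REPLICATION_SCOPE => 1}
enable 'mytable'
EOF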

On Thu, Sep 6, 2012 at 4:08 PM, Stack <stack@duboce.net> wrote:
> On Wed, Sep 5, 2012 at 6:25 AM, Eric Czech <eric@nextbigsound.com> wrote:
> > Hi everyone,
> >
> > Does anyone have any recommendations on how to maintain low latency for
> > small, individual reads from HBase while MapReduce jobs are being run?  Is
> > replication a good way to handle this (i.e. run small, low-latency queries
> > against a replicated copy of the data and run the MapReduce jobs on the
> > master copy)?
> Is MapReduce blowing your caches, or is higher i/o driving up latency
> when you have a cache miss?  Or is it using all the CPU?
> Depending on how it impinges, you could try corralling mapreduce
> (cgroups/jail), or go to an extreme and keep a low-latency OLTP cluster
> running only well-known, well-behaved mapreduce jobs, replicating into
> a batch cluster where mapreduce is allowed free rein.  (This is what we
> do where I work.  We also cgroup the mapreduce cluster, even on our
> batch cluster, so a random big MR job doesn't make the pagers go off
> during sleepy time.)
> St.Ack
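
P.S. Re: the cgroup corral -- just so I'm picturing it right, is it
something roughly like the following?  (Untested sketch on my end; the
group name, weights, and MRv1 daemon are all just illustrative.)

# Create a low-priority group for MapReduce and start the TaskTracker in
# it so its child task JVMs inherit the limits.
cgcreate -g cpu,blkio:/mapreduce
echo 256 > /sys/fs/cgroup/cpu/mapreduce/cpu.shares      # well below the 1024 default
echo 200 > /sys/fs/cgroup/blkio/mapreduce/blkio.weight  # low I/O weight (range is 100-1000)
cgexec -g cpu,blkio:/mapreduce hadoop-daemon.sh start tasktracker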
