hbase-user mailing list archives

From "Kevin O'Dell" <ke...@rocana.com>
Subject Re: how to optimize for heavy writes scenario
Date Fri, 17 Mar 2017 21:14:12 GMT
Hey Hef,

  What is the memstore size setting (how much heap is it allowed) on that
cluster?  What is your region count per node?  Are you writing evenly across
all those regions, or are only a few regions active per region server at a
time?  Can you paste the GC settings that you are currently using?
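
The memstore and region-count questions above can be tied together with a
rough calculation. This is only a sketch: the 100G heap and the 128MB flush
size come from the quoted message below, while the 40% global memstore share
is an assumption (it is what hbase.regionserver.global.memstore.size is
expected to default to on HBase 1.x; check hbase-site.xml for the real value).
The helper name memstore_budget is made up for this illustration.

```python
# Back-of-envelope memstore budget for one RegionServer.
GIB = 1024 ** 3
MIB = 1024 ** 2


def memstore_budget(heap_bytes, global_percent, flush_size_bytes):
    """Return (global memstore bytes, number of regions whose memstores
    could all reach the flush size before the global limit is hit)."""
    global_bytes = heap_bytes * global_percent // 100
    return global_bytes, global_bytes // flush_size_bytes


heap = 100 * GIB        # from the thread: 100G memory per RegionServer
flush_size = 128 * MIB  # from the thread: hbase.hregion.memstore.flush.size
global_percent = 40     # ASSUMPTION: global memstore share of the heap

global_bytes, full_regions = memstore_budget(heap, global_percent, flush_size)
print(f"global memstore budget: {global_bytes / GIB:.0f} GiB")
print(f"regions that can fill to the flush size at once: {full_regions}")
```

With these inputs the budget works out to 40 GiB, or 320 regions at 128MB
each. If far more regions than that are taking writes on one server, flushes
would be forced by global memstore pressure rather than by the per-region
flush size, which is why the region count and write distribution matter.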

On Fri, Mar 17, 2017 at 3:30 PM, Stack <stack@duboce.net> wrote:

> On Fri, Mar 17, 2017 at 9:31 AM, Hef <hef.online@gmail.com> wrote:
>
> > Hi group,
> > I'm using HBase to store a large amount of time series data; the use case
> > is heavier on writes than reads. My application tops out at 600k write
> > requests per second and I can't tune it for better TPS.
> >
> > Hardware:
> > I have 6 Region Servers, each with 128G memory, 12 HDDs, and 2 cores with
> > 24 threads.
> >
> > Schema:
> > The schema for this time series data is similar to OpenTSDB's: the data
> > points of the same metric within an hour are stored in one row, and there
> > can be a maximum of 3600 columns per row.
> > Each cell is about 70 bytes in size, including the rowkey, column
> > qualifier, column family and value.
> >
> > HBase config:
> > CDH 5.6 HBase 1.0.0
> >
>
> Can you upgrade? There's a big diff between 1.2 and 1.0.
>
>
> > 100G memory for each RegionServer
> > hbase.hstore.compactionThreshold = 50
> > hbase.hstore.blockingStoreFiles = 100
> > hbase.hregion.majorcompaction disable
> > hbase.client.write.buffer = 20MB
> > hbase.regionserver.handler.count = 100
> >
>
> Could try halving the handler count.
>
>
> > hbase.hregion.memstore.flush.size = 128MB
> >
> >
> Why are you flushing? If it is because you are hitting this flush limit,
> can you try upping it?
>
>
>
> > HBase Client:
> > writes go through BufferedMutator with 100000 per batch
> >
> > Inputs Volumes:
> > The input data throughput is more than 2 million/sec from Kafka
> >
> >
> How is the distribution? Evenly over the keyspace?
>
>
> > My writer applications are distributed; however I scale them up, the
> > total write throughput won't get larger than 600K/sec.
> >
>
>
> Tell us more about this scaling up? How many writers?
>
>
>
> > The servers have 20% CPU usage and 5.6 wa,
> >
>
> 5.6 is high enough. Is the i/o spread over the disks?
>
>
>
> > GC doesn't look good though; it shows a lot of 10s+ pauses.
> >
> >
> What settings do you have?
>
>
>
> > In my opinion, 1M/s of input data will result in only 70 MByte/s of write
> > throughput to the cluster, which is quite a small amount for 6 region
> > servers. The performance should not be this bad.
> >
> > Does anybody have an idea why the performance stops at 600K/s?
> > Is there anything I have to tune to increase the HBase write throughput?
> >
>
>
> If you double the clients writing, do you see an increase in throughput?
>
> If you thread dump the servers, can you tell where they are held up? Or if
> they are doing any work at all relative?
>
> St.Ack
>
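
For reference, the throughput arithmetic in the quoted message can be
reproduced with a short sketch. It assumes that each "request" is a single
cell of about 70 bytes (the size given above) and that the load is spread
evenly over the 6 region servers; it ignores WAL and HDFS replication
overhead. The function name write_rate_mb_per_s is made up for this
illustration.

```python
# Rough payload rate implied by the numbers quoted in the thread.
def write_rate_mb_per_s(cells_per_s, cell_bytes, servers):
    """Return (cluster-wide MB/s, per-server MB/s), using 1 MB = 10**6 bytes."""
    total = cells_per_s * cell_bytes / 1_000_000
    return total, total / servers


CELL_BYTES = 70  # from the thread: approximate size of one cell
SERVERS = 6      # from the thread: number of region servers

# 600k/s is the observed ceiling, 1M/s the example used in the message,
# and 2M/s the rate arriving from Kafka.
for rate in (600_000, 1_000_000, 2_000_000):
    total, per_server = write_rate_mb_per_s(rate, CELL_BYTES, SERVERS)
    print(f"{rate:>9} cells/s: {total:6.1f} MB/s total, "
          f"{per_server:5.1f} MB/s per server")
```

Under these assumptions the observed 600K/s ceiling is about 42 MB/s of
payload for the whole cluster, or roughly 7 MB/s per server, which supports
the point that raw write volume alone is unlikely to be the limit.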



-- 
Kevin O'Dell
Field Engineer
850-496-1298 | Kevin@rocana.com
@kevinrodell
<http://www.rocana.com>
