hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: how to optimize for heavy writes scenario
Date Fri, 17 Mar 2017 19:30:24 GMT
On Fri, Mar 17, 2017 at 9:31 AM, Hef <hef.online@gmail.com> wrote:

> Hi group,
> I'm using HBase to store a large amount of time series data; the use case
> is heavier on writes than reads. My application tops out at 600k write
> requests per second and I can't tune it for better TPS.
>
> Hardware:
> I have 6 Region Servers, each with 128G memory, 12 HDDs, and 2 cores with
> 24 threads.
>
> Schema:
> The schema for this time series data is similar to OpenTSDB's: the data
> points of the same metric within an hour are stored in one row, and there
> can be at most 3600 columns per row.
> Each cell is about 70 bytes in size, including the rowkey, column
> qualifier, column family and value.
>
> HBase config:
> CDH 5.6 HBase 1.0.0
>

Can you upgrade? There's a big diff between 1.2 and 1.0.


> 100G memory for each RegionServer
> hbase.hstore.compactionThreshold = 50
> hbase.hstore.blockingStoreFiles = 100
> hbase.hregion.majorcompaction disable
> hbase.client.write.buffer = 20MB
> hbase.regionserver.handler.count = 100
>

Could try halving the handler count.


> hbase.hregion.memstore.flush.size = 128MB
>
>
Why are you flushing? If it is because you are hitting this flush limit,
can you try upping it?
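
To make the flush question concrete, here is a rough Python sketch of how long
one region's memstore takes to reach hbase.hregion.memstore.flush.size. The
600K cells/sec, 70 bytes per cell and 6 servers come from the post; the count
of 10 actively written regions per server is a made-up placeholder, and the
estimate ignores per-cell heap overhead in the memstore, so real intervals
would be shorter.

```python
def seconds_between_flushes(flush_size_mb: float,
                            server_mb_per_sec: float,
                            active_regions: int) -> float:
    """Rough time for one region's memstore to reach the flush size,
    assuming writes spread evenly over the actively written regions."""
    per_region_mb_per_sec = server_mb_per_sec / active_regions
    return flush_size_mb / per_region_mb_per_sec


if __name__ == "__main__":
    # 600K cells/s * 70 bytes, split over 6 region servers (figures from the post)
    server_rate = 600_000 * 70 / 1_000_000 / 6
    # active_regions=10 is a hypothetical value, not taken from the thread
    for flush_mb in (128, 256, 512):
        secs = seconds_between_flushes(flush_mb, server_rate, active_regions=10)
        print(f"flush.size={flush_mb}MB -> about {secs:.0f}s between flushes")
```

If the observed flushes come much more often than such an estimate suggests,
something other than the per-region flush size is probably triggering them.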



> HBase Client:
> write in BufferedMutator with 100000/batch
>
> Input Volume:
> The input data throughput is more than 2 million/sec from Kafka
>
>
How is the distribution? Evenly over the keyspace?
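
One way to check the distribution offline is to build the row keys the writers
would produce and count how many fall into each region. The Python sketch below
assumes a hypothetical OpenTSDB-like layout (3-byte metric id followed by a
4-byte hour-aligned timestamp, with the second-of-hour as column qualifier);
the actual key format and region split points are not given in the thread.

```python
import bisect
import struct
from collections import Counter


def row_key(metric_id: int, ts: int) -> bytes:
    """Hypothetical key: 3-byte metric id + 4-byte hour-aligned timestamp."""
    base_hour = ts - (ts % 3600)
    return metric_id.to_bytes(3, "big") + struct.pack(">I", base_hour)


def column_qualifier(ts: int) -> bytes:
    """Second within the hour (0..3599), so at most 3600 columns per row."""
    return struct.pack(">H", ts % 3600)


def region_histogram(keys, split_points):
    """Count keys per region; split_points are the sorted region start keys
    (excluding the empty start key of the first region)."""
    counts = Counter()
    for key in keys:
        counts[bisect.bisect_right(split_points, key)] += 1
    return counts


def skew(counts, n_regions: int) -> float:
    """Busiest region's load relative to an even share (1.0 = perfectly even)."""
    total = sum(counts.values())
    return max(counts.values()) * n_regions / total


if __name__ == "__main__":
    splits = [m.to_bytes(3, "big") for m in range(1, 6)]  # 6 regions
    even = [row_key(m, 1489778000) for m in range(6)]
    hot = [row_key(0, 1489778000 + i) for i in range(6)]
    print(skew(region_histogram(even, splits), 6))
    print(skew(region_histogram(hot, splits), 6))
```

A skew well above 1.0 would mean a few regions, and so a few servers, take most
of the writes no matter how many writers are added.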


> My writer applications are distributed; however far I scale them up, the
> total write throughput won't go above 600K/sec.
>


Tell us more about this scaling up? How many writers?



> The servers have 20% CPU usage and 5.6 wa,
>

5.6 is high enough. Is the i/o spread over the disks?



> GC doesn't look good though; it shows a lot of 10s+ pauses.
>
>
What settings do you have?



> In my opinion, 1M/s of input data will result in only 70 MByte/s of write
> throughput to the cluster, which is quite a small amount compared to what
> 6 region servers can handle. The performance should not be this bad.
>
> Does anybody have an idea why the performance stops at 600K/s?
> Is there anything I have to tune to increase the HBase write throughput?
>
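
The arithmetic in the question can be checked with a few lines of Python. The
70-byte cell size and the 6 region servers are taken from the post. The
physical estimate is an assumption layered on top: it supposes each cell is
written once to the WAL and once to a flushed HFile, each stored with HDFS
replication 3, and it ignores compression and compaction rewrites.

```python
CELL_BYTES = 70        # per-cell size quoted in the post
REGION_SERVERS = 6     # cluster size quoted in the post


def logical_mb_per_sec(cells_per_sec: int, cell_bytes: int = CELL_BYTES) -> float:
    """Raw payload rate arriving at the cluster, in MB/s."""
    return cells_per_sec * cell_bytes / 1_000_000


def physical_mb_per_sec(cells_per_sec: int, replication: int = 3,
                        copies: int = 2) -> float:
    """Rough disk write rate. Assumes `copies` on-disk writes per cell
    (WAL + flushed HFile), each replicated `replication` times."""
    return logical_mb_per_sec(cells_per_sec) * replication * copies


if __name__ == "__main__":
    for rate in (600_000, 1_000_000, 2_000_000):
        logical = logical_mb_per_sec(rate)
        physical = physical_mb_per_sec(rate)
        print(f"{rate:>9} cells/s: {logical:6.1f} MB/s logical, "
              f"about {physical / REGION_SERVERS:6.1f} MB/s written per server")
```

Even under these assumptions the per-server disk rate at 600K/s stays modest
for 12 HDDs, which fits the suspicion that raw disk bandwidth is not the
bottleneck.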


If you double the number of clients writing, do you see an increase in throughput?

If you thread dump the servers, can you tell where they are held up? Or
whether they are doing any work at all, relatively speaking?

St.Ack
