hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: HBase - Performance issue
Date Sat, 06 Sep 2014 20:54:44 GMT
What type of drives, controllers, and network bandwidth do you have?

Just curious.


On Sep 6, 2014, at 7:37 PM, kiran <kiran.sarvabhotla@gmail.com> wrote:

> Also the hbase version is 0.94.1
> 
> 
> On Sun, Sep 7, 2014 at 12:00 AM, kiran <kiran.sarvabhotla@gmail.com> wrote:
> 
>> Lars,
>> 
>> We are facing a similar situation on a similar cluster configuration.
>> We are seeing high I/O wait percentages on some machines in our cluster.
>> We have short-circuit reads enabled but still face the same problem; CPU
>> wait goes up to 50% in some cases while issuing scan commands with
>> multiple threads. Is there a workaround other than applying the patch
>> that is in 0.94.4?
>> 
>> Thanks
>> Kiran
>> 
>> 
>> On Thu, Apr 25, 2013 at 12:12 AM, lars hofhansl <larsh@apache.org> wrote:
>> 
>>> You may have run into https://issues.apache.org/jira/browse/HBASE-7336
>>> (which is in 0.94.4)
>>> (Although I had not observed this effect as much when short circuit reads
>>> are enabled)
>>> 
>>> 
>>> 
>>> ----- Original Message -----
>>> From: kzurek <kzurek@proximetry.pl>
>>> To: user@hbase.apache.org
>>> Cc:
>>> Sent: Wednesday, April 24, 2013 3:12 AM
>>> Subject: HBase - Performance issue
>>> 
>>> The problem is that when I'm putting my data (multithreaded client,
>>> ~30MB/s outgoing traffic) into the cluster, the load is spread equally
>>> over all RegionServers with 3.5% average CPU wait time (average CPU
>>> user: 51%). When I add a similar multithreaded client that Scans for,
>>> let's say, the last 100 samples of a randomly generated key from a chosen
>>> time range, I get high CPU wait time (20% and up) on two (or more if
>>> there is a higher number of threads; default 10) random RegionServers.
>>> Therefore, the machines that host those RSs get very hot; one of the
>>> consequences is that the number of store files constantly increases, up
>>> to the maximum limit. The rest of the RSs have 10-12% CPU wait time and
>>> everything seems to be OK (the number of store files varies, so they are
>>> being compacted and not increasing over time). Any ideas? Maybe I could
>>> prioritize writes over reads somehow? Is it possible? If so, what would
>>> be the best way to do that, and where should it be placed (on the client
>>> or the cluster side)?
>>> 
>>> Cluster specification:
>>> HBase Version    0.94.2-cdh4.2.0
>>> Hadoop Version    2.0.0-cdh4.2.0
>>> There are 6 x DataNodes (5 x HDD for storing data), 1 x MasterNode
>>> Other settings:
>>> - Bloom filters (ROWCOL) set
>>> - Short circuit turned on
>>> - HDFS Block Size: 128MB
>>> - Java Heap Size of Namenode/Secondary Namenode in Bytes: 8 GiB
>>> - Java Heap Size of HBase RegionServer in Bytes: 12 GiB
>>> - Java Heap Size of HBase Master in Bytes: 4 GiB
>>> - Java Heap Size of DataNode in Bytes: 1 GiB (default)
>>> Number of regions per RegionServer: 19 (total 114 regions on 6 RS)
>>> Key design: <UUID><TIMESTAMP> -> UUID: 1-10M, TIMESTAMP: 1-N
>>> Table design: 1 column family with 20 columns of 8 bytes
>>> 
>>> Get client:
>>> Multiple threads
>>> Each thread has its own table instance with its own Scanner.
>>> Each thread has its own range of UUIDs and randomly draws the beginning
>>> of the time range to build the rowkey properly (see above).
>>> Each Scan requests the same number of rows, but with a random rowkey.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://apache-hbase.679495.n3.nabble.com/HBase-Performance-issue-tp4042836.html
>>> Sent from the HBase User mailing list archive at Nabble.com.
>>> 
>>> 
>> 
>> 
>> --
>> Thank you
>> Kiran Sarvabhotla
>> 
>> -----Even a correct decision is wrong when it is taken late
>> 
>> 
> 
> 
> -- 
> Thank you
> Kiran Sarvabhotla
> 
> -----Even a correct decision is wrong when it is taken late

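For reference, the read pattern in the original post (key design
<UUID><TIMESTAMP>, each Scan starting at a randomly drawn timestamp for one
UUID) can be sketched as below. This is only an illustration, not the
poster's client code: the thread does not say how the key is serialized, so
the 8-byte big-endian encoding of both fields and the helper names
make_rowkey and scan_bounds are assumptions. Fixed-width big-endian integers
are chosen because their byte order matches their numeric order, which is
what a range Scan over sorted row keys relies on.

```python
import struct

# ASSUMPTION: 8-byte big-endian UUID followed by an 8-byte big-endian
# timestamp. The thread only gives the logical layout <UUID><TIMESTAMP>.
KEY_FORMAT = ">QQ"


def make_rowkey(uuid: int, timestamp: int) -> bytes:
    """Hypothetical helper: serialize one <UUID><TIMESTAMP> row key."""
    return struct.pack(KEY_FORMAT, uuid, timestamp)


def scan_bounds(uuid: int, first_ts: int, last_ts: int):
    """Hypothetical helper: (start_row, stop_row) covering the timestamps
    first_ts..last_ts for one UUID. The stop row is exclusive, as in an
    HBase Scan, so it is built from last_ts + 1."""
    return make_rowkey(uuid, first_ts), make_rowkey(uuid, last_ts + 1)


# Bounds for 100 consecutive timestamps of UUID 42.
start_row, stop_row = scan_bounds(42, 1000, 1099)
```

With this layout all rows of one UUID are adjacent in sort order, so a single
Scan like this is usually served by one region, or by two if the range
crosses a region boundary.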
