hbase-user mailing list archives

From Vladimir Rodionov <vrodio...@carrieriq.com>
Subject RE: Slow Get Performance (or how many disk I/O does it take for one non-cached read?)
Date Sat, 01 Feb 2014 03:30:19 GMT
>> #3 I'm not sure I understand this suggestion - are you suggesting custom
>> region splitting? Each region is fully compacted, so there is only one
>> HFile. The queries I run are: "get me the most recent versions, up to
>> 200". However, I need to store more versions, because I may ask "get me
>> the most recent versions, up to 200, that I would have seen yesterday".

I am afraid your only option is SSD.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Jan Schellenberger [leipzig3@gmail.com]
Sent: Friday, January 31, 2014 6:38 PM
To: user@hbase.apache.org
Subject: RE: Slow Get Performance (or how many disk I/O does it take for one non-cached read?)

Thank you.  I will have to test these things one at a time.

I re-enabled compression (SNAPPY for now) and changed the block encoding to
FAST_DIFF.
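
For reference, the compression and block-encoding change can be applied from the HBase shell roughly like this (table and family names here are placeholders, and on older releases the table may need to be disabled before altering):

```
hbase> disable 'mytable'
hbase> alter 'mytable', {NAME => 'cf', COMPRESSION => 'SNAPPY', DATA_BLOCK_ENCODING => 'FAST_DIFF'}
hbase> enable 'mytable'
hbase> major_compact 'mytable'   # rewrite existing HFiles with the new settings
```

Note the major compaction at the end: existing HFiles keep their old format until they are rewritten.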

#1 I will try GZ encoding.
#2 The block cache size is already at 0.4. I will try to increase it a bit
more, but I will never get the whole set into memory.
I will also disable the bloom filter.
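
For anyone following along, the block cache fraction is the `hfile.block.cache.size` setting in hbase-site.xml (0.4 here means 40% of the RegionServer heap):

```
<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
</property>
```

and the bloom filter can be dropped per column family from the shell with `alter 'mytable', {NAME => 'cf', BLOOMFILTER => 'NONE'}` (table/family names are placeholders).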

#4 I will investigate this. I thought I read somewhere that Cloudera 4.3
has this shortcut enabled by default, but I will try to verify.

#3 I'm not sure I understand this suggestion - are you suggesting custom
region splitting? Each region is fully compacted, so there is only one
HFile. The queries I run are: "get me the most recent versions, up to
200". However, I need to store more versions, because I may ask "get me
the most recent versions, up to 200, that I would have seen yesterday".
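
As a sketch of that access pattern from the shell (table, row, family, and the timestamp below are made up), the "as of yesterday" query maps to a get with a TIMERANGE upper bound, where the bound is an epoch-millis timestamp and is exclusive:

```
hbase> # up to 200 newest versions as of now
hbase> get 'mytable', 'rowkey1', {COLUMN => 'cf:q', VERSIONS => 200}

hbase> # up to 200 newest versions as they stood yesterday
hbase> get 'mytable', 'rowkey1', {COLUMN => 'cf:q', VERSIONS => 200, TIMERANGE => [0, 1391212800000]}
```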


#5 HDFS short-circuit reads are already enabled by default.
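
For anyone verifying #5, short-circuit reads are controlled by these hdfs-site.xml properties (CDH4-era names; the socket path below is an example and must be readable by both the DataNode and the HBase RegionServer):

```
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hdfs-sockets/dn</value>
</property>
```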
#6 Yes, SSD would clearly be better.

#7 The average result of a get is fairly small - no more than 1 kB, I'd say.
We hit each key with roughly equal probability.



I'm concerned about the block cache... It sounds like the wrong blocks
are being cached. I thought there was a preference for caching index and
bloom blocks.

I'm currently running 60 queries/second on one node, and it's reporting
blockCacheHitRatio=29% and blockCacheHitCachingRatio=65% (I'm not sure what
the difference is).

I also see rootIndexSize=122 kB, totalStaticIndexSize=88 MB, and
totalStaticBloomSize=80 MB (I will disable bloom filters in the next run).
HDFS locality is 100%.






