hbase-user mailing list archives

From Weihua JIANG <weihua.ji...@gmail.com>
Subject Re: How to speedup Hbase query throughput
Date Tue, 26 Apr 2011 05:36:52 GMT
I use two machines (each running 30 threads) as clients. The servers
and clients are all connected via Gigabit Ethernet.

Thanks
Weihua
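
As a rough sanity check on whether 60 client threads could be the limit, Little's law bounds the throughput of a closed-loop client by concurrency divided by per-request latency. The sketch below is only an estimate under stated assumptions: each thread issues one blocking read at a time, the fsReadLatency_avg_time value quoted below is in milliseconds, a cache miss costs about one filesystem read, and cache hits cost nothing. The function name is illustrative and not part of any HBase API.

```python
def max_closed_loop_throughput(threads: int, latency_ms: float) -> float:
    """Upper bound on requests/second when every thread waits for a reply
    before sending its next request (Little's law: N = X * R)."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return threads / (latency_ms / 1000.0)


# Figures taken from this thread; how they combine is an assumption.
threads = 2 * 30                 # two client machines, 30 threads each
miss_ratio = 0.80                # block cache hit ratio is about 20%
fs_read_latency_ms = 46.0        # fsReadLatency_avg_time, assumed to be ms

# Assumed average read latency: misses pay one filesystem read, hits are free.
avg_latency_ms = miss_ratio * fs_read_latency_ms
bound = max_closed_loop_throughput(threads, avg_latency_ms)
print(f"upper bound: {bound:.0f} reads/second")
```

This counts single reads, not month reports; a report that touches several rows would lower the bound proportionally.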

2011/4/26 Chris Tarnas <cft@tarnas.org>:
> For your query tests, are they all from a single thread? Have you tried reading from
> multiple threads/processes in parallel - that sounds more like your use case.
>
> -chris
>
>
>
> On Apr 25, 2011, at 10:04 PM, Weihua JIANG <weihua.jiang@gmail.com> wrote:
>
>> The queries are all random reads. The scenario is that a user wants to
>> query his own monthly bill report, e.g. to query what happened on his
>> bill in March, or Feb, etc. Since every user may want to do so, we
>> can't predict who will be the next to ask for such monthly bill
>> report.
>>
>> 2011/4/26 Stack <stack@duboce.net>:
>>>> Currently, to store bill records, we can achieve about 30K records/second.
>>>>
>>>
>>> Can you use bulk load?  See http://hbase.apache.org/bulk-loads.html
>>>
>>>> However, the query performance is quite poor. We can only achieve
>>>> about 600~700 month_report/second. That is, each region server can
>>>> only serve about 100 rows/second. Block cache hit ratio is
>>>> about 20%.
>>>>
>>>
>>> This is random accesses?  Why random accesses and not scans?
>>>
>>>
>>>> Do you have any advice on how to improve the query performance?
>>>>
>>>
>>> See above cited performance section from website book.
>>>
>>>
>>>> Below is some metrics info reported by region server:
>>>> 2011-04-26T10:56:12 hbase.regionserver:
>>>> RegionServer=regionserver50820, blockCacheCount=40969,
>>>> blockCacheEvictedCount=216359, blockCacheFree=671152504,
>>>> blockCacheHitCachingRatio=20, blockCacheHitCount=67936,
>>>> blockCacheHitRatio=20, blockCacheMissCount=257675,
>>>> blockCacheSize=2743351688, compactionQueueSize=0,
>>>> compactionSize_avg_time=0, compactionSize_num_ops=7,
>>>> compactionTime_avg_time=0, compactionTime_num_ops=7, flushQueueSize=0,
>>>> flushSize_avg_time=0, flushSize_num_ops=0, flushTime_avg_time=0,
>>>> flushTime_num_ops=0, fsReadLatency_avg_time=46,
>>>> fsReadLatency_num_ops=257905, fsSyncLatency_avg_time=0,
>>>> fsSyncLatency_num_ops=1726, fsWriteLatency_avg_time=0,
>>>> fsWriteLatency_num_ops=0, memstoreSizeMB=0, regions=169,
>>>> requests=82.1, storefileIndexSizeMB=188, storefiles=343, stores=169
>>>> 2011-04-26T10:56:22 hbase.regionserver:
>>>> RegionServer=regionserver50820, blockCacheCount=42500,
>>>> blockCacheEvictedCount=216359, blockCacheFree=569659040,
>>>> blockCacheHitCachingRatio=20, blockCacheHitCount=68418,
>>>> blockCacheHitRatio=20, blockCacheMissCount=259206,
>>>> blockCacheSize=2844845152, compactionQueueSize=0,
>>>> compactionSize_avg_time=0, compactionSize_num_ops=7,
>>>> compactionTime_avg_time=0, compactionTime_num_ops=7, flushQueueSize=0,
>>>> flushSize_avg_time=0, flushSize_num_ops=0, flushTime_avg_time=0,
>>>> flushTime_num_ops=0, fsReadLatency_avg_time=44,
>>>> fsReadLatency_num_ops=259547, fsSyncLatency_avg_time=0,
>>>> fsSyncLatency_num_ops=1736, fsWriteLatency_avg_time=0,
>>>> fsWriteLatency_num_ops=0, memstoreSizeMB=0, regions=169,
>>>> requests=92.2, storefileIndexSizeMB=188, storefiles=343, stores=169
>>>> 2011-04-26T10:56:32 hbase.regionserver:
>>>> RegionServer=regionserver50820, blockCacheCount=39238,
>>>> blockCacheEvictedCount=221509, blockCacheFree=785944072,
>>>> blockCacheHitCachingRatio=20, blockCacheHitCount=69043,
>>>> blockCacheHitRatio=20, blockCacheMissCount=261095,
>>>> blockCacheSize=2628560120, compactionQueueSize=0,
>>>> compactionSize_avg_time=0, compactionSize_num_ops=7,
>>>> compactionTime_avg_time=0, compactionTime_num_ops=7, flushQueueSize=0,
>>>> flushSize_avg_time=0, flushSize_num_ops=0, flushTime_avg_time=0,
>>>> flushTime_num_ops=0, fsReadLatency_avg_time=39,
>>>> fsReadLatency_num_ops=261070, fsSyncLatency_avg_time=0,
>>>> fsSyncLatency_num_ops=1746, fsWriteLatency_avg_time=0,
>>>> fsWriteLatency_num_ops=0, memstoreSizeMB=0, regions=169,
>>>> requests=128.77777, storefileIndexSizeMB=188, storefiles=343,
>>>> stores=169
>>>>
>>>
>>> This is hard to read but I don't see anything obnoxious.
>>>
>>>
>>>>
>>>> We also tried disabling the block cache, and the performance seems to be
>>>> even a little better. And if we use a configuration of 6 DN servers
>>>> + 3 RS servers, we get better throughput, about 1000
>>>> month_report/second.  I am confused. Can anyone explain the reason?
>>>>
>>>
>>> Sounds like you are doing all random reads?  Do you have to?
>>>
>>> St.Ack
>>>
>
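
The region server metrics quoted above are easier to compare once parsed. The following is a minimal sketch that reads the `name=value` pairs from two consecutive samples and derives the cumulative block cache hit ratio, the hit ratio over the interval between them, and the filesystem read rate. It assumes the counters are cumulative and that the samples are 10 seconds apart, as their timestamps suggest; the helper names are made up for illustration.

```python
import re


def parse_metrics(line: str) -> dict:
    """Parse 'name=value, name=value' pairs into a dict of floats.
    Non-numeric values (e.g. the RegionServer name) are skipped."""
    out = {}
    for key, value in re.findall(r"(\w+)=([^,\s]+)", line):
        try:
            out[key] = float(value)
        except ValueError:
            pass
    return out


def hit_ratio(hits: float, misses: float) -> float:
    """Fraction of lookups served from cache; 0.0 when there were none."""
    total = hits + misses
    return hits / total if total else 0.0


# Excerpts from the 10:56:12 and 10:56:22 samples in this thread.
sample_1 = ("RegionServer=regionserver50820, blockCacheHitCount=67936, "
            "blockCacheMissCount=257675, fsReadLatency_num_ops=257905")
sample_2 = ("RegionServer=regionserver50820, blockCacheHitCount=68418, "
            "blockCacheMissCount=259206, fsReadLatency_num_ops=259547")
interval_seconds = 10.0  # assumed from the two timestamps

m1, m2 = parse_metrics(sample_1), parse_metrics(sample_2)

cumulative = hit_ratio(m1["blockCacheHitCount"], m1["blockCacheMissCount"])
interval = hit_ratio(
    m2["blockCacheHitCount"] - m1["blockCacheHitCount"],
    m2["blockCacheMissCount"] - m1["blockCacheMissCount"],
)
fs_reads_per_sec = (
    m2["fsReadLatency_num_ops"] - m1["fsReadLatency_num_ops"]
) / interval_seconds

print(f"cumulative hit ratio: {cumulative:.3f}")
print(f"interval hit ratio:   {interval:.3f}")
print(f"fs reads per second:  {fs_reads_per_sec:.1f}")
```

Looking at deltas between samples rather than the cumulative counters shows what the server is doing now, which can differ from the long-run average.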
