hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: how to explain read/write performance change after modifying the hfile.block.cache.size?
Date Thu, 20 Nov 2014 20:37:13 GMT
The indices are always cached. 

Cheers

On Nov 20, 2014, at 12:33 PM, "Kevin O'dell" <kevin.odell@cloudera.com> wrote:

> I am also under the impression that HBase reads should basically not work
> with the block cache set to 0, since we store the indexes in the block
> cache, right?
> 
> On Thu, Nov 20, 2014 at 3:31 PM, lars hofhansl <larsh@apache.org> wrote:
> 
>> That would explain it if memstores are flushed due to global memory
>> pressure.
>> 
>> But cache and memstore size are (unfortunately) configured independently.
>> The memstore heap portion would be 40% (by default) in either case. So this
>> is a bit curious still.
>> Ming, can you tell us more details?
>> - RAM on the boxes
>> - heap setup for the region servers
>> - any other relevant settings in hbase-site.xml
>> - configs on the table/column family you're writing to (bloom filters, etc.)
>> 
>> That would help us diagnose this.
>> 
>> -- Lars
>> 
>> From: Ted Yu <yuzhihong@gmail.com>
>> To: "user@hbase.apache.org" <user@hbase.apache.org>
>> Sent: Thursday, November 20, 2014 9:32 AM
>> Subject: Re: how to explain read/write performance change after modifying
>> the hfile.block.cache.size?
>> 
>> When block cache size increases from 0 to 0.4, the amount of heap given to
>> memstore decreases. This would slow down the writes.
>> Please see:
>> http://hbase.apache.org/book.html#store.memstore
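>> 
>> As a minimal sketch (mine, not from the book; property names and defaults
>> are the 0.98-era values, shown for illustration), the two heap fractions
>> come from separate settings:
>> 
>>     import org.apache.hadoop.conf.Configuration;
>>     import org.apache.hadoop.hbase.HBaseConfiguration;
>> 
>>     public class HeapFractions {
>>       public static void main(String[] args) {
>>         Configuration conf = HBaseConfiguration.create();
>>         // Fraction of heap used for the StoreFile block cache (reads).
>>         float blockCache = conf.getFloat("hfile.block.cache.size", 0.4f);
>>         // Upper bound on heap used by all memstores combined (writes).
>>         float memstore = conf.getFloat(
>>             "hbase.regionserver.global.memstore.upperLimit", 0.4f);
>>         System.out.printf("block cache: %.0f%%, memstore cap: %.0f%%%n",
>>             blockCache * 100, memstore * 100);
>>       }
>>     }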
>> 
>> For your second question, see this thread:
>> 
>> http://search-hadoop.com/m/DHED4TEvBy1/lars+hbase+hflush&subj=Re+Clarifications+on+HBase+Durability
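>> 
>> To summarize that thread with a sketch (assuming the 0.98 client API; the
>> row, family, and qualifier here are made up):
>> 
>>     import org.apache.hadoop.hbase.client.Durability;
>>     import org.apache.hadoop.hbase.client.Put;
>>     import org.apache.hadoop.hbase.util.Bytes;
>> 
>>     public class WalDurability {
>>       public static void main(String[] args) {
>>         Put p = new Put(Bytes.toBytes("row1"));
>>         p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
>>         // SYNC_WAL (the default) syncs the edit to the HDFS pipeline via
>>         // hflush() before the write is acknowledged: the data reaches the
>>         // datanodes' memory but is not necessarily on disk. FSYNC_WAL
>>         // exists but was not fully supported on HDFS at the time.
>>         p.setDurability(Durability.SYNC_WAL);
>>       }
>>     }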
>> 
>> Cheers
>> 
>> On Thu, Nov 20, 2014 at 8:05 AM, Liu, Ming (HPIT-GADSC) <ming.liu2@hp.com>
>> wrote:
>> 
>>> Hello, all,
>>> 
>>> I am playing with YCSB to test HBase performance. I am using HBase
>>> 0.98.5. I tried adjusting hfile.block.cache.size to see the difference:
>>> when I set hfile.block.cache.size to 0, read performance is very bad,
>>> but write performance is very good; when I set it to 0.4, reads are
>>> better, but write performance drops dramatically. I have already
>>> disabled the client-side write buffer, as in the sketch below.
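>>> 
>>> (Concretely, by "disabled the client-side write buffer" I mean autoflush
>>> on, as in this sketch with the 0.98 client; "usertable" is YCSB's
>>> default table name:)
>>> 
>>>     import org.apache.hadoop.conf.Configuration;
>>>     import org.apache.hadoop.hbase.HBaseConfiguration;
>>>     import org.apache.hadoop.hbase.client.HTable;
>>> 
>>>     public class NoWriteBuffer {
>>>       public static void main(String[] args) throws Exception {
>>>         Configuration conf = HBaseConfiguration.create();
>>>         HTable table = new HTable(conf, "usertable");
>>>         // autoflush on = no client-side buffering; each put() is an RPC
>>>         table.setAutoFlush(true);
>>>         table.close();
>>>       }
>>>     }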
>>> This is hard for me to understand:
>>> The HBase guide just says that hfile.block.cache.size controls how much
>>> memory is used as the block cache for StoreFiles. I have no idea how
>>> HBase works internally. It is easy to understand that increasing the
>>> cache size should help reads, but why would it harm writes? For your
>>> reference, write performance dropped from 30,000 to 4,000 just by
>>> changing hfile.block.cache.size from 0 to 0.4.
>>> Could anyone give me a brief explanation of this observation, or some
>>> advice on what to study to understand what the block cache is used for?
>>> 
>>> Another question: an HBase write goes first to the WAL and then to the
>>> memstore. Does the write to the WAL go to disk (a sync operation)
>>> before HBase writes to the memstore, or is it possible that the WAL
>>> write is still buffered somewhere when HBase puts the data into the
>>> memstore?
>>> 
>>> Reading the source code could take me months, so a kind reply would
>>> help me a lot.
>>> Thanks very much!
>>> 
>>> Best Regards,
>>> Ming
> 
> 
> 
> -- 
> Kevin O'Dell
> Systems Engineer, Cloudera
