hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: [DISCUSS] Should flush decisions be made based on data size (key-value only) or based on heap size (including metadata overhead)?
Date Wed, 05 Jul 2017 16:56:40 GMT
On Wed, Jul 5, 2017 at 6:30 AM, Eshcar Hillel <eshcar@yahoo-inc.com.invalid> wrote:

> Hi All,
> I opened a new Jira https://issues.apache.org/jira/browse/HBASE-18294 to
> discuss this question.
> Flush decisions are taken at the region level and also at the region
> server level - there is the question of when to trigger a flush and then
> which region/store to flush. Regions track both their data size (key-value
> size only) and their total heap occupancy (including index and additional
> metadata). One option (which was the past policy) is to trigger flushes and
> choose flush subjects based on regions' heap size - this gives the sysadmin
> a better estimate of how many regions an RS can carry. Another option
> (which is the current policy) is to look at the data size - this gives a
> better estimate of the size of the files that are created by the flush.

Sounds like we should be doing the former, heap occupancy. An
OutOfMemoryException puts a nail in any benefit other accountings might
have.


> I see this as critical to HBase performance and usability, namely
> meeting the user expectation from the system, hence I would like to hear as
> many voices as possible. Please join the discussion in the Jira and let us
> know what you think.
> Thanks, Eshcar
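
To make the two accounting options in this thread concrete, here is a minimal sketch of how the choice of metric can change both when a flush triggers and which region gets picked. This is not HBase code: the Region class, the should_flush and pick_flush_candidate helpers, the threshold, and all the sizes are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical stand-in for a region's memstore accounting."""
    name: str
    data_size: int  # key-value bytes only (illustrative units)
    heap_size: int  # key-value bytes plus index and metadata overhead


def size_of(region: Region, by_heap: bool) -> int:
    """Return the size used for flush decisions under the chosen policy."""
    return region.heap_size if by_heap else region.data_size


def should_flush(region: Region, threshold: int, by_heap: bool) -> bool:
    """Trigger a flush once the tracked size reaches the threshold."""
    return size_of(region, by_heap) >= threshold


def pick_flush_candidate(regions: list[Region], by_heap: bool) -> Region:
    """Choose the region that is largest under the chosen policy."""
    return max(regions, key=lambda r: size_of(r, by_heap))


# Made-up numbers: many small cells carry more per-cell overhead,
# so heap size is far above data size for the first region.
regions = [
    Region("many-small-cells", data_size=60, heap_size=130),
    Region("few-large-cells", data_size=90, heap_size=100),
]

heap_pick = pick_flush_candidate(regions, by_heap=True)
data_pick = pick_flush_candidate(regions, by_heap=False)
print(heap_pick.name, data_pick.name)
```

With these made-up numbers the two policies pick different regions, and a threshold of 128 would trigger a flush of the first region only under heap accounting. That gap between data size and heap size is the memory a data-size policy does not see.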
