hbase-issues mailing list archives

From "Eshcar Hillel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy
Date Thu, 08 Feb 2018 08:41:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356649#comment-16356649 ]

Eshcar Hillel commented on HBASE-18294:

{quote} the general agreement we made is for on heap cases, we must continue to check for
128 MB limit against the memstore heap size. Not just data size. Also we have agreed that
for off heap also, we will consider the off heap size + heap overhead.{quote}
From the beginning I aimed to make the on-heap and off-heap cases behave as symmetrically as
possible, so I don't believe I agreed to having two different computations. One way to make it
symmetric is to compare the two counters against two thresholds. Another way to unify it is
to always consider the sum of off-heap and on-heap sizes at the region level. We still need
to manage two separate counters since the global bounds are different.
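
The two alternatives can be sketched as region-level predicates. This is only a minimal illustration, not HBase code: the class RegionFlushCheck, its method names, and its parameters are hypothetical, and 128 MB is used because it is the default flush size mentioned in the issue description.

```java
// Hypothetical sketch; none of these names are taken from the HBase code base.
final class RegionFlushCheck {
    static final long MB = 1024L * 1024L;

    private RegionFlushCheck() {}

    // Alternative 1: compare each counter against its own threshold.
    static boolean exceedsEitherThreshold(long onHeapBytes, long offHeapBytes,
                                          long onHeapFlushSize, long offHeapFlushSize) {
        return onHeapBytes > onHeapFlushSize || offHeapBytes > offHeapFlushSize;
    }

    // Alternative 2: compare the sum of both counters against a single threshold.
    static boolean exceedsSummedThreshold(long onHeapBytes, long offHeapBytes,
                                          long flushSize) {
        return onHeapBytes + offHeapBytes > flushSize;
    }
}
```

Under these assumptions, a region holding 40 MB on-heap and 100 MB off-heap does not trigger a flush under alternative 1 with two 128 MB thresholds, but does under alternative 2 with a single 128 MB threshold, since 140 MB > 128 MB.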
bq.  Ideally checking the data size alone here would have been the best way. I mean for any
decision per region level.
You keep saying that, but it seems to be based on intuition rather than on experiments.
In contrast, considering both data and heap overhead for region-level flush has been shown
to improve performance significantly.
bq.When the size breach is because of off heap size, we have to select regions having maximum
data size and when breach because of on heap size limit, select the regions with more heap
Again, why? You say we should have different decision making, but you don't explain why and
don't have numbers to support your claims.
I argue that unless it is shown that there is a great performance benefit in making different
rules, on-heap and off-heap should follow the same set of rules, embedding them with their respective

So I will make a new patch that leaves only one flush size configuration property (removing the
off-heap flush size); the flush size check at the region level will always consider the
on-heap + off-heap size. The rest will be similar to the current patch.
Patch will be ready in a few days.

> Reduce global heap pressure: flush based on heap occupancy
> ----------------------------------------------------------
>                 Key: HBASE-18294
>                 URL: https://issues.apache.org/jira/browse/HBASE-18294
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Eshcar Hillel
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>         Attachments: HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.01.patch,
HBASE-18294.01.patch, HBASE-18294.02.patch, HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch,
HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, HBASE-18294.08.patch, HBASE-18294.09.patch,
HBASE-18294.10.patch, HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, HBASE-18294.13.patch,
HBASE-18294.15.patch, HBASE-18294.16.patch, HBASE-18294.master.01.patch
> A region is flushed if its memory component exceeds a threshold (default size is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the store to
> another threshold (that can be configured with hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size (key-value only)
> to the threshold, whereas it should compare the heap size (which includes index size and metadata).
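
The gap between the two measures can be illustrated with simple arithmetic. This is a rough sketch under stated assumptions: the class MemstoreSizeExample is hypothetical, and the per-cell overhead is an assumed constant chosen for illustration, not a measured HBase value.

```java
// Hypothetical sketch; the per-cell overhead is an assumption, not an HBase constant.
final class MemstoreSizeExample {
    private MemstoreSizeExample() {}

    // Data size: key-value bytes only.
    static long dataSize(long cellCount, long keyValueBytesPerCell) {
        return cellCount * keyValueBytesPerCell;
    }

    // Heap size: key-value bytes plus an assumed per-cell index and metadata overhead.
    static long heapSize(long cellCount, long keyValueBytesPerCell,
                         long overheadBytesPerCell) {
        return cellCount * (keyValueBytesPerCell + overheadBytesPerCell);
    }
}
```

For example, 1,000,000 cells of 100 bytes each give a data size of 100,000,000 bytes, which is below the 128 MB threshold (134,217,728 bytes). With an assumed 50 bytes of overhead per cell, the heap size is 150,000,000 bytes, which is above it, so a check on data size alone would not flush a region that already occupies more heap than the threshold.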

This message was sent by Atlassian JIRA
