hbase-issues mailing list archives

From "Siddharth Wagle (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
Date Wed, 17 Feb 2016 18:40:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150949#comment-15150949 ]

Siddharth Wagle commented on HBASE-15249:

{quote} What does the math look like for region splits {quote}
Ref: AMBARI-13039. We use _memstore.lowerLimit_ and _memstore.flush.size_ to calculate the
memory available to the memstore and, from that, the maximum number of regions. We then calculate
lexically equidistant split points for the large tables, based on the services deployed by Ambari
(using a static list of metrics that we mined from a deployed cluster).
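
As a rough illustration of the calculation described above, here is a minimal Python sketch. It is not the AMBARI-13039 code: the function names, the sample metric names, and the exact formula (memstore budget = heap size x lower limit, divided by the flush size) are assumptions made for illustration, and "lexically equidistant" is read here as evenly spaced positions in the sorted list of metric names.

```python
def max_in_memory_regions(heap_bytes, memstore_lower_limit, memstore_flush_size):
    """Assumed bound: how many memstores of flush size fit in the memstore budget."""
    memstore_budget = heap_bytes * memstore_lower_limit
    return max(1, int(memstore_budget // memstore_flush_size))


def equidistant_split_points(sorted_keys, num_regions):
    """Pick num_regions - 1 split keys at evenly spaced positions in a sorted key list."""
    if num_regions < 2 or len(sorted_keys) < 2:
        return []
    # Never ask for more regions than there are distinct keys to split on.
    num_regions = min(num_regions, len(sorted_keys))
    step = len(sorted_keys) / num_regions
    return [sorted_keys[int(i * step)] for i in range(1, num_regions)]


# Hypothetical metric names standing in for the static list mined from a cluster.
metric_names = sorted([
    "cpu_idle", "cpu_user", "disk_free",
    "kafka.bytes_in", "mem_free", "net_in",
])
splits = equidistant_split_points(metric_names, 3)
```

With the six hypothetical names above and three regions, the split keys fall on every second name in sorted order.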

{quote}You need to run normalizer?{quote}
In a stable state, the normalizer seems to work well for us in managing the region boundaries. We
do give the user the option to disable it with a configuration setting in AMS (a precautionary
tactic on our end). All in all, we can definitely live without the normalizer: it was not
available to us until very recently, and the pre-splitting predates the normalizer setting in AMS.
The best use case for the normalizer for us is this: an Ambari user can, let's say, add a service
(for example, KAFKA) that starts writing a ton of metrics and introduces a skew where the previous
splits become irrelevant.

[~stack] / [~anoop.hbase] Thanks for the feedback.

> Provide lower bound on number of regions in region normalizer for pre-split tables
> ----------------------------------------------------------------------------------
>                 Key: HBASE-15249
>                 URL: https://issues.apache.org/jira/browse/HBASE-15249
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: HBASE-15249.v1.txt, HBASE-15249.v2.txt
> AMS (Ambari Metrics System) developer found the following scenario:
> Metrics table was pre-split with many regions on large cluster (1600 nodes).
> After some time, AMS stopped working because the region normalizer merged the regions into a few big regions which were not able to serve high read / write load.
> This is a big problem since the write requests flood the regions faster than the splits can happen, resulting in poor performance.
> We should consider setting reasonable lower bound on region count.
> If the table is pre-split, we can use initial region count as the lower bound.
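
The lower-bound idea in the description can be sketched as a simple guard on merge plans. This Python snippet is only an illustration and not the attached HBASE-15249 patch or the HBase normalizer API: `filter_merge_plans`, the plan representation (pairs of region names), and the assumption that each plan merges two distinct regions into one are all hypothetical.

```python
def filter_merge_plans(merge_plans, current_region_count, min_region_count):
    """Keep only as many merge plans as the lower bound on region count allows.

    Assumes each plan merges two regions into one, so every applied plan
    reduces the region count by exactly one.
    """
    allowed = max(0, current_region_count - min_region_count)
    return merge_plans[:allowed]


# Hypothetical example: a table pre-split into 8 regions now has 10,
# and the normalizer proposes three merges.
plans = [("r1", "r2"), ("r3", "r4"), ("r5", "r6")]
kept = filter_merge_plans(plans, current_region_count=10, min_region_count=8)
```

Here only the first two merges are kept, so the table never drops below its initial pre-split count of 8 regions.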

This message was sent by Atlassian JIRA
