hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
Date Wed, 17 Feb 2016 01:13:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149636#comment-15149636 ]

stack commented on HBASE-15249:

bq.  AMS (Ambari Metrics system), creates tables with pre-splits based on the knowledge of
how many daemons will be writing metrics to HBase and the memory available to RS.

What does the math look like [~swagle]?

bq. ....and this count dropped shortly after the system came online. 

Do you need to run the normalizer? It's a new feature that is not yet in any shipping version
of HBase, and it is off by default. An important service like AMS might try to do without it,
at least at first. Could you settle for a less aggressive set of initial splits that is some
factor of how many servers are involved, e.g. cluster node count / 10? As is, the default
is to split aggressively at first so regions fan out over the cluster. Was that not working
for you?
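
To make the "node count / 10" idea concrete, here is a minimal sketch in Python. It is not AMS or HBase code: the function names, the floor of one region, and the 16-bit key-prefix space are all assumptions made for illustration.

```python
def initial_region_count(node_count: int, divisor: int = 10) -> int:
    """Hypothetical heuristic: one initial region per `divisor` cluster nodes."""
    if node_count < 0 or divisor <= 0:
        raise ValueError("node_count must be >= 0 and divisor must be > 0")
    # Integer division, but never fewer than one region.
    return max(1, node_count // divisor)


def uniform_split_points(region_count: int, key_space: int = 2 ** 16) -> list[int]:
    """Return region_count - 1 evenly spaced boundaries inside [0, key_space).

    Assumes row keys are spread uniformly over a numeric prefix of size
    `key_space`; real metric row keys may not be.
    """
    if region_count < 1:
        raise ValueError("region_count must be >= 1")
    return [i * key_space // region_count for i in range(1, region_count)]


if __name__ == "__main__":
    regions = initial_region_count(1600)   # the cluster size from the issue
    splits = uniform_split_points(regions)
    print(regions, len(splits))
```

For the 1600-node cluster in the issue this gives 160 initial regions, which need 159 split points.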

> Provide lower bound on number of regions in region normalizer for pre-split tables
> ----------------------------------------------------------------------------------
>                 Key: HBASE-15249
>                 URL: https://issues.apache.org/jira/browse/HBASE-15249
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: HBASE-15249.v1.txt, HBASE-15249.v2.txt
> An AMS (Ambari Metrics System) developer found the following scenario:
> The metrics table was pre-split with many regions on a large cluster (1600 nodes).
> After some time, AMS stopped working because the region normalizer merged the regions into
> a few big regions which were not able to serve the high read / write load.
> This is a big problem since the write requests flood the regions faster than the splits
> can happen, resulting in poor performance.
> We should consider setting a reasonable lower bound on region count.
> If the table is pre-split, we can use the initial region count as the lower bound.
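
The proposed lower bound can be sketched as a guard on merge decisions. This Python sketch only illustrates the idea in the issue description; it is not the attached patch or the actual normalizer code, and `merge_allowed` and its parameters are hypothetical names.

```python
def merge_allowed(current_region_count: int, min_region_count: int) -> bool:
    """Return True if merging two regions keeps the table at or above the bound.

    A merge replaces two regions with one, so the table ends up with
    current_region_count - 1 regions. `min_region_count` stands for the
    lower bound, e.g. the number of regions the table was pre-split into.
    """
    if current_region_count < 1 or min_region_count < 0:
        raise ValueError("region counts must be positive")
    return current_region_count - 1 >= min_region_count


if __name__ == "__main__":
    # A table pre-split into 160 regions and still at 160: no merge.
    print(merge_allowed(160, 160))
    # After one extra split (161 regions), a single merge is fine again.
    print(merge_allowed(161, 160))
```

With this guard, a pre-split table can never be merged below its initial region count, while regions created by later splits remain eligible for merging.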

