hbase-dev mailing list archives

From "Evgeny Ryabitskiy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1442) Large number of regions slows down start and stop of HBase clusters
Date Wed, 20 May 2009 18:07:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711262#action_12711262
] 

Evgeny Ryabitskiy commented on HBASE-1442:
------------------------------------------

In the future we could move region assignment to the HRSs to make everything scalable.

 * But the question is: how do we choose which HRS should make the assignment decisions?

One easy way is to put assignment management on the HRS that holds the META regions.
So each HRS that holds at least one META region would manage assignment for all user regions
listed in that META region.

 * OK, so the HRS that holds META manages region assignment for that META, but how do we decide
where to assign those regions? To which HRS?

We can use a hash ring over the META regions to get a mapping METARegion -> HRS, like in any DHT,
but a simple variant without replication.
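To make the idea concrete, here is a minimal sketch of such a hash ring (plain consistent hashing with virtual nodes, no replication). This is not HBase code; the server and region names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

/** Minimal consistent-hash ring mapping region names onto region servers. */
public class HashRing {
    // Sorted ring of hash position -> server name.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public HashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    /** Place a server at several virtual points to smooth the distribution. */
    public void addServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    /** Remove all of a server's virtual points from the ring. */
    public void removeServer(String server) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(server + "#" + i));
        }
    }

    /** A region maps to the first server at or after its hash (wrapping around). */
    public String serverFor(String regionName) {
        if (ring.isEmpty()) throw new IllegalStateException("no servers on the ring");
        SortedMap<Long, String> tail = ring.tailMap(hash(regionName));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    /** Fold the first 8 bytes of an MD5 digest into a long ring position. */
    private static long hash(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The point of the ring is stability: when a server joins or leaves, only the regions that hashed to that server's arcs move, while the rest of the mapping stays put, which is exactly what you want when reassigning META regions among live HRSs.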


Just some thoughts that came out of my small research :) Not sure if this has already come up here...

> Large number of regions slows down start and stop of HBase clusters
> -------------------------------------------------------------------
>
>                 Key: HBASE-1442
>                 URL: https://issues.apache.org/jira/browse/HBASE-1442
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Lars George
>
> A cluster with tables that have thousands of regions takes a long time to start and shut down.
> During startup the META table is scanned, which takes a long time. At least minor compactions
> are also performed, which add to the delay since by default there is a 20 second wait between
> each compaction.
> Region assignment may also be suboptimal, assigning 10 regions at a time while unbalancing
> the servers.
> Shutting a large cluster down also takes a long time before everything is persisted and shut
> down in an orderly fashion. Times in excess of 10 minutes have been noted. It needs to be
> investigated where this time is spent.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

