hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS
Date Tue, 10 Oct 2017 07:05:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198278#comment-16198278 ]

ramkrishna.s.vasudevan commented on HBASE-18946:
------------------------------------------------

I have now found the real reason why this happens. I still need to check the other branches, which
are not on ProcV2.
Suppose we create a table with 20 regions and a replica count of 3. That creates 60 Assign procedures.
All of these assign procedures are added to a queue, 'pendingAssignQueue', in the AM.
There is a thread that executes the assignment of the regions in this queue.
Whenever all 60 regions are already in this queue when the assignment thread assigns them,
we have no problem. The Stochastic LB uses the replica concept to ensure the regions
are placed properly. The Cluster and cost functions created per 'roundRobinAssignment' call ensure
that happens.
But when the multi-threaded model executes differently, for example when the 60 regions are processed
as two batches of 45 and 15 regions, we hit this issue every time. This is because the Cluster that
roundRobinAssignment creates is not global. It is built per assignment call, so the second batch
does not see where the first batch placed the other replicas of the same regions.
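
The effect can be reproduced with a toy model. The sketch below is not HBase code: the class
ReplicaBatchDemo, its methods, the replica-major queue order and the choice of four servers are all
assumptions made for illustration. Each call to assignBatch builds its own view of the cluster,
standing in for a Cluster that is created per assignment call rather than globally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not HBase code.
class ReplicaBatchDemo {

    // Places one batch of {region, replicaId} pairs. The load and replica
    // bookkeeping are local to this call, so nothing is known about batches
    // that were placed earlier.
    public static void assignBatch(List<int[]> batch, int numServers, int[][] placement) {
        int[] load = new int[numServers];
        Map<Integer, Set<Integer>> seen = new HashMap<>(); // region -> servers used in this call
        for (int[] rr : batch) {
            Set<Integer> used = seen.computeIfAbsent(rr[0], k -> new HashSet<>());
            int best = 0;
            for (int s = 1; s < numServers; s++) {
                boolean sClash = used.contains(s);
                boolean bClash = used.contains(best);
                // Prefer a server without a replica of this region, then lower load.
                if ((bClash && !sClash) || (sClash == bClash && load[s] < load[best])) {
                    best = s;
                }
            }
            used.add(best);
            load[best]++;
            placement[rr[0]][rr[1]] = best;
        }
    }

    // Counts regions that have at least two replicas on the same server.
    public static int colocatedRegions(int[][] placement) {
        int count = 0;
        for (int[] servers : placement) {
            Set<Integer> distinct = new HashSet<>();
            for (int s : servers) {
                distinct.add(s);
            }
            if (distinct.size() < servers.length) {
                count++;
            }
        }
        return count;
    }

    // Queues all replicas (all primaries first, then each further replica id),
    // splits the queue into batches of the given sizes and assigns each batch
    // independently. The batch sizes are expected to sum to regions * replicas.
    public static int run(int regions, int replicas, int servers, int... batchSizes) {
        List<int[]> all = new ArrayList<>();
        for (int rep = 0; rep < replicas; rep++) {
            for (int reg = 0; reg < regions; reg++) {
                all.add(new int[] {reg, rep});
            }
        }
        int[][] placement = new int[regions][replicas];
        int from = 0;
        for (int size : batchSizes) {
            assignBatch(all.subList(from, from + size), servers, placement);
            from += size;
        }
        return colocatedRegions(placement);
    }

    public static void main(String[] args) {
        System.out.println("one batch of 60:   " + run(20, 3, 4, 60));
        System.out.println("batches of 45, 15: " + run(20, 3, 4, 45, 15));
    }
}
```

Under these assumptions the single batch of 60 leaves no region with two replicas on one server,
while the 45/15 split reports a non-zero count, because the second call starts from an empty view.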

> Stochastic load balancer assigns replica regions to the same RS
> ---------------------------------------------------------------
>
>                 Key: HBASE-18946
>                 URL: https://issues.apache.org/jira/browse/HBASE-18946
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha-3
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0-beta-1
>
>         Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> While trying out region replicas and their assignment, I can see that the default LB, the
> Stochastic load balancer, sometimes assigns replica regions to the same RS. This happens when we have
> 3 RS checked in and a table with 3 replicas. When an RS goes down, assigning the replicas
> to the same RS is acceptable, but when there are enough RS to spread them across, this
> behaviour is undesirable and defeats the purpose of replicas.
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
