hbase-issues mailing list archives

From "huaxiang sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS
Date Thu, 19 Oct 2017 17:59:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211449#comment-16211449 ]

huaxiang sun commented on HBASE-18946:
--------------------------------------

Thanks [~ram_krish]. One possible slowdown with this approach: if queueAll() queues
more than assignDispatchWaitQueueMaxSize regions, the current logic can still wait up to
assignDispatchWaitMillis before dispatching them. Please see

https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java#L1639.

In the previous logic, once the first region is queued, the dispatcher waits assignDispatchWaitMillis
before starting the real work. With the patch, the whole batch is added at once, so the addFirstOne
logic is skipped. I think it can be changed to avoid this case.

{code}
  private HashMap<RegionInfo, RegionStateNode> waitOnAssignQueue() {
    HashMap<RegionInfo, RegionStateNode> regions = null;

    assignQueueLock.lock();
    try {
      if (pendingAssignQueue.isEmpty() && isRunning()) {
        assignQueueFullCond.await();
      }

      if (!isRunning()) return null;
-      assignQueueFullCond.await(assignDispatchWaitMillis, TimeUnit.MILLISECONDS);
+      if (pendingAssignQueue.size() < assignDispatchWaitQueueMaxSize) {
+        assignQueueFullCond.await(assignDispatchWaitMillis, TimeUnit.MILLISECONDS);
+      }
      regions = new HashMap<RegionInfo, RegionStateNode>(pendingAssignQueue.size());
      for (RegionStateNode regionNode: pendingAssignQueue) {
        regions.put(regionNode.getRegionInfo(), regionNode);
      }
      pendingAssignQueue.clear();
    } catch (InterruptedException e) {
      LOG.warn("got interrupted ", e);
      Thread.currentThread().interrupt();
    } finally {
      assignQueueLock.unlock();
    }
    return regions;
  }

{code}
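
To try the idea outside HBase, here is a minimal stand-alone sketch of the same wait pattern. The class BatchingQueue and its field and method names are hypothetical and are not part of AssignmentManager. It assumes the producer signals only when the queue goes from empty to non-empty or reaches the maximum size, and it skips the timed wait when a full batch is already pending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the assign dispatch queue; not HBase code. */
class BatchingQueue<T> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition queueChanged = lock.newCondition();
  private final List<T> pending = new ArrayList<>();
  private final int maxBatchSize;        // plays the role of assignDispatchWaitQueueMaxSize
  private final long dispatchWaitMillis; // plays the role of assignDispatchWaitMillis

  BatchingQueue(int maxBatchSize, long dispatchWaitMillis) {
    this.maxBatchSize = maxBatchSize;
    this.dispatchWaitMillis = dispatchWaitMillis;
  }

  /** Adds a whole batch at once and wakes the consumer if it may be waiting. */
  void queueAll(List<T> items) {
    lock.lock();
    try {
      boolean wasEmpty = pending.isEmpty();
      pending.addAll(items);
      // Signal on the first item(s) or when a full batch is available.
      if (wasEmpty || pending.size() >= maxBatchSize) {
        queueChanged.signal();
      }
    } finally {
      lock.unlock();
    }
  }

  /** Blocks until work exists, then drains it; returns an empty list if interrupted. */
  List<T> waitOnQueue() {
    lock.lock();
    try {
      // Loop guards against spurious wakeups while the queue is empty.
      while (pending.isEmpty()) {
        queueChanged.await();
      }
      // Only linger for more items when the batch is not already full.
      if (pending.size() < maxBatchSize) {
        queueChanged.await(dispatchWaitMillis, TimeUnit.MILLISECONDS);
      }
      List<T> batch = new ArrayList<>(pending);
      pending.clear();
      return batch;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return new ArrayList<>();
    } finally {
      lock.unlock();
    }
  }
}
```

With a maximum batch size of 2 and a long dispatch wait, queueing three items and then calling waitOnQueue() should return all three without sitting through the timeout, which is the case the size check is meant to cover. The timed wait can still end early on a spurious wakeup, so it is a best-effort batching delay rather than a guaranteed one.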

> Stochastic load balancer assigns replica regions to the same RS
> ---------------------------------------------------------------
>
>                 Key: HBASE-18946
>                 URL: https://issues.apache.org/jira/browse/HBASE-18946
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha-3
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-18946.patch, HBASE-18946.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replicas and their assignment, I can see that sometimes the default LB
(Stochastic load balancer) assigns replica regions to the same RS. This happens when we have
3 RSs checked in and a table with 3 replicas. When an RS goes down, replicas
being assigned to the same RS is acceptable, but when we have enough RSs to spread them out, this
behaviour is undesirable and defeats the purpose of replicas.
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
