hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4053) Most of the regions were added into AssignmentManager#servers twice
Date Wed, 06 Jul 2011 16:33:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060673#comment-13060673 ]

jiraposter@reviews.apache.org commented on HBASE-4053:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/782/#review969
-----------------------------------------------------------


This change seems reasonable. One concern I have regarding the change from List to Set:
HRegionInfo's hashCode() takes into account the value of its 'offLine' member, but its
compareTo() method (which equals() relies on) does not. Perhaps you can address this
inconsistency as part of this change?

- Andrew
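
To illustrate the concern above, here is a minimal sketch. It is not taken from the HBase source: RegionKey is a hypothetical stand-in for HRegionInfo that keeps only a name and an 'offLine' flag, and it assumes equals() is defined in terms of compareTo() while hashCode() also mixes in the flag. Under those assumptions a hash-based set and a comparator-based set disagree about whether two instances are duplicates.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical stand-in for HRegionInfo; only the parts needed for the illustration.
final class RegionKey implements Comparable<RegionKey> {
  final String name;
  final boolean offLine;

  RegionKey(String name, boolean offLine) {
    this.name = name;
    this.offLine = offLine;
  }

  // Ordering ignores the offLine flag.
  @Override
  public int compareTo(RegionKey other) {
    return name.compareTo(other.name);
  }

  // Equality is defined through compareTo(), so it also ignores offLine.
  @Override
  public boolean equals(Object o) {
    return o instanceof RegionKey && compareTo((RegionKey) o) == 0;
  }

  // The hash includes offLine, which is inconsistent with equals().
  @Override
  public int hashCode() {
    return 31 * name.hashCode() + (offLine ? 1 : 0);
  }
}

final class HashEqualsDemo {
  // A HashSet compares hash codes first, so the two "equal" keys are both kept.
  static int hashSetSize() {
    Set<RegionKey> set = new HashSet<>();
    set.add(new RegionKey("region-1", false));
    set.add(new RegionKey("region-1", true));
    return set.size();
  }

  // A ConcurrentSkipListSet uses compareTo() only, so the second add is a no-op.
  static int skipListSetSize() {
    Set<RegionKey> set = new ConcurrentSkipListSet<>();
    set.add(new RegionKey("region-1", false));
    set.add(new RegionKey("region-1", true));
    return set.size();
  }
}
```

With a sorted set such as ConcurrentSkipListSet the mismatch stays hidden because hashCode() is never consulted, but it would surface as soon as the same objects are placed in a HashSet or used as HashMap keys.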


On 2011-07-06 14:27:36, Ted Yu wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/782/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-07-06 14:27:36)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  When master fails over, we should check whether hris already contains the region that addToServers() is trying to add.
bq.  But ArrayList is not the best data structure for searching for a specific HRegionInfo. Maybe we should consider replacing it with, e.g., ConcurrentSkipListSet.
bq.  
bq.  Also removes bulkAssignUserRegions() which is no longer called.
bq.  
bq.  
bq.  This addresses bug HBASE-4053.
bq.      https://issues.apache.org/jira/browse/HBASE-4053
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    /src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1142537 
bq.  
bq.  Diff: https://reviews.apache.org/r/782/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Ran test suite.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ted
bq.  
bq.



> Most of the regions were added into AssignmentManager#servers twice
> -------------------------------------------------------------------
>
>                 Key: HBASE-4053
>                 URL: https://issues.apache.org/jira/browse/HBASE-4053
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.3
>            Reporter: Jieshan Bean
>             Fix For: 0.90.4
>
>         Attachments: 4053.txt, HBase-4053-90.patch, surefire-report.html
>
>
> Here is the scenario in which the problem occurred:
> 1. When HMaster started, all regionservers checked in OK, and the count of regions out on the cluster was 10083, which is the actual number of regions.
> 2. Then OpenedRegionHandler#process received ZooKeeper events and added 9923 regions to the hris list.
>    Those 9923 regions were already in the list, but they were added again anyway.
> 3. The LoadBalancer got the wrong region count of 20006 (10083 + 9923).
> AssignmentManager#addToServers method:
> private void addToServers(final HServerInfo hsi, final HRegionInfo hri) {
>   List<HRegionInfo> hris = servers.get(hsi);
>   if (hris == null) {
>     hris = new ArrayList<HRegionInfo>();
>     servers.put(hsi, hris);
>   }
>   hris.add(hri); // Same region was double added here
> }
> logs:
> 2011-06-27 16:13:06,845 INFO org.apache.hadoop.hbase.master.ServerManager: Exiting wait on regionserver(s) to checkin; count=3, stopped=false, count of regions out on cluster=10083
> 2011-06-27 16:13:17,334 INFO org.apache.hadoop.hbase.master.AssignmentManager: Failed-over master needs to process 9923 regions in transition
> 2011-06-27 16:21:45,135 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: Balance parameter: numRegions=20006, numServers=3, max=6669, min=6668
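
The double-add described above, and the Set-based direction discussed in the review, can be sketched roughly as follows. This is an illustration under stated assumptions and not the committed patch: ServerRegions is a hypothetical class, plain String values stand in for HServerInfo and HRegionInfo, and the per-server collection is a ConcurrentSkipListSet so that adding the same region twice has no effect.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical simplification of the master's server-to-regions bookkeeping.
final class ServerRegions {
  private final Map<String, Set<String>> servers = new ConcurrentHashMap<>();

  // Adds the region to the server's set; returns false if it was already present.
  boolean addToServers(String server, String region) {
    Set<String> hris =
        servers.computeIfAbsent(server, s -> new ConcurrentSkipListSet<>());
    return hris.add(region);
  }

  // Total number of regions across all servers, as a balancer would count them.
  int numRegions() {
    int total = 0;
    for (Set<String> hris : servers.values()) {
      total += hris.size();
    }
    return total;
  }
}
```

With a List, replaying an "opened" event for a region that is already recorded inflates the total (10083 + 9923 = 20006 in the logs above); with a Set, the replay leaves the total unchanged.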

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
