hbase-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4273) java.lang.NullPointerException when a table is being disabled and HMaster restarts
Date Mon, 29 Aug 2011 23:17:37 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093297#comment-13093297
] 

Ming Ma commented on HBASE-4273:
--------------------------------

Regarding why regionLocation could be null: it comes from a createTable failure.

Here are the scenarios. Assume we are dealing with a large number of regions, so createTable,
disableTable, and enableTable can take a long time and HMaster can restart in the middle. Also,
as part of the fix in hbase-3229, the table state is set to ENABLING at the beginning of the
operation and ENABLED at the end of the operation. Previously, the table state was set to ENABLED
at the beginning of the operation.


t1: Application calls createTable. The table's state is set to ENABLING (or ENABLED without
hbase-3229).
t2: In createTable, HMaster updates .META. with the regioninfo and a null regionLocation.
t3: In createTable, before region assignment starts or finishes, HMaster shuts down. That means
certain regions will have a null regionLocation.
t4: HMaster restarts, or the other HMaster takes over. The table's state is ENABLING (or ENABLED
without hbase-3229). AssignmentManager will continue processing the enable of the table (or invoke
AssignmentManager.assignUserRegions without hbase-3229).
t5: Application calls disableTable before all the regions are fully assigned, so there are
still regions with a null regionLocation. The table's state is set to DISABLING.
t6: Before the disableTable operation finishes, HMaster restarts.

In other words:
1. With the latest trunk, a region could have a null regionLocation while the table state is
DISABLING or ENABLING.
2. Without the fix of hbase-3229, a region could have a null regionLocation while the table state
is DISABLING, ENABLING, or ENABLED.
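
A minimal sketch of why the DISABLING case is the one that matters, assuming the check quoted
in the issue description below keeps a null-location region unless its table is DISABLED or
ENABLING. The class, the TableState enum, and the method names are hypothetical stand-ins, not
HBase code.

```java
import java.util.EnumSet;
import java.util.Set;

public class NullLocationCheck {
    // Hypothetical stand-in for the table states discussed in this comment.
    enum TableState { ENABLING, ENABLED, DISABLING, DISABLED }

    // Mirrors the quoted condition (assumption): a region with a null
    // regionLocation is kept unless its table is DISABLED or ENABLING.
    static boolean keptWithNullLocation(TableState state) {
        return state != TableState.DISABLED && state != TableState.ENABLING;
    }

    // Collects the table states for which a null-location region would
    // still be added to the in-memory regions map.
    static Set<TableState> statesThatKeepNullLocation() {
        Set<TableState> kept = EnumSet.noneOf(TableState.class);
        for (TableState s : TableState.values()) {
            if (keptWithNullLocation(s)) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(statesThatKeepNullLocation());
    }
}
```

Under that assumption a DISABLING table passes the check, so its unassigned regions reach the
bulk disable path with a null server.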




Regarding how the system handles a null regionLocation, it seems to be ok.

1. As long as there is an entry in zookeeper for this RIT, eventually it should be taken care
of by AssignmentManager.processRegionInTransition and the RS.
2. There is a chance that such a region doesn't have an entry in zookeeper, for example, if
HMaster restarts before createTable starts the bulk assignment process. With the fix of hbase-3229,
the table will be in ENABLING state and thus will eventually get to ENABLED state; the region
will be assigned in the process. Prior to the fix of hbase-3229, the table could be in ENABLED
state with such a region.



A couple of suggestions on how to fix the issue. #1 should be enough. This issue raises other
questions, hence #2, #3, and #4.

1. In AssignmentManager.rebuildUserRegions, remove the following lines inside "if (regionLocation
== null)" block.

        if (false == checkIfRegionBelongsToDisabled(regionInfo)
            && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
          regions.put(regionInfo, regionLocation);
        }

2. An application can currently call disableTable while the table is in ENABLING state. That could
cause some issues: the system will try to unassign regions while regions are being assigned.
Can we only allow an application to call disableTable when the table is in ENABLED state, and
enableTable when the table is in DISABLED state?

3. After HMaster finishes initialization, it sets initialized==true. Before initialization
is done, an application can still access HMaster, given that isMasterRunning() returns true. Is
this by design, or should we wait until HMaster.isInitialized() returns true? A couple of services
need to be initialized before HMaster can accept requests.

4. Enhance hbck to report null regionLocation and to validate consistency between .META. state
and zookeeper state.
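
Suggestion #2 could be sketched roughly as follows. This is only an illustration of the
proposed rule, not existing HBase code: the class, the TableState enum, and all method names
are hypothetical.

```java
public class TableStateGuard {
    // Hypothetical stand-in for the table states named in this comment.
    enum TableState { ENABLING, ENABLED, DISABLING, DISABLED }

    // Proposed rule: disableTable is only allowed from ENABLED.
    static boolean canDisable(TableState current) {
        return current == TableState.ENABLED;
    }

    // Proposed rule: enableTable is only allowed from DISABLED.
    static boolean canEnable(TableState current) {
        return current == TableState.DISABLED;
    }

    // Returns the next state, or rejects the request if the table is
    // still in the middle of another transition.
    static TableState startDisable(TableState current) {
        if (!canDisable(current)) {
            throw new IllegalStateException("cannot disable table in state " + current);
        }
        return TableState.DISABLING;
    }

    static TableState startEnable(TableState current) {
        if (!canEnable(current)) {
            throw new IllegalStateException("cannot enable table in state " + current);
        }
        return TableState.ENABLING;
    }

    public static void main(String[] args) {
        System.out.println(startDisable(TableState.ENABLED));
        try {
            // The t5 step of the scenario above: disable during ENABLING.
            startDisable(TableState.ENABLING);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a guard like this, the t5 step in the scenario (disableTable while the table is still
ENABLING) would be rejected instead of racing with region assignment.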



Comments?
 

> java.lang.NullPointerException when a table is being disabled and HMaster restarts
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-4273
>                 URL: https://issues.apache.org/jira/browse/HBASE-4273
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>
> This bug occurs in the following scenario.
> 1. For some reason, the regionLocation isn't set in .META. table for some regions. Perhaps
> createTable didn't complete successfully.
> 2. The table of those regions is being disabled.
> 3. HMaster restarted.
> 4. At HMaster startup, it tries to transition from disabling to disabled state. It got
> the following exception.
> java.lang.NullPointerException: Passed server is null
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:581)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1093)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1040)
>         at org.apache.hadoop.hbase.master.handler.DisableTableHandler$BulkDisabler$1.run(DisableTableHandler.java:132)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> In AssignmentManager.rebuildUserRegions, it added such regions to its regions list,
>       if (regionLocation == null) {
>         // Region not being served, add to region map with no assignment
>         // If this needs to be assigned out, it will also be in ZK as RIT
>         // add if the table is not in disabled and enabling state
>         if (false == checkIfRegionBelongsToDisabled(regionInfo)
>             && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
>           regions.put(regionInfo, regionLocation);
>         }
> Perhaps, it should be
>       if (regionLocation == null) {
>         // Region not being served, add to region map with no assignment
>         // If this needs to be assigned out, it will also be in ZK as RIT
>         // add if the table is not in disabled and enabling state
>         if (true == checkIfRegionBelongsToEnabled(regionInfo)) {
>           regions.put(regionInfo, regionLocation);
>         }

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
