hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Miklos Kurucz (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-2497) ProcessServerShutdown throws NullPointerException for offline regions
Date Wed, 28 Apr 2010 13:17:36 GMT
ProcessServerShutdown throws NullPointerException for offline regions
---------------------------------------------------------------------

                 Key: HBASE-2497
                 URL: https://issues.apache.org/jira/browse/HBASE-2497
             Project: Hadoop HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.20.3
            Reporter: Miklos Kurucz


When a regionsserver dies the master can run into the following bug.

2010-04-27 17:20:37,303 DEBUG org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessServerShutdown
of dell106.cluster,60020,1272377612991
2010-04-27 17:20:37,303 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process
shutdown of server dell106.cluster,60020,1272377612991: logSplit: true, rootRescanned: true,
numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2010-04-27 17:20:01,637 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Log split
complete, meta reassignment and scanning:
2010-04-27 17:20:01,653 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanRootRegion:
process server shutdown scanning root region on 10.1.3.124
2010-04-27 17:20:01,664 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: process
server shutdown scanning root region on 10.1.3.124 finished master
2010-04-27 17:20:01,683 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions:
process server shutdown scanning .META.,,1 on 10.1.3.104:60020
2010-04-27 17:20:18,087 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions:
Exception in RetryableMetaOperation:
2010-04-27 17:20:18,118 WARN org.apache.hadoop.hbase.master.HMaster: Adding to delayed queue:
ProcessServerShutdown of dell106.cluster,60020,1272377612991
java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.master.RetryableMetaOperation.doWithRetries(RetryableMetaOperation.java:100)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown.process(ProcessServerShutdown.java:345)
        at org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:509)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:448)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:487)
        at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:461)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown.scanMetaRegion(ProcessServerShutdown.java:147)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions.call(ProcessServerShutdown.java:264)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions.call(ProcessServerShutdown.java:250)
        at org.apache.hadoop.hbase.master.RetryableMetaOperation.doWithRetries(RetryableMetaOperation.java:69)
        ... 3 more


The problem is in ProcessServerShutdown.java at line 148-149:

146	        String serverAddress = 
147	          Bytes.toString(values.getValue(CATALOG_FAMILY, SERVER_QUALIFIER));
148	        long startCode =
149	          Bytes.toLong(values.getValue(CATALOG_FAMILY, STARTCODE_QUALIFIER));
150	        String serverName = null;
151	        if (serverAddress != null && serverAddress.length() > 0) {
152	          serverName = HServerInfo.getServerName(serverAddress, startCode);
153	        }

It should be modified to:

146	        String serverAddress = 
147	          Bytes.toString(values.getValue(CATALOG_FAMILY, SERVER_QUALIFIER));
150	        String serverName = null;
151	        if (serverAddress != null && serverAddress.length() > 0) {
148	          long startCode =
149	            Bytes.toLong(values.getValue(CATALOG_FAMILY, STARTCODE_QUALIFIER));
152	          serverName = HServerInfo.getServerName(serverAddress, startCode);
153	        }

As Bytes.toLong cannot handle the null pointer returned by getValue for missing  STARTCODE_QUALIFIER
of offline regions in META.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message