Message-ID: <26602008.58271272465872029.JavaMail.jira@thor>
Date: Wed, 28 Apr 2010 10:44:32 -0400 (EDT)
From: "Jean-Daniel Cryans (JIRA)"
To: hbase-issues@hadoop.apache.org
Subject: [jira] Commented: (HBASE-2497) ProcessServerShutdown throws NullPointerException for offline regions

    [ https://issues.apache.org/jira/browse/HBASE-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861818#action_12861818 ]

Jean-Daniel Cryans commented on HBASE-2497:
-------------------------------------------

Isn't this the same as HBASE-2479
and/or HBASE-2428?

> ProcessServerShutdown throws NullPointerException for offline regions
> ---------------------------------------------------------------------
>
>                 Key: HBASE-2497
>                 URL: https://issues.apache.org/jira/browse/HBASE-2497
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.20.3
>            Reporter: Miklos Kurucz
>         Attachments: pss_diff.txt
>
>
> When a region server dies, the master can run into the following bug.
> 2010-04-27 17:20:37,303 DEBUG org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessServerShutdown of dell106.cluster,60020,1272377612991
> 2010-04-27 17:20:37,303 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of server dell106.cluster,60020,1272377612991: logSplit: true, rootRescanned: true, numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
> 2010-04-27 17:20:01,637 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Log split complete, meta reassignment and scanning:
> 2010-04-27 17:20:01,653 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanRootRegion: process server shutdown scanning root region on 10.1.3.124
> 2010-04-27 17:20:01,664 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: process server shutdown scanning root region on 10.1.3.124 finished master
> 2010-04-27 17:20:01,683 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions: process server shutdown scanning .META.,,1 on 10.1.3.104:60020
> 2010-04-27 17:20:18,087 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions: Exception in RetryableMetaOperation:
> 2010-04-27 17:20:18,118 WARN org.apache.hadoop.hbase.master.HMaster: Adding to delayed queue: ProcessServerShutdown of dell106.cluster,60020,1272377612991
> java.lang.RuntimeException: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.master.RetryableMetaOperation.doWithRetries(RetryableMetaOperation.java:100)
>         at org.apache.hadoop.hbase.master.ProcessServerShutdown.process(ProcessServerShutdown.java:345)
>         at org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:509)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:448)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:487)
>         at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:461)
>         at org.apache.hadoop.hbase.master.ProcessServerShutdown.scanMetaRegion(ProcessServerShutdown.java:147)
>         at org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions.call(ProcessServerShutdown.java:264)
>         at org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions.call(ProcessServerShutdown.java:250)
>         at org.apache.hadoop.hbase.master.RetryableMetaOperation.doWithRetries(RetryableMetaOperation.java:69)
>         ... 3 more
> The problem is in ProcessServerShutdown.java at lines 148-149:
> 146       String serverAddress =
> 147         Bytes.toString(values.getValue(CATALOG_FAMILY, SERVER_QUALIFIER));
> 148       long startCode =
> 149         Bytes.toLong(values.getValue(CATALOG_FAMILY, STARTCODE_QUALIFIER));
> 150       String serverName = null;
> 151       if (serverAddress != null && serverAddress.length() > 0) {
> 152         serverName = HServerInfo.getServerName(serverAddress, startCode);
> 153       }
> It should be modified to:
> 146       String serverAddress =
> 147         Bytes.toString(values.getValue(CATALOG_FAMILY, SERVER_QUALIFIER));
> 150       String serverName = null;
> 151       if (serverAddress != null && serverAddress.length() > 0) {
> 148         long startCode =
> 149           Bytes.toLong(values.getValue(CATALOG_FAMILY, STARTCODE_QUALIFIER));
> 152         serverName = HServerInfo.getServerName(serverAddress, startCode);
> 153       }
> This is needed because Bytes.toLong cannot handle the null returned by getValue when STARTCODE_QUALIFIER is missing, as it is for offline regions in META.
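
For readers who want to try the reordering outside of HBase, here is a minimal standalone sketch of the idea. It is not the attached pss_diff.txt patch. The class OfflineRegionGuard, the helpers toLong, toStringOrNull and serverNameFor, the map keys, and the sample values are all hypothetical stand-ins. The map plays the role of one META row, toLong imitates the null behaviour of Bytes.toLong described above, and the server-name format only imitates the one seen in the log.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HBase.
final class OfflineRegionGuard {

    // Stand-in for Bytes.toLong: decodes 8 big-endian bytes and, like the
    // method in the stack trace, throws NullPointerException on null input.
    static long toLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Stand-in for Bytes.toString: assumed here to pass null through.
    static String toStringOrNull(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    // Mirrors the proposed ordering: the start code is decoded only after
    // the server address is known to be present.
    static String serverNameFor(Map<String, byte[]> row) {
        String serverAddress = toStringOrNull(row.get("server"));
        String serverName = null;
        if (serverAddress != null && serverAddress.length() > 0) {
            long startCode = toLong(row.get("serverstartcode"));
            // Assumed format, imitating "host,port,startcode" from the log.
            serverName = serverAddress.replace(':', ',') + "," + startCode;
        }
        return serverName;
    }

    public static void main(String[] args) {
        // Offline region: neither cell is present, so nothing is decoded.
        Map<String, byte[]> offline = new HashMap<>();
        System.out.println(serverNameFor(offline));

        // Assigned region: both cells are present.
        Map<String, byte[]> online = new HashMap<>();
        online.put("server", "10.1.3.104:60020".getBytes(StandardCharsets.UTF_8));
        online.put("serverstartcode",
                ByteBuffer.allocate(8).putLong(1272377612991L).array());
        System.out.println(serverNameFor(online));
    }
}
```

Note that this ordering only protects rows where the server address is missing as well. A row that has an address but no start code would still hit the NullPointerException, so an explicit null check on the start code may also be worth considering.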