Date: Fri, 27 Mar 2015 19:47:53 +0000 (UTC)
From: "Jimmy Xiang (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-13337) Table regions are not assigning back, after restarting all regionservers at once.

    [ https://issues.apache.org/jira/browse/HBASE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384489#comment-14384489 ]

Jimmy Xiang commented on HBASE-13337:
-------------------------------------

For a graceful shutdown, there is no need for log splitting. The regions on the dead server are still re-assigned by SSH (ServerShutdownHandler) if the master is not restarted.

> Table regions are not assigning back, after restarting all regionservers at once.
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-13337
>                 URL: https://issues.apache.org/jira/browse/HBASE-13337
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 2.0.0
>            Reporter: Y. SREENIVASULU REDDY
>            Priority: Blocker
>             Fix For: 2.0.0
>
>
> Regions of the table are continuously in state=FAILED_CLOSE.
> {noformat}
> Region State RIT time (ms)
> 8f62e819b356736053e06240f7f7c6fd t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113929
> caf59209ae65ea80fca6bdc6996a7d68 t1,dddddddd,1427362431330.caf59209ae65ea80fca6bdc6996a7d68. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM2,16040,1427362533691 113929
> db52a74988f71e5cf257bbabf31f26f3 t1,44444444,1427362431330.db52a74988f71e5cf257bbabf31f26f3. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM3,16040,1427362533691 113920
> 43f3a65b9f9ff283f598c5450feab1f8 t1,88888888,1427362431330.43f3a65b9f9ff283f598c5450feab1f8. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113920
> {noformat}
> *Steps to reproduce:*
> 1. Start an HBase cluster with more than one regionserver.
> 2. Create a table with pre-created regions (let's say 15 regions).
> 3. Make sure the regions are well balanced.
> 4. Restart all the Regionserver processes at once across the cluster, leaving the HMaster process running.
> 5. After the restart, the Regionservers successfully connect to the HMaster.
> *Bug:*
> But no regions are assigned back to the Regionservers.
> *Master log shows as follows:*
> {noformat}
> 2015-03-26 15:05:36,201 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818}
> 2015-03-26 15:05:36,202 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=PENDING_OPEN&sn=VM1,16040,1427362531818
> 2015-03-26 15:05:36,244 DEBUG [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Force region state offline {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818}
> 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818}
> 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=PENDING_CLOSE
> 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=1 of 10
> 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=2 of 10
> 2015-03-26 15:05:36,249 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=3 of 10
> 2015-03-26 15:05:36,249 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=4 of 10
> 2015-03-26 15:05:36,249 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=5 of 10
> 2015-03-26 15:05:36,250 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=6 of 10
> 2015-03-26 15:05:36,250 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=7 of 10
> 2015-03-26 15:05:36,250 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=8 of 10
> 2015-03-26 15:05:36,251 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=9 of 10
> 2015-03-26 15:05:36,251 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=10 of 10
> 2015-03-26 15:05:36,251 WARN [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Failed to open/close 8f62e819b356736053e06240f7f7c6fd on VM1,16040,1427362531818, set to FAILED_CLOSE
> 2015-03-26 15:05:36,251 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=FAILED_CLOSE, ts=1427362536251, server=VM1,16040,1427362531818}
> 2015-03-26 15:05:36,251 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=FAILED_CLOSE
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
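For anyone triaging a similar master log, the state chain and retry count in the excerpt quoted above can be pulled out mechanically. The following is only a rough sketch: `summarize`, `TRANSITION`, `RETRY` and `LOG` are hypothetical names for illustration, not part of HBase, and the regular expressions assume the exact log wording shown in this report. The sample lines are copied from the quoted master log.

```python
import re

# Matches the "Transition {...} to {...}" lines written by master.RegionStates
# (assumes the log wording quoted in this report).
TRANSITION = re.compile(
    r"Transition \{(?P<region>\w+) state=(?P<old>\w+),.*?\} "
    r"to \{(?P=region) state=(?P<new>\w+),"
)
# Matches the retry counter on the ClosedChannelException lines.
RETRY = re.compile(r"ClosedChannelException for .*?, try=(?P<n>\d+) of (?P<max>\d+)")


def summarize(lines):
    """Return (state chain, highest retry number seen) for the given log lines."""
    chain = []
    max_try = 0
    for line in lines:
        m = TRANSITION.search(line)
        if m:
            if not chain:
                chain.append(m.group("old"))  # starting state of the first transition
            chain.append(m.group("new"))
            continue
        r = RETRY.search(line)
        if r:
            max_try = max(max_try, int(r.group("n")))
    return chain, max_try


# Lines copied from the quoted master log (a subset, for brevity).
LOG = [
    "2015-03-26 15:05:36,201 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818}",
    "2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818}",
    "2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=1 of 10",
    "2015-03-26 15:05:36,251 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,55555555,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=10 of 10",
    "2015-03-26 15:05:36,251 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=FAILED_CLOSE, ts=1427362536251, server=VM1,16040,1427362531818}",
]

if __name__ == "__main__":
    chain, max_try = summarize(LOG)
    print(" -> ".join(chain), "| retries:", max_try)
```

On these lines the chain reads OFFLINE, PENDING_OPEN, PENDING_CLOSE, FAILED_CLOSE, with all ten close attempts logged between 15:05:36,248 and 15:05:36,251.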