Date: Fri, 23 Jan 2015 08:40:35 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-12791) HBase does not attempt to clean up an aborted split when the regionserver shutting down

    [ https://issues.apache.org/jira/browse/HBASE-12791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288975#comment-14288975 ]

Hudson commented on HBASE-12791:
--------------------------------

FAILURE: Integrated in HBase-1.1 #100 (See [https://builds.apache.org/job/HBase-1.1/100/])
HBASE-12791 HBase does not attempt to clean up an aborted split when the regionserver shutting down-addendum(Rajeshbabu) (rajeshbabu: rev d21ea4e57071115deb597e31163a28264d47f89f)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java


> HBase does not attempt to clean up an aborted split when the regionserver shutting down
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-12791
>                 URL: https://issues.apache.org/jira/browse/HBASE-12791
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.0
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Rajeshbabu Chintaguntla
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
>         Attachments: HBASE-12791.patch, HBASE-12791_98.patch, HBASE-12791_98_v2.patch, HBASE-12791_98_v3.patch, HBASE-12791_addendum.patch, HBASE-12791_branch1.patch, HBASE-12791_branch1_v2.patch, HBASE-12791_branch1_v3.patch, HBASE-12791_v2.patch, HBASE-12791_v3.patch, HBASE-12791_v4.patch, HBASE-12791_v4.patch, HBASE-12791_v5.patch, HBASE-12791_v6.patch, HBASE-12791_v6.patch
>
>
> HBase does not clean up the daughter region directories from HDFS if the region server shuts down after creating them during a split.
> Here are the logs.
> -> RS shut down after creating the daughter regions.
> {code}
> 2014-12-31 09:05:41,406 DEBUG [regionserver60020-splits-1419996941385] zookeeper.ZKAssign: regionserver:60020-0x14a9701e53100d1, quorum=localhost:2181, baseZNode=/hbase Transitioned node 80c665138d4fa32da4d792d8ed13206f from RS_ZK_REQUEST_REGION_SPLIT to RS_ZK_REQUEST_REGION_SPLIT
> 2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Closing t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.: disabling compactions & flushes
> 2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Updates disabled for region t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
> 2014-12-31 09:05:41,516 INFO [StoreCloserThread-t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.-1] regionserver.HStore: Closed f
> 2014-12-31 09:05:41,518 INFO [regionserver60020-splits-1419996941385] regionserver.HRegion: Closed t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
> 2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for table t dd9731ee43b104da565257ca1539aa8c
> 2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Instantiated t,,1419996941401.dd9731ee43b104da565257ca1539aa8c.
> 2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for table t 2e40a44511c0e187d357d651f13a1dab
> 2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Instantiated t,row2,1419996941401.2e40a44511c0e187d357d651f13a1dab.
> Wed Dec 31 09:06:30 IST 2014 Terminating regionserver
> 2014-12-31 09:06:30,465 INFO [Thread-8] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@42d2282e
> {code}
> -> Rollback is skipped if the RS is stopped or stopping, so we end up with dirty daughter regions in HDFS.
> {code}
> 2014-12-31 09:07:49,547 INFO [regionserver60020-splits-1419996941385] regionserver.SplitRequest: Skip rollback/cleanup of failed split of t,,1419996880699.80c665138d4fa32da4d792d8ed13206f. because server is stopped
> java.io.InterruptedIOException: Interrupted after 0 tries on 350
> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:156)
> {code}
> Because of this, hbck always shows inconsistencies.
> {code}
> ERROR: Region { meta => null, hdfs => hdfs://localhost:9000/hbase/data/default/t/2e40a44511c0e187d357d651f13a1dab, deployed => } on HDFS, but not listed in hbase:meta or deployed on any region server
> ERROR: Region { meta => null, hdfs => hdfs://localhost:9000/hbase/data/default/t/dd9731ee43b104da565257ca1539aa8c, deployed => } on HDFS, but not listed in hbase:meta or deployed on any region server
> {code}
> If we try to repair, we end up with overlapping regions in hbase:meta, and both the daughter regions and the parent are online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
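
For triage, the orphaned daughter directories named in the hbck errors quoted above can be extracted mechanically. The following is a minimal sketch in Java, assuming the hbck output has already been captured as lines of text. The class and method names (OrphanRegionDirs, orphanDirs) are hypothetical and not part of HBase, and the regular expression is fitted only to the error format shown in this report, so other hbck versions may need a different pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: collects the HDFS paths of regions
// that hbck reports as present on HDFS but missing from hbase:meta.
final class OrphanRegionDirs {

    // Assumption: the line format is exactly the one quoted in this report.
    // An optional leading "> " is accepted so quoted email text also matches.
    private static final Pattern ORPHAN = Pattern.compile(
        "^(?:>\\s*)?ERROR: Region \\{ meta => null, hdfs => (\\S+), deployed => .*\\}"
        + " on HDFS, but not listed in hbase:meta");

    static List<String> orphanDirs(List<String> hbckLines) {
        List<String> dirs = new ArrayList<>();
        for (String line : hbckLines) {
            Matcher m = ORPHAN.matcher(line.trim());
            if (m.find()) {
                dirs.add(m.group(1)); // the value of the "hdfs =>" field
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "ERROR: Region { meta => null, hdfs => hdfs://localhost:9000/hbase/data/default/t/2e40a44511c0e187d357d651f13a1dab, deployed => } on HDFS, but not listed in hbase:meta or deployed on any region server",
            "ERROR: Region { meta => null, hdfs => hdfs://localhost:9000/hbase/data/default/t/dd9731ee43b104da565257ca1539aa8c, deployed => } on HDFS, but not listed in hbase:meta or deployed on any region server");
        // Print one orphaned directory per line.
        for (String dir : orphanDirs(sample)) {
            System.out.println(dir);
        }
    }
}
```

The sketch only lists candidate directories. Whether a listed directory is safe to remove still depends on the state of the parent region, which is the inconsistency this issue describes.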