Date: Fri, 9 Dec 2011 03:09:43 +0000 (UTC)
From: "chunhui shen (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <388050250.56225.1323400183996.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <699480670.56223.1323400183602.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4988) MetaServer crash cause all splitting regionserver abort

[
https://issues.apache.org/jira/browse/HBASE-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165805#comment-13165805 ]

chunhui shen commented on HBASE-4988:
-------------------------------------

logs
{code}
2011-12-07 17:49:17,737 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined writetest,28TPVACCO3EI47TH472E1997TX1ZDFQ7XUCMBA2LUKOD7G0U3NQ2L2FG0ILRGZ5ETHFESE5QIMFN8ONUDUXB80G7MEK58G7YM4EG,1323251351741.6399c204b8d45568a782fd0157d6700d.; next sequenceid=3483318538
2011-12-07 17:49:17,737 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for writetest,2FB\xC0EE\xC2LDFG\xC8\xB6GV\xCE\xC6F4<\xBBE\xC87BM\xC0\xCD\xC3\xC8A\xB3\xCE\xD5G\xCBI\xBA\xBB\xCB\xD7R\xD2=\xC5>2U;P\xD2D\xCD\xBA\xC6\xC6A\xC1KI\xCDND\xC8\xCEKG\xC3\xCC\xCD\xB4\xC1=\xD0\xC4\xD2FSSPE\xD0V\xCE5@\xBCCN\xC4\xCB\xBE7L\xC8E\xC1\xBD\xCFH,1323251351741.a639e2eda8b2de9ca368c1a13ebbcb44. because Region has references on open; priority=16, compaction queue size=1
2011-12-07 17:49:17,737 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of .META.,,1 at address=dw83.kgb.sqa.cm4:60020; java.io.EOFException
2011-12-07 17:49:17,737 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Current cached META location is not valid, resetting
2011-12-07 17:49:17,740 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of .META.,,1 at address=dw83.kgb.sqa.cm4:60020; java.net.ConnectException: Connection refused
2011-12-07 17:49:17,740 WARN org.apache.hadoop.hbase.regionserver.CompactSplitThread: Running rollback of failed split of writetest,28TPVACCO3EI47TH472E1997TX1ZDFQ7XUCMBA2LUKOD7G0U3NQ2L2FG0ILRGZ5ETHFESE5QIMFN8ONUDUXB80G7MEK58G7YM4EG,1323240352298.c7bde4437e5b12bc7226485dcbc2700b.; Timed out (2147483647ms)
2011-12-07 17:49:17,740 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=dw87.kgb.sqa.cm4,60020,1323244700069, load=(requests=393, regions=12,
usedHeap=742, maxHeap=15872): Abort; we got an error after point-of-no-return
{code}

> MetaServer crash cause all splitting regionserver abort
> -------------------------------------------------------
>
>                 Key: HBASE-4988
>                 URL: https://issues.apache.org/jira/browse/HBASE-4988
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>
> If the meta server crashes now, every region server that is in the middle of a split will abort itself,
> because of this code:
> {code}
> this.journal.add(JournalEntry.PONR);
> MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>   this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
> {code}
> If the journal contains the PONR entry, the split's rollback aborts the region server.
> This is terrible in a heavy-write environment when the meta server crashes.
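
The failure path described above can be modeled with a small, self-contained sketch. This is not HBase's code: the class `SplitJournalSketch`, the `Outcome` enum, the reduced set of journal entries, and the two boolean failure switches are all hypothetical names introduced for illustration. The only thing taken from the report is the ordering, where PONR is journaled before the .META. edit is attempted, so a meta server outage at that step leaves rollback with no choice but to abort.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of a split journal; not the HBase implementation. */
public class SplitJournalSketch {

    /** Reduced set of journal steps, assumed for this sketch. */
    enum JournalEntry { CREATE_SPLIT_DIR, CLOSED_PARENT_REGION, PONR }

    /** What the caller ends up doing after the split attempt. */
    enum Outcome { COMPLETED, ROLLED_BACK, ABORT_SERVER }

    private final List<JournalEntry> journal = new ArrayList<>();

    /** Runs the split steps; the two flags simulate failures at different points. */
    void execute(boolean closeParentFails, boolean metaServerDown) throws IOException {
        journal.add(JournalEntry.CREATE_SPLIT_DIR);
        if (closeParentFails) {
            throw new IOException("failed closing parent region");
        }
        journal.add(JournalEntry.CLOSED_PARENT_REGION);

        // Same ordering as the snippet in the report: PONR is recorded first,
        // then the .META. edit is attempted.
        journal.add(JournalEntry.PONR);
        if (metaServerDown) {
            throw new IOException("Connection refused");
        }
    }

    /** Walks the journal backwards; returns false once PONR has been recorded. */
    boolean rollback() {
        for (int i = journal.size() - 1; i >= 0; i--) {
            switch (journal.get(i)) {
                case PONR:
                    return false; // past the point of no return: cannot undo
                default:
                    break;        // earlier steps would be undone here
            }
        }
        return true;
    }

    static Outcome runSplit(boolean closeParentFails, boolean metaServerDown) {
        SplitJournalSketch split = new SplitJournalSketch();
        try {
            split.execute(closeParentFails, metaServerDown);
            return Outcome.COMPLETED;
        } catch (IOException e) {
            return split.rollback() ? Outcome.ROLLED_BACK : Outcome.ABORT_SERVER;
        }
    }

    public static void main(String[] args) {
        System.out.println(runSplit(false, false)); // COMPLETED
        System.out.println(runSplit(true, false));  // ROLLED_BACK
        System.out.println(runSplit(false, true));  // ABORT_SERVER
    }
}
```

In this model a failure before PONR is recoverable, while a failure of the meta edit alone is enough to take the whole server down, which matches the FATAL line in the logs.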