Message-ID: <1819185920.1242671865573.JavaMail.jira@brutus>
Date: Mon, 18 May 2009 11:37:45 -0700 (PDT)
From: "stack (JIRA)"
To: hbase-dev@hadoop.apache.org
Reply-To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-1421) Processing a regionserver message -- OPEN, CLOSE, SPLIT, etc. -- and if we're carrying more than one message in payload, if exception, all messages that follow are dropped on floor
In-Reply-To: <1810337789.1242246045556.JavaMail.jira@brutus>


    [ https://issues.apache.org/jira/browse/HBASE-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710438#action_12710438 ]

stack commented on HBASE-1421:
------------------------------

MSG_REPORT_PROCESS_OPEN doesn't seem to do anything master-side any more (I thought it used to update timers on master-side)? This means that if the open message is lost, we seemingly don't try the open again? Seems broke.

Otherwise, I went through the processing of messages returned by the regionserver and tried to remove all places where we threw exceptions, unchecked exceptions in particular. I also changed the process message signatures so they don't throw even IOExceptions. Instead we just log warnings, since most of the time these are non-fatal anyway, and even if they are damaging, we probably want to keep going with a warning log rather than throw an exception that can possibly do even more damage.

> Processing a regionserver message -- OPEN, CLOSE, SPLIT, etc. -- and if we're carrying more than one message in payload, if exception, all messages that follow are dropped on floor
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1421
>                 URL: https://issues.apache.org/jira/browse/HBASE-1421
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 1421.patch
>
>
> Just saw this in pset cluster.  Marking blocker.
> We had an incidence of HBASE-1344 on our 0.19.x era hbase cluster.  The report from the regionserver was carrying at least two open messages.  The first provoked the exception, the second open message was never processed.  Regionserver thought it had successfully opened region.  Master didn't know anything about it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
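
For readers following the thread, here is a minimal sketch of the log-and-continue idea described in the comment: each message in a payload is handled inside its own try/catch, so a failure on one message cannot drop the messages that follow it. This is not the code in 1421.patch; the class name, the `processAll` method, and the use of plain strings for messages are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration only; not the HBase master code. */
public class MessageBatchSketch {

    /**
     * Hands each message to the handler in order. A failure on one message is
     * recorded as a warning and the loop moves on, so later messages in the
     * same payload are still processed.
     *
     * @return the warnings collected while processing the batch
     */
    public static List<String> processAll(List<String> messages, Consumer<String> handler) {
        List<String> warnings = new ArrayList<>();
        for (String msg : messages) {
            try {
                handler.accept(msg);
            } catch (RuntimeException e) {
                // Log and continue: an unchecked exception must not abort the batch.
                warnings.add("Failed processing " + msg + ": " + e.getMessage());
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        List<String> opened = new ArrayList<>();
        // Two open messages in one report; the first one fails (simulated).
        List<String> warnings = processAll(
            Arrays.asList("OPEN regionA", "OPEN regionB"),
            msg -> {
                if (msg.endsWith("regionA")) {
                    throw new IllegalStateException("simulated failure");
                }
                opened.add(msg);
            });
        System.out.println("processed: " + opened);
        System.out.println("warnings: " + warnings);
    }
}
```

In this sketch the second open message is still handled even though the first one throws, which is the behavior the issue asks for. Whether a given failure should only be logged or should also trigger a retry is a separate decision that the sketch does not address.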