Date: Fri, 14 Mar 2014 02:11:46 +0000 (UTC)
From: "Lars Hofhansl (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-9740) A corrupt HFile could cause endless attempts to assign the region without a chance of success

     [ https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-9740:
---------------------------------
    Fix Version/s:     (was: 0.94.18)
                   0.94.19

> A corrupt HFile could cause endless attempts to assign the region without a chance of success
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9740
>                 URL: https://issues.apache.org/jira/browse/HBASE-9740
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.16
>            Reporter: Aditya Kishore
>            Assignee: Ping
>             Fix For: 0.94.19
>
>         Attachments: HBase-9749_0.94_v2.patch, HBase-9749_0.94_v3.patch, patch-9740_0.94.txt
>
>
> As described in HBASE-9737, a corrupt HFile in a region could lead to an assignment storm in the cluster, since the Master will keep trying to assign the region to each region server, one after another, and obviously none will succeed.
> The region server, upon detecting such a scenario, should mark the region as "RS_ZK_REGION_FAILED_ERROR" (or something to that effect) in ZooKeeper, which should tell the Master to stop assigning the region until the error has been resolved (via an HBase shell command, probably "assign"?)
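
The proposal in the description can be illustrated with a toy model. This is only a sketch under stated assumptions, not HBase code: `ToyCluster`, `open_region`, `assign`, the `force` flag, and the dict standing in for ZooKeeper are all hypothetical names, and `RS_ZK_REGION_FAILED_ERROR` is just the state name suggested in the issue text. It shows the region server flagging a region it cannot open and the Master declining further assignment until an operator explicitly asks again.

```python
from dataclasses import dataclass, field

# State name suggested in the issue text; not a real HBase constant.
FAILED_ERROR = "RS_ZK_REGION_FAILED_ERROR"


class CorruptHFileError(Exception):
    """Raised by the toy region server when a store file cannot be read."""


@dataclass
class ToyCluster:
    servers: list
    corrupt_regions: set
    # Plain dict standing in for per-region state kept in ZooKeeper.
    znode_state: dict = field(default_factory=dict)
    open_attempts: int = 0

    def open_region(self, server, region):
        """Region-server side: try to open; flag the region on corruption."""
        self.open_attempts += 1
        if region in self.corrupt_regions:
            self.znode_state[region] = FAILED_ERROR
            raise CorruptHFileError(f"{server}: corrupt HFile in {region}")
        self.znode_state[region] = "OPENED"

    def assign(self, region, force=False):
        """Master side: return the hosting server, or None if not assigned."""
        if self.znode_state.get(region) == FAILED_ERROR and not force:
            return None  # flagged: wait for an operator instead of retrying
        for server in self.servers:
            try:
                self.open_region(server, region)
            except CorruptHFileError:
                return None  # flag is now set; other servers cannot help
            return server
        return None
```

With three servers and one corrupt region, the first `assign` call makes a single open attempt and sets the flag, later calls return immediately without contacting any server, and `assign(region, force=True)` after the file has been repaired plays the role of the manual shell command mentioned in the description.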