From: Ted Yu
Subject: Re: Corrupted log file
Date: Sun, 5 May 2013 06:01:30 -0700
To: "dev@hbase.apache.org"
Cc: "dev@hbase.apache.org"

Looks like breaking the loop is the better choice.

Cheers

On May 5, 2013, at 4:38 AM, Jean-Marc Spaggiari wrote:

> Ok, I will open a JIRA for that later today...
>
> On the RS side, should we break the loop? Or kill the server? Not
> being able to read the log might leave the RS in an inconsistent state.
>
> JM
> On May 1, 2013 at 14:51, "Ted Yu" wrote:
>
>> Ideally HBCK should sideline the corrupted log file so that the region
>> server can start.
>>
>> Cheers
>>
>> On Wed, May 1, 2013 at 11:48 AM, Nick Dimiduk wrote:
>>
>>> Detecting the condition, printing the warning, and breaking the loop
>>> sounds like an urgent bandaid solution to me.
>>>
>>> -n
>>>
>>> On Tue, Apr 30, 2013 at 11:51 AM, Jean-Marc Spaggiari <
>>> jean-marc@spaggiari.org> wrote:
>>>
>>>> When a log file (in /hbase/.logs) is corrupted, HBase is not able to
>>>> start because it tries to read it again and again.
>>>>
>>>> Also, there is nothing in HBCK to detect that.
>>>>
>>>> Should we have something to check that? For example, hbck could simply
>>>> try to open the log file and read it, then report a warning.
>>>>
>>>> JM
>>
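
For illustration only, here is a rough sketch of the "detect, warn, break the loop, sideline" idea discussed in this thread. It is not HBase code: the length-prefixed record format, the function names `read_records` and `replay_logs`, the `max_attempts` parameter, and the sideline directory are all assumptions made up for the example. It only shows the control flow: retry a bounded number of times, log a warning on each failure, then move the unreadable file aside and carry on instead of retrying forever.

```python
import logging
import shutil
import struct
import tempfile
from pathlib import Path

log = logging.getLogger("log_check")


def read_records(path):
    """Parse hypothetical length-prefixed records; raise ValueError if truncated."""
    data = Path(path).read_bytes()
    records, pos = [], 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError(f"truncated length header at offset {pos}")
        (length,) = struct.unpack_from(">I", data, pos)
        pos += 4
        if pos + length > len(data):
            raise ValueError(f"truncated record body at offset {pos}")
        records.append(data[pos:pos + length])
        pos += length
    return records


def replay_logs(log_dir, sideline_dir, max_attempts=3):
    """Read every log file; sideline any file that fails max_attempts times."""
    replayed, sidelined = [], []
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                read_records(path)
                replayed.append(path.name)
                break  # file was readable, stop retrying
            except (OSError, ValueError) as err:
                log.warning("attempt %d/%d failed for %s: %s",
                            attempt, max_attempts, path, err)
        else:
            # All attempts failed: give up on this file instead of looping
            # forever, and move it aside so startup can continue.
            Path(sideline_dir).mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(Path(sideline_dir) / path.name))
            sidelined.append(path.name)
    return replayed, sidelined


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    logs = root / "logs"
    logs.mkdir()
    (logs / "good.log").write_bytes(struct.pack(">I", 3) + b"abc")
    (logs / "bad.log").write_bytes(struct.pack(">I", 10) + b"abc")  # truncated
    print(replay_logs(logs, root / "sideline"))
```

Whether a region server should continue after sidelining a log, or stop instead, is exactly the open question raised above; the sketch takes the "break the loop" option only because that is the one the thread settles on.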