From: Jerry Lam
Subject: Re: Recovering corrupt HLog files
Date: Sat, 30 Jun 2012 21:55:00 -0400
To: "user@hbase.apache.org"
Cc: "user@hbase.apache.org"

This is interesting because I have seen this happen in the past. Can WALPlayer be backported to 0.90.x?

Best Regards,

Jerry

Sent from my iPad

On 2012-06-30, at 16:34, Li Pi wrote:

> Nope. It came out in 0.94 otoh.
>
> On Sat, Jun 30, 2012 at 12:29 PM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
>
>> I should have mentioned in my initial email that I am operating on HBase
>> 0.90.4. Is WALPlayer available in this version? I am having trouble
>> finding it or anything similar.
>>
>> On Sat, Jun 30, 2012 at 1:14 PM, Li Pi wrote:
>>
>>> WALPlayer will look at the timestamp. Replaying an older edit that has
>>> since been overwritten shouldn't change anything.
>>>
>>> On Sat, Jun 30, 2012 at 9:49 AM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
>>>
>>>> They are all pretty large, around 40+mb. Will the walplayer be smart
>>>> enough to only write edits that still look relevant (i.e. based on
>>>> timestamps of the edits vs timestamps of the versions in hbase)? Writes
>>>> have been coming in since we recovered.
>>>>
>>>> On Sat, Jun 30, 2012 at 11:05 AM, Stack wrote:
>>>>
>>>>> On Sat, Jun 30, 2012 at 8:38 AM, Bryan Beaudreault wrote:
>>>>>> 12/06/30 00:00:48 INFO wal.HLogSplitter: Got while parsing hlog
>>>>>> hdfs://my-namenode-ip-addr:8020/hbase/.logs/my-rs-ip-addr,60020,1338667719591/my-rs-ip-addr%3A60020.1340935453874.
>>>>>> Marking as corrupted
>>>>>>
>>>>>
>>>>> What size do these logs have?
>>>>>
>>>>>> We are back to stable operating now, and in trying to research this I found
>>>>>> the hdfs://my-namenode-ip-addr:8020/hbase/.corrupt directory. There are 20
>>>>>> files listed there.
>>>>>>
>>>>>
>>>>> Ditto.
>>>>>
>>>>>> What are our options for tracking down and potentially recovering any data
>>>>>> that was lost. Or how can we even tell what was lost, if any? Does the
>>>>>> existence of these files pretty much guarantee data lost? There doesn't
>>>>>> seem to be much documentation on this. From reading it seems like it might
>>>>>> be possible that part of each of these files was recovered.
>>>>>>
>>>>>
>>>>> If size > 0, could try walplaying them:
>>>>> http://hbase.apache.org/book.html#walplayer
>>>>>
>>>>> St.Ack
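
The timestamp point made in the thread (replaying an older edit that has since been overwritten shouldn't change anything) can be illustrated with a toy model. The sketch below is only an assumption-laden illustration in Python: `ToyVersionedStore`, its methods, and the sample row/column names are all hypothetical and are not part of HBase or WALPlayer. It ignores deletes, version limits, and compactions, and shows only the case where a newer version already exists.

```python
from collections import defaultdict


class ToyVersionedStore:
    """Toy stand-in for timestamped cell versions; NOT the HBase API."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        # Writing the same (row, column, timestamp) again overwrites only
        # that one version, so replaying an edit is idempotent.
        self._cells[(row, column)][timestamp] = value

    def get(self, row, column):
        # A read returns the version with the highest timestamp.
        versions = self._cells.get((row, column))
        if not versions:
            return None
        return versions[max(versions)]


store = ToyVersionedStore()
store.put("row1", "cf:a", 100, "old")  # edit that was recorded in the log
store.put("row1", "cf:a", 200, "new")  # later write, after recovery
before = store.get("row1", "cf:a")

store.put("row1", "cf:a", 100, "old")  # replay the older logged edit
after = store.get("row1", "cf:a")

print(before, after)  # prints "new new"
```

In this model the replayed edit lands at its original timestamp, so the newer write still wins on read. An edit that was lost and never overwritten would instead become visible again after replay, which is the recovery case discussed above.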