From: Jonathan Bender
Date: Wed, 27 Apr 2011 08:28:41 -0700
Subject: Re: HDFS reports corrupted blocks after HBase reinstall
Cc: user@hbase.apache.org

So it's definitely a case of HDFS not being able to recover the image.
Maybe this is better directed toward another list, but has anyone had
issues with this, or any suggestions for trying to eradicate this?

2011-04-26 17:15:56,898 INFO org.apache.hadoop.hdfs.server.common.Storage: Recovering storage directory /var/lib/hadoop-0.20/cache/hadoop/dfs/name from failed checkpoint.
2011-04-26 17:15:56,905 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 204
2011-04-26 17:15:57,020 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
2011-04-26 17:15:57,021 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 26833 loaded in 0 seconds.
2011-04-26 17:15:57,257 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Invalid opcode, reached end of edit log Number of transactions found 528
2011-04-26 17:15:57,258 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/edits of size 1049092 edits # 528 loaded in 0 seconds.
2011-04-26 17:15:57,265 ERROR org.apache.hadoop.hdfs.server.common.Storage: Unable to save image for /var/lib/hadoop-0.20/cache/hadoop/dfs/name
java.io.IOException: saveLeases found path /hbase/base_tmp/.logs/sv004.my.domain.com,60020,1302882411768/sv004.my.domain.com%3A60020.1302882412951 but no matching entry in namespace.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:5153)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1071)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1170)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1118)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:347)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:321)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:267)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:461)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1202)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1211)
2011-04-26 17:15:57,273 WARN org.apache.hadoop.hdfs.server.common.Storage: FSImage:processIOError: removing storage: /var/lib/hadoop-0.20/cache/hadoop/dfs/name
2011-04-26 17:15:57,274 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 1553 msecs

On Tue, Apr 26, 2011 at 5:19 PM, Jonathan Bender wrote:
> Wow, this is more intense than I thought...as soon as I load HBase again,
> my HDFS filesystem essentially reverts to an older snapshot. As in, I
> don't see any of the changes I had made since that time, in the hbase
> table or otherwise.
>
> I'm using CDH3 beta 4, which I believe stores its local hbase data in a
> different directory--not entirely sure where though.
>
> I'm not entirely sure what happened to mess this up, but it seems pretty
> serious.
>
> On Tue, Apr 26, 2011 at 5:07 PM, Himanshu Vashishtha <
> hvashish@cs.ualberta.ca> wrote:
>
>> Could it be the /tmp/hbase- directory that is playing the culprit?
>> Just a wild guess though.
>>
>> On Tue, Apr 26, 2011 at 5:56 PM, Jean-Daniel Cryans wrote:
>>
>>> Unless HBase was running when you wiped that out (and even then), I
>>> don't see how this could happen. Could you match those blocks to the
>>> files using fsck and figure out when the files were created and
>>> whether they were part of the old install?
>>>
>>> Thx,
>>>
>>> J-D
>>>
>>> On Tue, Apr 26, 2011 at 4:53 PM, Jonathan Bender wrote:
>>> > Hi all, I'm having a strange error which I can't exactly figure out.
>>> >
>>> > After wiping my /hbase HDFS directory to do a fresh install, I am
>>> > getting "MISSING BLOCKS" in this /hbase directory, which causes HDFS
>>> > to start up in safe mode. This doesn't happen until I start my
>>> > region servers, so I have a feeling there is some kind of corrupted
>>> > metadata being loaded from these region servers.
>>> >
>>> > Is there a graceful way to wipe the HBase directory clean? Any local
>>> > directories on the region servers / master / ZK server that I should
>>> > be wiping as well?
>>> >
>>> > Cheers,
>>> > Jon
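[Editor's note: the fsck check J-D suggests and the "graceful wipe" Jon asks about can be sketched roughly as below. This is a hedged sketch, not a verified procedure: the `/hbase` root dir matches the thread, but the ZooKeeper invocation and `/tmp/hbase-*` scratch path are assumptions based on common CDH3-era defaults, and these commands require a running cluster.]

```shell
# Map the reported missing/corrupt blocks back to file names, so you can
# tell whether they belong to the old install (J-D's suggestion):
hadoop fsck /hbase -files -blocks -locations

# A cleaner wipe of HBase state (sketch; stop the HBase master and all
# region servers first, or HDFS leases like the saveLeases one above
# can be left behind):

# 1. Remove the HBase root directory in HDFS (0.20-era syntax).
hadoop fs -rmr /hbase

# 2. Delete HBase's znodes in ZooKeeper, otherwise stale region/server
#    state can be replayed on the next start. Host/port are assumptions;
#    older ZK shells lack a recursive delete, so each child may need its
#    own "delete".
zookeeper-client -server zkhost:2181 rmr /hbase

# 3. Clear local scratch dirs on every node (hbase.tmp.dir commonly
#    defaults under /tmp; the glob below is an assumption).
rm -rf /tmp/hbase-*

# 4. Restart HDFS, confirm "hadoop fsck / " reports HEALTHY, then start
#    HBase fresh.
```

With the namenode already failing to save its image, the fsck output is also worth keeping before any deletes, since it records which paths the bad blocks belonged to.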