Date: Mon, 4 Jul 2011 10:09:24 -0700
Subject: Re: hbck -fix
From: Stack
To: user@hbase.apache.org
Cc: Andrew Purtell

On Sun, Jul 3, 2011 at 10:12 AM, Wayne wrote:
> HBase needs to evolve a little more before organizations
> like ours can just "use it" without having to become experts.

I'd agree with this. In its current state, at least a part-time, seasoned
operations engineer (per Andrew's description) is necessary for a substantial
production deploy. I don't think that is an onerous expectation for a critical
piece of infrastructure. It'd certainly broaden our appeal, though, if we could
get into the MySQL calibre of ease of use....

That said, the issue you ran into, where an 'incident' made it so a 'smart'
fellow was unable to reconstitute his store, needs addressing. We'll work on
this.

St.Ack

> I have to say the community behind HBase is fantastic and goes above and
> beyond to help greenies like ourselves be successful. With just a little
> more polish around the edges I think it can and will really
> become successful for a much wider audience. Thanks for everyone's help.
>
>
> On Sun, Jul 3, 2011 at 4:08 AM, Andrew Purtell wrote:
>
>> I shorthanded this a bit:
>>
>> > Certainly a seasoned operations engineer would be a good investment for
>> > anyone.
>>
>> Let's try instead:
>>
>> Certainly a seasoned operations engineer [with Java experience] would be a
>> good investment for anyone [running Hadoop based systems].
>>
>> I'm not sure what I wrote earlier adequately conveyed the thought.
>>
>>   - Andy
>>
>>
>> > From: Andrew Purtell
>> > To: "user@hbase.apache.org"
>> > Cc:
>> > Sent: Sunday, July 3, 2011 12:39 AM
>> > Subject: Re: hbck -fix
>> >
>> > Wayne,
>> >
>> > Did you by chance have your NameNode configured to write the edit log to
>> > only one disk, and in this case only the root volume of the NameNode host?
>> > As I'm sure you are now aware, the NameNode's edit log was corrupted, at
>> > least the tail of it anyway, when the volume upon which it was being
>> > written was filled by an errant process. The HDFS NameNode has a special
>> > critical role and it really must be treated with the utmost care. It can
>> > and should be configured to write the fsimage and edit log to multiple
>> > local dedicated disks. And, user processes should never run on it.
>> >
>> >
>> >>  Hope has long since flown out the window. I just changed my opinion of
>> >>  what it takes to manage hbase. A Java engineer is required on staff.
>> >
>> > Perhaps.
>> >
>> > Certainly a seasoned operations engineer would be a good investment for
>> > anyone.
>> >
>> >>  Having RF=3 in HDFS offers no insurance against hbase losing its shirt
>> >>  and having .META. getting corrupted.
>> >
>> > This is a valid point. If HDFS loses track of blocks containing META table
>> > data due to fsimage corruption on the NameNode, having those blocks on 3
>> > DataNodes is of no use.
>> >
>> >
>> > I've done exercises in the past like delete META on disk and recreate it
>> > with the earlier set of utilities (add_table.rb). This always "worked for
>> > me" when I've tried it.
>> >
>> >
>> > Results from torture tests that HBase was subjected to in the timeframe
>> > leading up to 0.90 also resulted in better handling of .META. table related
>> > errors. They are fortunately demonstrably now rare.
>> >
>> >
>> > Clearly however there is room for further improvement here.
>> > I will work on https://issues.apache.org/jira/browse/HBASE-4058 and
>> > hopefully produce a unit test that fully exercises the ability of HBCK to
>> > reconstitute META and gives reliable results that can be incorporated into
>> > the test suite. My concern here is getting repeatable results demonstrating
>> > HBCK weaknesses will be challenging.
>> >
>> >
>> > Best regards,
>> >
>> >
>> >        - Andy
>> >
>> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > (via Tom White)
>> >
>> >
>> > ----- Original Message -----
>> >>  From: Wayne
>> >>  To: user@hbase.apache.org
>> >>  Cc:
>> >>  Sent: Saturday, July 2, 2011 9:55 AM
>> >>  Subject: Re: hbck -fix
>> >>
>> >>  It just returns a ton of errors (import: command not found). Our cluster
>> >>  is hosed anyway. I am waiting to get it completely re-installed from
>> >>  scratch. Hope has long since flown out the window. I just changed my
>> >>  opinion of what it takes to manage hbase. A Java engineer is required on
>> >>  staff. I also realized now a backup strategy is more important than for a
>> >>  RDBMS. Having RF=3 in HDFS offers no insurance against hbase losing its
>> >>  shirt and having .META. getting corrupted. I think I just found the
>> >>  Achilles heel.
>> >>
>> >>
>> >>  On Sat, Jul 2, 2011 at 12:40 PM, Ted Yu wrote:
>> >>
>> >>>   Have you tried running check_meta.rb with --fix ?
>> >>>
>> >>>   On Sat, Jul 2, 2011 at 9:19 AM, Wayne wrote:
>> >>>
>> >>>   > We are running 0.90.3. We were testing the table export not realizing
>> >>>   > the data goes to the root drive and not HDFS. The export filled the
>> >>>   > master's root partition. The logger had issues and HDFS got corrupted
>> >>>   > ("java.io.IOException: Incorrect data format. logVersion is -18 but
>> >>>   > writables.length is 0").
>> >>>   > We had to run hadoop fsck -move to fix the corrupted hdfs files. We were
>> >>>   > able to get hdfs running without issues but hbase ended up with the
>> >>>   > region issues.
>> >>>   >
>> >>>   > We also had another issue making it worse with Ganglia. We had moved the
>> >>>   > Ganglia host to the master server and Ganglia took up so many resources
>> >>>   > that it actually caused timeouts talking to the master and most nodes
>> >>>   > ended up shutting down. I guess Ganglia is a pig in terms of resources...
>> >>>   >
>> >>>   > I just tried to manually edit the .META. table removing the remnants of
>> >>>   > the old table but the shell went haywire on me and turned to control
>> >>>   > characters..??...I ended up corrupting the whole thing and had to delete
>> >>>   > all tables...we have just not had a good week.
>> >>>   >
>> >>>   > I will add comments to HBASE-3695 in terms of suggestions.
>> >>>   >
>> >>>   > Thanks.
>> >>>   >
>> >>>   > On Fri, Jul 1, 2011 at 4:55 PM, Stack wrote:
>> >>>   >
>> >>>   > > What version of hbase are you on Wayne?
>> >>>   > >
>> >>>   > > On Fri, Jul 1, 2011 at 8:32 AM, Wayne wrote:
>> >>>   > > > I ran the hbck command and found 14 inconsistencies. There were files
>> >>>   > > > in hdfs not used for region
>> >>>   > >
>> >>>   > > These are usually harmless.  Bad accounting on our part. Need to plug
>> >>>   > > the hole.
>> >>>   > >
>> >>>   > > >, regions with the same start key, a hole in the
>> >>>   > > > region chain, and a missing start region with an empty key.
>> >>>   > >
>> >>>   > > These are pretty serious.
>> >>>   > >
>> >>>   > > How'd the master running out of root partition do this?
>> >>>   > > I'd be interested to know.
>> >>>   > >
>> >>>   > > > We are not in production so we have the luxury to start again, but the
>> >>>   > > > damage to our confidence is severe. Is there work going on to improve
>> >>>   > > > hbck -fix to actually be able to resolve these types of issues? Do we
>> >>>   > > > need to expect to run a production hbase cluster to be able to move
>> >>>   > > > around and rebuild the region definitions and the .META. table by hand?
>> >>>   > > > Things just got a lot scarier fast for us, especially since we were
>> >>>   > > > hoping to go into production next month. Running out of disk space on
>> >>>   > > > the master's root partition can bring down the entire cluster? This is
>> >>>   > > > scary...
>> >>>   > > >
>> >>>   > >
>> >>>   > > Understood.
>> >>>   > >
>> >>>   > > St.Ack
>> >>>   > >
>> >>>   >
>> >>>
>> >>
>> >----- Original Message -----
>> >