Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
Received-SPF: pass (athena.apache.org: domain of bharathv@cloudera.com
 designates 74.125.82.169 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAE24rAd8a7EdsaEcm7REqk5noYSN4pS-ZhzzCUz22kQYxs8B-w@mail.gmail.com>
References: 
 <CAE24rAdx=EkHWbV+kBvwwPv+udnJE9DZcxehfJHW5VY91pxH6Q@mail.gmail.com>
 <CAFWiQHbTVk0Z2MwS=R2SjNz461pmY62nH-6EsP6VC=8PZvVcgg@mail.gmail.com>
 <CAE24rAd8a7EdsaEcm7REqk5noYSN4pS-ZhzzCUz22kQYxs8B-w@mail.gmail.com>
From: Bharath Vissapragada <bharathv@cloudera.com>
Date: Mon, 10 Mar 2014 09:00:59 +0530
Message-ID: 
 <CAFWiQHa2roVSneEBwh_O3R8rAg=LqwHYAbi=vdSmuWHYK7AjNw@mail.gmail.com>
Subject: Re: Distributed log splitting failing after cluster outage.
To: user@hbase.apache.org
Content-Type: multipart/alternative; boundary=f46d043c801eb4669704f4383c2c

--f46d043c801eb4669704f4383c2c
Content-Type: text/plain; charset=ISO-8859-1

Glad to know everything is up. We faced this issue too; I'm not really sure
what the exact cause is.
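
In case it helps anyone else who lands on this thread, here is a minimal
sketch of the 0-sized WAL check I suggested earlier (quoted below). It is not
an official HBase tool. It assumes the usual 8-column output of
`hdfs dfs -ls -R /hbase/.logs` (permissions, replication, owner, group, size,
date, time, path), and the helper names and the sideline directory are made up
for illustration. It only builds the `hdfs dfs -mv` commands; it does not run
them.

```python
import shlex


def zero_sized_files(ls_output):
    """Return paths of zero-length files found in `hdfs dfs -ls -R` output.

    Assumes 8 whitespace-separated columns per entry:
    perms, replication, owner, group, size, date, time, path.
    """
    paths = []
    for line in ls_output.splitlines():
        fields = line.split(None, 7)
        # Skip "Found N items" lines, blank lines, and directories.
        if len(fields) != 8 or fields[0].startswith("d"):
            continue
        if fields[4] == "0":
            paths.append(fields[7])
    return paths


def sideline_commands(paths, sideline_dir):
    """Build (but do not run) commands that move the files out of the way.

    `sideline_dir` is a hypothetical location; pick one outside /hbase/.logs.
    `-mkdir -p` assumes a Hadoop 2 style shell.
    """
    cmds = ["hdfs dfs -mkdir -p " + shlex.quote(sideline_dir)]
    for path in paths:
        cmds.append(
            "hdfs dfs -mv %s %s" % (shlex.quote(path), shlex.quote(sideline_dir))
        )
    return cmds
```

Feed it the saved listing, print the commands, and review them before running
anything. Moving the files (rather than deleting them) keeps them around in
case they turn out to matter.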


On Mon, Mar 10, 2014 at 4:12 AM, David Koch <ogdude@googlemail.com> wrote:

> Actually, all the files were 0-sized, so in the end we deleted those
> files and HBase started up.
>
>
> On Sun, Mar 9, 2014 at 7:33 PM, Bharath Vissapragada
> <bharathv@cloudera.com> wrote:
>
> > Check if there are any 0-sized WALs in /hbase/.logs, sideline them, and
> > restart. That could help. As Ted mentioned, the actual problematic log
> > names are in the logs of the RS that got the task assigned.
> >
> >
> > On Fri, Mar 7, 2014 at 12:43 AM, David Koch <ogdude@googlemail.com> wrote:
> >
> > > Hello,
> > >
> > > Our HBase cluster had an unexpected shutdown, and while trying to bring
> > > it back up the Master gets stuck with the following message:
> > >
> > > Failed splitting of [ list of <host_name>,<port>,<tmst> ]
> > > java.io.IOException: error or interrupted while splitting logs in [ list of <host_name>,<port>,<tmst> ]
> > > Task = installed = 10 done = 0 error = 10
> > > at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:282)
> > > at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:300)
> > > at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:242)
> > > at org.apache.hadoop.hbase.master.HMaster.splitLogAfterStartup(HMaster.java:661)
> > > at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:580)
> > > at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:396)
> > > at java.lang.Thread.run(Thread.java:724)
> > >
> > > What can I do to get the cluster operational again? There had been no
> > > data ingestion for several hours before the crash, so maybe clearing
> > > out /hbase/.logs/ could be an option.
> > >
> > > Thanks,
> > >
> > > /David
> > >
> >
> >
> >
> > --
> > Bharath Vissapragada
> > <http://www.cloudera.com>
> >
>


-- 
Bharath Vissapragada
<http://www.cloudera.com>

--f46d043c801eb4669704f4383c2c--