hbase-user mailing list archives

From Bharath Vissapragada <bhara...@cloudera.com>
Subject Re: Distributed log splitting failing after cluster outage.
Date Mon, 10 Mar 2014 03:30:59 GMT
Glad to know everything is up. We faced this issue too; I'm not really sure
what the exact cause is.
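
For anyone hitting the same problem, here is a minimal sketch of the zero-size WAL check suggested in the reply quoted below. It assumes the `hdfs dfs -ls -R` command is on the PATH and prints its usual eight-column listing (permissions, replication, owner, group, size, date, time, path). The function names and the sample listing are hypothetical and only for illustration. The script only reports candidates; sidelining them is left to the operator.

```python
import subprocess


def zero_sized_files(ls_output):
    """Return paths of regular files whose size column is 0.

    Expects text in the usual `hdfs dfs -ls -R` layout:
    permissions, replication, owner, group, size, date, time, path.
    """
    paths = []
    for line in ls_output.splitlines():
        fields = line.split(None, 7)
        if len(fields) != 8:
            continue  # skip "Found N items" headers and blank lines
        perms, size, path = fields[0], fields[4], fields[7]
        # Directories are also listed with size 0, so keep regular files only.
        if perms.startswith("-") and size == "0":
            paths.append(path)
    return paths


def list_zero_sized_wals(log_dir="/hbase/.logs"):
    """Run the HDFS listing (assumed to be available) and filter it."""
    out = subprocess.run(
        ["hdfs", "dfs", "-ls", "-R", log_dir],
        check=True, capture_output=True, text=True,
    ).stdout
    return zero_sized_files(out)


# Hypothetical listing, for illustration only.
SAMPLE = """\
drwxr-xr-x   - hbase hbase          0 2014-03-06 10:00 /hbase/.logs/host1,60020,1
-rw-r--r--   3 hbase hbase          0 2014-03-06 10:00 /hbase/.logs/host1,60020,1/wal.1
-rw-r--r--   3 hbase hbase       4096 2014-03-06 10:00 /hbase/.logs/host1,60020,1/wal.2
"""

if __name__ == "__main__":
    for path in zero_sized_files(SAMPLE):
        print(path)
```

On a live cluster, `list_zero_sized_wals()` would produce the list of files to inspect before moving anything out of the way.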


On Mon, Mar 10, 2014 at 4:12 AM, David Koch <ogdude@googlemail.com> wrote:

> Actually, all the files were 0-sized, so in the end we deleted those
> files and HBase started up.
>
>
> On Sun, Mar 9, 2014 at 7:33 PM, Bharath Vissapragada
> <bharathv@cloudera.com> wrote:
>
> > Check if there are any 0-sized WALs in /hbase/.logs, sideline them, and
> > restart. That could help. As Ted mentioned, the actual problematic log names
> > are in the RS logs that got the task assigned.
> >
> >
> > On Fri, Mar 7, 2014 at 12:43 AM, David Koch <ogdude@googlemail.com> wrote:
> >
> > > Hello,
> > >
> > > Our HBase cluster had an unexpected shutdown, and while trying to bring it
> > > back up the Master gets stuck with the following message:
> > >
> > > Failed splitting of [ list of <host_name>,<port>,<tmst> ]
> > > java.io.IOException: error or interrupted while splitting logs in [ list of <host_name>,<port>,<tmst> ]
> > > Task = installed = 10 done = 0 error = 10
> > > at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:282)
> > > at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:300)
> > > at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:242)
> > > at org.apache.hadoop.hbase.master.HMaster.splitLogAfterStartup(HMaster.java:661)
> > > at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:580)
> > > at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:396)
> > > at java.lang.Thread.run(Thread.java:724)
> > >
> > > What can I do to get the cluster operational again? There had been no data
> > > ingestion for several hours before the crash, so maybe
> > > clearing out /hbase/.logs/ could be an option.
> > >
> > > Thanks,
> > >
> > > /David
> > >
> >
> >
> >
> > --
> > Bharath Vissapragada
> > <http://www.cloudera.com>
> >
>



-- 
Bharath Vissapragada
<http://www.cloudera.com>
