hbase-user mailing list archives

From Himanish Kushary <himan...@gmail.com>
Subject Re: HBase Not Starting after improper shutdown
Date Tue, 24 May 2011 14:19:23 GMT
The Region Server logs also show the same "-ROOT- region is not online" error.

On Mon, May 23, 2011 at 1:10 PM, Bill Graham <billgraham@gmail.com> wrote:

> Is there anything meaningful in the RS logs? I've seen situations like this
> where an RS fails to start due to issues reading the WAL. If this is the
> case, the log would list which WAL is problematic. In my experience that WAL
> is zero-length, so I delete it from HDFS and things start up.
>
>
> On Mon, May 23, 2011 at 9:16 AM, Himanish Kushary <himanish@gmail.com> wrote:
>
> > Both the Master and the hbck command print:
> >
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
> > -ROOT-,,0
> >
> > After the master thread exits due to the Heap Space error, the hbck
> > command throws:
> >
> > org.apache.hadoop.hbase.MasterNotRunningException
> >
> > Is there any way to fix this kind of issue? We are keeping the datanodes
> > up to see whether the under-replicated blocks can be recovered. Does an
> > improper shutdown of the hadoop/hbase services cause this kind of issue?
> > What happens in a disaster recovery situation, and how are those
> > situations handled?
> >
> > Thanks
> >
> >
> > On Mon, May 23, 2011 at 11:36 AM, Stack <stack@duboce.net> wrote:
> >
> > > What does hbase hbck say?  (http://hbase.apache.org/book.html#hbck).
> > >
> > > What does the master log have in it?  Anything of interest?
> > >
> > > St.Ack
> > >
> > > On Mon, May 23, 2011 at 7:53 AM, Himanish Kushary <himanish@gmail.com>
> > > wrote:
> > > > Pressed the send button too soon...
> > > >
> > > > Also here is the output from hadoop fsck
> > > >
> > > > Status: HEALTHY
> > > >  Total size: 37678848280 B
> > > >  Total dirs: 941
> > > >  Total files: 902 (Files currently being written: 1)
> > > >  Total blocks (validated): 1141 (avg. block size 33022654 B)
> > > >    (Total open file blocks (not validated): 1)
> > > >  Minimally replicated blocks: 1141 (100.0 %)
> > > >  Over-replicated blocks: 0 (0.0 %)
> > > >  Under-replicated blocks: 906 (79.40403 %)
> > > >  Mis-replicated blocks: 0 (0.0 %)
> > > >  Default replication factor: 2
> > > >  Average block replication: 2.0
> > > >  Corrupt blocks: 0
> > > >  Missing replicas: 1886 (82.646805 %)
> > > >  Number of data-nodes: 2
> > > >  Number of racks: 1
> > > > FSCK ended at Mon May 23 10:51:13 EDT 2011 in 257 milliseconds
> > > >
> > > > The filesystem under path '/' is HEALTHY
> > > >
> > > >
> > > > Could anybody please help with how to recover from this scenario?
> > > >
> > > > Thanks
> > > >
> > > >
> > > > On Mon, May 23, 2011 at 10:50 AM, Himanish Kushary
> > > > <himanish@gmail.com> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > > >> Our hbase/hadoop server machines were shut down without bringing
> > > > >> the hadoop and hbase services down properly. Now when we try to
> > > > >> bring up hbase we get the following error in the master log:
> > > >>
> > > > >> org.apache.hadoop.hbase.NotServingRegionException: Region is not
> > > > >> online: -ROOT-,,0
> > > >>
> > > > >> Hadoop services (namenode, jobtracker, datanode, etc.) have come
> > > > >> up properly and we are able to see the files in HDFS. But the
> > > > >> HBase Master keeps throwing this exception and then finally
> > > > >> throws a Java Heap Space error.
> > > >>
> > > > >> Note: We have two datanodes, replication set to 2, and around 900
> > > > >> blocks are shown as under-replicated.
> > > >>
> > > >> ---------------------------------
> > > >> Thanks & Regards
> > > >> Himanish
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks & Regards
> > > > Himanish
> > > >
> > >
> >
> >
> >
> > --
> > Thanks & Regards
> > Himanish
> >
>
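
For reference, the percentages in the fsck report quoted above can be reproduced from its raw counters. This is a minimal Python sketch and not part of the original thread. The helper name parse_counters is made up for illustration, and reading the missing-replica percentage as relative to existing replicas (total blocks times average replication) is an assumption that happens to fit the quoted numbers.

```python
import re

# Counters copied from the fsck report quoted in this thread.
FSCK_REPORT = """\
Total blocks (validated): 1141 (avg. block size 33022654 B)
Under-replicated blocks: 906 (79.40403 %)
Average block replication: 2.0
Missing replicas: 1886 (82.646805 %)
"""


def parse_counters(report):
    """Map each 'label: number' line to the first number after the colon."""
    counters = {}
    for line in report.splitlines():
        match = re.match(r"\s*([^:]+):\s+([0-9.]+)", line)
        if match:
            counters[match.group(1).strip()] = float(match.group(2))
    return counters


counters = parse_counters(FSCK_REPORT)
total_blocks = counters["Total blocks (validated)"]

# Share of blocks that have fewer replicas than their files ask for.
under_pct = 100.0 * counters["Under-replicated blocks"] / total_blocks

# Assumption: the report divides missing replicas by replicas that exist.
existing_replicas = total_blocks * counters["Average block replication"]
missing_pct = 100.0 * counters["Missing replicas"] / existing_replicas

print(f"under-replicated blocks: {under_pct:.3f} %")
print(f"missing replicas:        {missing_pct:.3f} %")
```

Both computed values agree with the percentages in the report to within rounding, which suggests the counters in the quoted output are internally consistent.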



-- 
Thanks & Regards
Himanish
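
Bill's suggestion quoted above (look for a zero-length WAL) can be checked against a directory listing. This is a minimal Python sketch and not from the thread. It assumes the eight-column file rows printed by `hadoop fs -ls` (permissions, replication, owner, group, size, date, time, path). The helper name zero_length_files, the sample listing, and every path in it are hypothetical; where the WALs actually live on a given cluster is not stated in the thread.

```python
def zero_length_files(listing):
    """Return the paths of zero-byte files in `hadoop fs -ls`-style output."""
    paths = []
    for line in listing.splitlines():
        # perms, replication, owner, group, size, date, time, path
        cols = line.split(None, 7)
        if len(cols) != 8:
            continue  # header such as "Found 3 items", or a blank line
        if cols[0].startswith("d"):
            continue  # directory row, not a file
        if cols[4] == "0":
            paths.append(cols[7])
    return paths


# Hypothetical listing, for illustration only (not taken from the thread).
SAMPLE_LISTING = """\
Found 3 items
drwxr-xr-x   - hbase supergroup          0 2011-05-23 09:00 /example/wals/old
-rw-r--r--   2 hbase supergroup          0 2011-05-23 10:12 /example/wals/wal.2
-rw-r--r--   2 hbase supergroup   12345678 2011-05-23 09:58 /example/wals/wal.1
"""

suspects = zero_length_files(SAMPLE_LISTING)
for path in suspects:
    print("zero-length file:", path)
```

The sketch only reports candidates. Whether a given file is safe to delete is a judgment call, and moving it aside first keeps the option of restoring it.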
