hbase-user mailing list archives

From Time Less <timelessn...@gmail.com>
Subject Root Region Not Online after rolling RS Restart
Date Fri, 15 Mar 2013 01:19:48 GMT
We have a 15-node HBase cluster with the RS on the same nodes as the HDFS
DNs. We do a full restart of HBase[1]. Sometimes this works, but sometimes
several of the RS show this in their logs:

regionserverHostname: 2013-03-12 16:48:03,396 DEBUG
locateRegionInMeta parentTable=-ROOT-,
metaLocation={region=-ROOT-,,0.70236052, hostname=hbaseMasterHostname,
port=60020}, attempt=25 of 100 failed; retrying after sleep of 32000
because: org.apache.hadoop.hbase.NotServingRegionException: Region is not
online: -ROOT-,,0
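
For anyone who wants to track how far the retry loop has progressed across
many RS logs, here is a minimal Python sketch that pulls the counter out of
a line like the one above. The regular expression and the function name
parse_retry are my own and only cover the message format quoted here; other
HBase versions may word it differently.

```python
import re

# Matches the retry message quoted above; group names are our own labels.
RETRY_RE = re.compile(
    r"attempt=(?P<attempt>\d+) of (?P<max_attempts>\d+) failed; "
    r"retrying after sleep of (?P<sleep>\d+)"
)


def parse_retry(line):
    """Return (attempt, max_attempts, sleep) as ints, or None if no match."""
    # Pasted logs are often wrapped by the mail client, so collapse all
    # whitespace runs into single spaces before matching.
    flat = " ".join(line.split())
    m = RETRY_RE.search(flat)
    if m is None:
        return None
    return int(m["attempt"]), int(m["max_attempts"]), int(m["sleep"])


sample = """locateRegionInMeta parentTable=-ROOT-,
metaLocation={region=-ROOT-,,0.70236052, hostname=hbaseMasterHostname,
port=60020}, attempt=25 of 100 failed; retrying after sleep of 32000
because: org.apache.hadoop.hbase.NotServingRegionException: Region is not
online: -ROOT-,,0"""

print(parse_retry(sample))
```

The sleep value is returned as the raw number from the log; its unit is not
stated in the message itself.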

Meanwhile, the HMaster fails to find the -ROOT- region[2] and stalls
during startup.

The attempt counter in the log line above keeps incrementing until it
reaches 100 of 100, then resets to attempt 1 of 100. This repeats
indefinitely until we delete the stale ZK entry /hbase/root-region-server.
As soon as we do, all RS get back to normal, the HBase Master comes up, and
life is good.
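
The cleanup step can be scripted. Below is a rough Python sketch, not the
procedure we actually ran. It assumes a ZooKeeper client object that offers
exists(path), get(path) and delete(path), which is the shape of kazoo's
KazooClient. The function name clear_stale_root is hypothetical; only the
znode path comes from this message.

```python
ROOT_ZNODE = "/hbase/root-region-server"  # path quoted in the message


def clear_stale_root(zk, path=ROOT_ZNODE):
    """Delete the root-location znode if present; return True if deleted.

    `zk` is any object exposing exists(path), get(path) and delete(path).
    Assumption: this mirrors kazoo's KazooClient, where exists() returns
    None for a missing node and get() returns a (data, stat) pair.
    """
    if zk.exists(path) is None:
        return False  # nothing to clean up
    data, _stat = zk.get(path)
    # Print the old value so the stale server location is on record.
    print(f"removing {path}, last value: {data!r}")
    zk.delete(path)
    return True
```

With kazoo installed, usage would look roughly like
zk = KazooClient(hosts="zkhost:2181"); zk.start(); clear_stale_root(zk);
zk.stop(), where "zkhost:2181" is a placeholder for a real quorum address.
This only makes sense when the entry is known to be stale, as in the
situation described above.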

I searched JIRA and the mailing lists but found nothing that looked like a
precise match. Has anyone else run into this?

HBase version: 0.92.1 (CDH4).

[1] Stop Thrift. Stop HBase Master. Stop all RS. Stop ZooKeeper. Start
everything again in the reverse order.
[2] I forget the precise wording from the HBase web UI. I will note it the
next time this happens if it's important, but it seems rather generic.

*Tim Ellis: *Fifth Sigma, Inc. Multimedia and Technology++
