hbase-user mailing list archives

From MiMills <mikecmi...@hotmail.com>
Subject Re: Corrupted META?
Date Thu, 02 Jul 2015 18:12:28 GMT
We got it back up and running late last night with no loss of data.

We did a few things, and I'll try to list them here for others. Some items
may not have helped, but we did them while trying to resolve the issue,
and they caused no harm.

* From Samir's list:
1. Shut down your cluster (master and regionservers).
2. Clean your zk data (if you have a separate zk cluster, you can execute
"hbase zkcli" and then in the shell "rmr /hbase"; if hbase manages zk, then you
will need to clear the zk data dir).
3. Run hdfs fsck /hbase to confirm that the hdfs data is not corrupted.
4. Start only the hbase master and watch the logs for errors.
5. If the master starts successfully, start the regionservers one by one.
6. If all regionservers started correctly, run hbase hbck to check for
inconsistencies; if any are reported, try running hbase hbck -fix (or
-repair).
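
The non-destructive steps in the list above could be wrapped roughly as follows. This is only a sketch: the HBASE_BIN path, the DRY_RUN switch, and the function names are assumptions made for illustration, not something from our setup. With DRY_RUN=1 (the default) each function only prints the command it would run.

```shell
#!/bin/sh
# Sketch of the restart sequence from the list above.
# Assumptions: HBASE_BIN points at the HBase bin directory, and DRY_RUN=1
# (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
HBASE_BIN="${HBASE_BIN:-./bin}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: stop the master and regionservers.
stop_cluster() { run "$HBASE_BIN/stop-hbase.sh"; }

# Step 3: confirm that the HDFS data under /hbase is healthy.
check_hdfs() { run hdfs fsck /hbase; }

# Step 4: start only the master on this host, then watch its log.
start_master() { run "$HBASE_BIN/hbase-daemon.sh" start master; }

# Step 5: run on each regionserver host, one host at a time.
start_regionserver() { run "$HBASE_BIN/hbase-daemon.sh" start regionserver; }

# Step 6: report inconsistencies without changing anything.
check_consistency() { run "$HBASE_BIN/hbase" hbck; }
```

Step 2 (clearing the zk data) is deliberately left out of the sketch; it is destructive and is easier to do by hand in "hbase zkcli" as described in the list.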

* However, hbase hbck -repair (and other similar arguments) did not work
because the meta region was stuck in transition; we got errors when we
ran it. But the steps above did get the region servers up and running. Sometimes
we had to run "./hbase-daemon.sh start regionserver" twice to get one up and
running.
* We then used the hbase shell "assign" command to move the hbase meta
region out of transition: assign 'region_name'
* We then ran "./bin/hbase hbck -fixAssignments" to get hbase to recognize
our existing regions. We did this while the cluster was not accepting any
requests.
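
For reference, the last two steps could be scripted along these lines. This is a sketch under assumptions: HBASE_BIN, DRY_RUN, and the function names are made up for illustration, and the region name 'hbase:meta,,1' in the usage line is taken from the Master log entries in this message rather than from the exact command we typed. With DRY_RUN=1 (the default) nothing is executed; the commands are only printed.

```shell
#!/bin/sh
# Assumptions: HBASE_BIN is the HBase bin directory; DRY_RUN=1 only prints.
DRY_RUN="${DRY_RUN:-1}"
HBASE_BIN="${HBASE_BIN:-./bin}"

# Pipe a single "assign" command into the HBase shell for the given region.
assign_region() {
  region="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ echo \"assign '$region'\" | $HBASE_BIN/hbase shell"
  else
    echo "assign '$region'" | "$HBASE_BIN/hbase" shell
  fi
}

# Ask hbck to repair region assignments only.
fix_assignments() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $HBASE_BIN/hbase hbck -fixAssignments"
  else
    "$HBASE_BIN/hbase" hbck -fixAssignments
  fi
}
```

Usage would be something like: assign_region 'hbase:meta,,1' followed by fix_assignments, with the cluster not serving requests, as described above.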

I think the region servers would not start and were being marked as dead
during startup because there was no meta region. The meta region couldn't be
moved because there were no running region servers. So it was a catch-22.
Doing the above at least got the region servers running so that the meta
region could be moved out of transition.

Here are the relevant Master log entries:
2015-07-01 04:39:34,480 WARN  [MASTER_META_SERVER_OPERATIONS-master:60000-0]
master.AssignmentManager: Can't move 1588230740, there is no destination
server available.
2015-07-01 04:39:34,480 WARN  [MASTER_META_SERVER_OPERATIONS-master:60000-0]
master.AssignmentManager: Unable to determine a plan to assign {ENCODED =>
1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
2015-07-01 04:39:35,480 WARN  [MASTER_META_SERVER_OPERATIONS-master:60000-0]
master.AssignmentManager: Can't move 1588230740, there is no destination
server available.
2015-07-01 04:39:35,481 WARN  [MASTER_META_SERVER_OPERATIONS-master:60000-0]
master.AssignmentManager: Unable to determine a plan to assign {ENCODED =>
1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
2015-07-01 04:39:36,379 ERROR [RpcServer.handler=6,port=60000]
master.HMaster: Region server server2.corp.gs.com,60020,1435743503791
reported a fatal error:
ABORTING region server server1.corp.gs.com,60020,1435743483790:
org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
currently processing server1.corp.gs.com,60020,1435743483790 as dead server
	at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:339)
	at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:254)
	at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1343)
	at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:5087)
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)

Cause:
org.apache.hadoop.hbase.YouAreDeadException:
org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
currently processing server1.corp.gs.com,60020,1435743483790 as dead server
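
If you want to check whether your own Master log shows the same pattern, something like the following may help. It is a rough sketch: the function name and output labels are made up, the search strings are copied from the entries above, and 1588230740 is the encoded name of the meta region as it appears in those entries. Pass it the path to your Master log file.

```shell
#!/bin/sh
# Count the two symptoms seen in the Master log above:
#  - the master failing to move the meta region (1588230740)
#  - regionserver reports rejected with YouAreDeadException
scan_master_log() {
  log_file="$1"
  # grep -c prints the number of matching lines (0 if there are none).
  meta_failures=$(grep -c "Can't move 1588230740" "$log_file")
  dead_rejections=$(grep -c "YouAreDeadException" "$log_file")
  echo "meta_move_failures=$meta_failures"
  echo "dead_server_rejections=$dead_rejections"
}
```

Non-zero counts for both lines would be consistent with the catch-22 described above, although they do not prove it on their own.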

Thanks for your help!




--
View this message in context: http://apache-hbase.679495.n3.nabble.com/Corrupted-META-tp4072787p4072863.html
Sent from the HBase User mailing list archive at Nabble.com.
