hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Master down log.
Date Tue, 24 Jul 2012 10:52:03 GMT

My cluster ran into trouble last night and, in the end, all the
servers went down. Hadoop is still running, but HBase is not.

I have no clue what the root cause is. I looked at the logs on the
master side, and the first line when it started to go down was:
2012-07-24 01:20:13,227 INFO
org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer
ephemeral node deleted, processing expiration

And then everything started to die.

At the end, on the master side, I have this in the out file:

hbase@node3:~/hbase-0.94.0$ cat logs/hbase-hbase-master-node3.out
Exception in thread "master-node3,60000,1342789522486"
        at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:749)
        at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:726)
        at org.apache.hadoop.hbase.ServerName.parseVersionedServerName(ServerName.java:276)
        at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:240)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:370)
        at java.lang.Thread.run(Thread.java:722)

I think this one needs to be addressed.

I looked at my ZooKeeper logs and I have one entry every 2 seconds. So
I think something is misconfigured and I will look into it. The goal
of this post is just to report the error above and see if it should
be fixed by adding a null check in the related code.
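
To illustrate the kind of null check I mean, here is a rough standalone
sketch. It is not the HBase code: the class NullSafeVersionRead and the
method readVersionPrefix are hypothetical names made up for this example,
and it assumes the failure comes from a null byte array reaching
Bytes.toShort, which the truncated trace above does not confirm.

```java
import java.nio.ByteBuffer;
import java.util.Optional;

// Hypothetical stand-in, not part of HBase: reads a 2-byte version
// prefix from a byte array, guarding against null or too-short input.
class NullSafeVersionRead {

    static Optional<Short> readVersionPrefix(byte[] data) {
        // Assumption: the data may be null (for example, if it could not
        // be read), so check before touching the array.
        if (data == null || data.length < Short.BYTES) {
            return Optional.empty();
        }
        // ByteBuffer is big-endian by default, so {0, 1} decodes to 1.
        return Optional.of(ByteBuffer.wrap(data).getShort());
    }

    public static void main(String[] args) {
        System.out.println(readVersionPrefix(null).isPresent());        // false
        System.out.println(readVersionPrefix(new byte[] {0, 1}).get()); // 1
    }
}
```

With a guard like this, the caller can decide what to do when there is
nothing to parse (log it and carry on with the shutdown, for instance)
instead of dying with an exception in the master thread.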

