hbase-user mailing list archives

From dgoldenberg123 <dgoldenberg...@gmail.com>
Subject Error "Primary master encountered unexpected exception while trying to recover from ZooKeeper session expiry"
Date Thu, 16 Jul 2015 14:35:05 GMT
Could someone elaborate on what this error means?

HBase (0.98.9, built for Hadoop 2) periodically shuts down while we insert
records into it under load, and the stack trace below appears to show the
cause.

The message is logged by abortNow() in HMaster.java, quoted here. What does
this error imply, and is there a way to fix it or work around it?

  private boolean abortNow(final String msg, final Throwable t) {
    if (!this.isActiveMaster) {
      return true;
    }
    if (t != null && t instanceof KeeperException.SessionExpiredException) {
      try {
        LOG.info("Primary Master trying to recover from ZooKeeper session " +
            "expiry.");
        return !tryRecoveringExpiredZKSession();
      } catch (Throwable newT) {
        LOG.error("Primary master encountered unexpected exception while " +
            "trying to recover from ZooKeeper session" +
            " expiry. Proceeding with server abort.", newT);
      }
    }
    return true;
  }


Is https://issues.apache.org/jira/browse/HBASE-4479 related at all (marked
fixed as of 0.92.0)?

Any insight would be greatly appreciated.

ERROR main-EventThread master.HMaster: Primary master encountered unexpected exception while trying to recover from ZooKeeper session expiry. Proceeding with server abort.
java.util.concurrent.ExecutionException: java.io.IOException: error or interrupted while splitting logs in hdfs://acme-server.com:9000/tmp/hbase-root/hbase/WALs/acme-server,60088,1436822380393-splitting Task = installed = 1 done = 0 error = 1
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hbase.master.HMaster.tryRecoveringExpiredZKSession(HMaster.java:2498)
    at org.apache.hadoop.hbase.master.HMaster.abortNow(HMaster.java:2526)
    at org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:2431)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:403)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:321)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: java.io.IOException: error or interrupted while splitting logs in hdfs://acme-server.acme.com:9000/tmp/hbase-root/hbase/WALs/acme-server,60088,1436822380393-splitting Task = installed = 1 done = 0 error = 1
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:359)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:416)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:308)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:299)
    at org.apache.hadoop.hbase.master.HMaster.splitMetaLogBeforeAssignment(HMaster.java:1178)
    at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:1113)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:978)
    at org.apache.hadoop.hbase.master.HMaster.access$300(HMaster.java:286)
    at org.apache.hadoop.hbase.master.HMaster$3.call(HMaster.java:2482)
    at org.apache.hadoop.hbase.master.HMaster$3.call(HMaster.java:2470)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2015-07-14 17:51:54,433 FATAL main-EventThread master.HMaster: master:57118-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000, quorum=localhost:2181, baseZNode=/hbase master:57118-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:403)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:321)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2015-07-14 17:51:54,433 INFO main-EventThread master.HMaster: Aborting
2015-07-14 17:51:54,433 INFO main-EventThread zookeeper.ClientCnxn: EventThread shut down
2015-07-14 17:51:54,434 INFO acme-server,57118,1436822379834-BalancerChore balancer.BalancerChore: acme-server,57118,1436822379834-BalancerChore exiting
2015-07-14 17:51:54,435 INFO acme-server,57118,1436822379834-ClusterStatusChore balancer.ClusterStatusChore: acme-server,57118,1436822379834-ClusterStatusChore exiting
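
My tentative reading of the trace, in case it helps anyone answer: the
recovery work seems to run as a Callable (HMaster$3 in the trace) whose result
is fetched with FutureTask.get(). When that callable throws the IOException
from log splitting, get() rethrows it wrapped in an ExecutionException, which
the catch (Throwable newT) block in abortNow() logs before falling through to
"return true" (abort). The sketch below only reproduces that control flow in
plain Java. RecoverySketch, failingRecovery and this simplified abortNow are
hypothetical names of mine, not HBase code, and the real
tryRecoveringExpiredZKSession() may well do more than this.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RecoverySketch {

    // Hypothetical stand-in for the recovery callable; it fails the way
    // the log-splitting step fails in the trace.
    static Callable<Boolean> failingRecovery() {
        return () -> {
            throw new IOException("error or interrupted while splitting logs");
        };
    }

    // Simplified shape of abortNow(): returns true when the server should abort.
    static boolean abortNow(Callable<Boolean> recovery) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> result = executor.submit(recovery);
            // get() rethrows a failure inside the callable as an
            // ExecutionException whose cause is the original exception.
            return !result.get();
        } catch (Throwable newT) {
            System.err.println("recovery failed: " + newT
                + " (cause: " + newT.getCause() + ")");
        } finally {
            executor.shutdownNow();
        }
        return true; // any failure during recovery falls through to abort
    }

    public static void main(String[] args) {
        System.out.println(abortNow(failingRecovery())); // true: abort
        System.out.println(abortNow(() -> true));        // false: recovered
    }
}
```

If that reading is right, the session expiry only triggers the recovery
attempt, and it is the failed WAL split during recovery that turns it into a
master abort.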


