impala-commits mailing list archives

From tarmstr...@apache.org
Subject [6/7] incubator-impala git commit: IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop
Date Tue, 13 Jun 2017 23:17:33 GMT
IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

Occasionally we'd see HBase fail to start up properly on CentOS 7
clusters. The symptom was that HBase would not open the Zookeeper nodes
that signal its readiness.

As a workaround, this change moves waiting for the Zookeeper nodes
into the retry logic.

Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Reviewed-on: http://gerrit.cloudera.org:8080/7159
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: anujphadke <aphadke@cloudera.com>
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Lars Volker <lv@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/467ccd19
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/467ccd19
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/467ccd19

Branch: refs/heads/master
Commit: 467ccd19508eca0733cb061497a3c2ceca3ea849
Parents: 7a0ee68
Author: Lars Volker <lv@cloudera.com>
Authored: Mon Jun 12 15:46:25 2017 -0700
Committer: Lars Volker <lv@cloudera.com>
Committed: Tue Jun 13 05:57:49 2017 +0000

----------------------------------------------------------------------
 testdata/bin/run-hbase.sh | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/467ccd19/testdata/bin/run-hbase.sh
----------------------------------------------------------------------
diff --git a/testdata/bin/run-hbase.sh b/testdata/bin/run-hbase.sh
index 2a51105..f264b65 100755
--- a/testdata/bin/run-hbase.sh
+++ b/testdata/bin/run-hbase.sh
@@ -111,20 +111,27 @@ for ((i=1; i <= HBASE_START_RETRY_ATTEMPTS; ++i)); do
     if ! ${HBASE_HOME}/bin/start-hbase.sh 2>&1 | tee -a ${HBASE_LOGDIR}/hbase-startup.out
     then
       echo "HBase Master startup failed"
-    elif ! ${HBASE_HOME}/bin/local-regionservers.sh start 2 3 2>&1 | \
+      continue
+    fi
+    if ! ${HBASE_HOME}/bin/local-regionservers.sh start 2 3 2>&1 | \
         tee -a ${HBASE_LOGDIR}/hbase-rs-startup.out
     then
       echo "HBase regionserver startup failed"
-    else
-      break
+      continue
+    fi
+    if ! ${CLUSTER_BIN}/check-hbase-nodes.py; then
+      echo "HBase nodes did not come online"
+      continue
     fi
+    # If we made it to here, HBase started up correctly so we can stop the retry logic.
+    break
   else
     # In the last iteration, it's fine for errexit to do its thing.
     ${HBASE_HOME}/bin/start-hbase.sh 2>&1 | tee -a ${HBASE_LOGDIR}/hbase-startup.out
     ${HBASE_HOME}/bin/local-regionservers.sh start 2 3 2>&1 | \
         tee -a ${HBASE_LOGDIR}/hbase-rs-startup.out
+    ${CLUSTER_BIN}/check-hbase-nodes.py
   fi
 
 done
-${CLUSTER_BIN}/check-hbase-nodes.py
 echo "HBase startup scripts succeeded"
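
For readers skimming the diff, the control flow it introduces can be sketched in isolation. This is a minimal illustration, not the code from run-hbase.sh: start_master, start_regionservers, check_nodes, MAX_ATTEMPTS and NODES_READY_ON_ATTEMPT are hypothetical stand-ins, and the special handling of the last iteration (where the real script lets errexit abort) is left out. Each step that fails skips to the next attempt with `continue`, and `break` is reached only when every step, including the node check, has passed.

```shell
#!/bin/bash
# Sketch of the retry pattern only. All names below are hypothetical
# stand-ins and do not appear in the actual commit.
set -u

MAX_ATTEMPTS=3            # stands in for HBASE_START_RETRY_ATTEMPTS
NODES_READY_ON_ATTEMPT=2  # simulate the readiness nodes appearing on attempt 2

start_master() { return 0; }         # stand-in for start-hbase.sh
start_regionservers() { return 0; }  # stand-in for local-regionservers.sh
# Stand-in for check-hbase-nodes.py: succeeds from the configured attempt on.
check_nodes() { [ "$1" -ge "$NODES_READY_ON_ATTEMPT" ]; }

started=0
for ((i = 1; i <= MAX_ATTEMPTS; ++i)); do
  if ! start_master; then
    echo "master startup failed (attempt $i)"
    continue
  fi
  if ! start_regionservers; then
    echo "regionserver startup failed (attempt $i)"
    continue
  fi
  if ! check_nodes "$i"; then
    echo "nodes did not come online (attempt $i)"
    continue
  fi
  # Every step passed, so stop retrying.
  started=1
  break
done

if [ "$started" -eq 1 ]; then
  echo "startup succeeded on attempt $i"
else
  echo "startup failed after $MAX_ATTEMPTS attempts"
fi
```

With the simulated values above, the first attempt fails at the node check and the second attempt succeeds. Before this change the node check ran once after the loop, so a failure there could not trigger another attempt.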

