impala-dev mailing list archives

From "Internal Jenkins (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-1996: Start HBase per directions in documentation; Implement HBase startup retry
Date Tue, 19 Apr 2016 01:46:46 GMT
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-1996: Start HBase per directions in documentation; Implement HBase
startup retry

IMPALA-1996: Start HBase per directions in documentation; Implement HBase startup retry

I. Start HBase per directions

1. The HBase documentation mentions a
'regionservers' file that points to a list of hosts on which to start
HBase RegionServers. When HBase starts in our mini-cluster there are
messages printed like this:

cat: /home/mikeb/Impala/fe/src/test/resources/regionservers: No such file or directory

With this file present, HBase startup now brings up a single
RegionServer, which takes the place of RegionServer 1 in the separate
"additional region servers" startup call.

2. The additional RegionServers are still started, but now we start only
2 of them, beginning at index 2. See

There are still 3 total RegionServers using the same ports as before. We
are simply configuring our settings as directed in the documentation.
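The arithmetic behind "still 3 total RegionServers" can be sketched as follows. This is only an illustration: CONF_DIR, ADDITIONAL_RS_OFFSETS and the single "localhost" entry are assumptions, not taken from the patch, and no HBase process is started.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the HBase conf directory (assumption).
CONF_DIR="$(mktemp -d)"

# The regionservers file lists one host per line. A single entry
# (assumed here to be localhost) yields a single RegionServer.
echo "localhost" > "${CONF_DIR}/regionservers"

# The first RegionServer now comes from the regionservers file, so the
# additional ones begin at index 2 instead of index 1.
ADDITIONAL_RS_OFFSETS=(2 3)

# Count non-empty lines in the file, then add the additional servers.
FILE_RS_COUNT=$(grep -c . "${CONF_DIR}/regionservers")
TOTAL=$((FILE_RS_COUNT + ${#ADDITIONAL_RS_OFFSETS[@]}))

echo "total RegionServers: ${TOTAL}"
```

One server from the file plus two additional servers keeps the total at 3, consistent with the statement above.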

There were mentions in testdata/bin/ of a "hbase race". One
possible such bug is
which has been fixed for a while. I've removed the check to wait for
that Master, though I have not removed the Python script that does the
waiting. We could remove that later after we let this patch bake.

Also, has been marked
"not a problem", so I've removed references to that.

II. Implement HBase start retry

If starting either the HBase Master or the additional RegionServers
fails, kill all of HBase and try again, up to a fixed number of attempts.

To keep errexit ("set -e") from aborting the script early, we expect
that some of the startup attempts may fail and handle those failures
with explicit control flow. On the final attempt, we let errexit fail
the script on our behalf.
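A minimal sketch of this retry pattern under errexit, not the patch's actual code. MAX_ATTEMPTS, FAIL_FIRST_N, start_hbase and kill_hbase are hypothetical names. The two functions are fake stand-ins so that the sketch runs without HBase, whereas the real script repeats its commands inline.

```shell
#!/usr/bin/env bash
set -euo pipefail

MAX_ATTEMPTS=3                    # hypothetical retry limit
FAIL_FIRST_N=${FAIL_FIRST_N:-2}   # fake: how many attempts should fail
ATTEMPTS_MADE=0

# Stand-in for the real start commands: fails FAIL_FIRST_N times, then
# succeeds.
start_hbase() {
  ATTEMPTS_MADE=$((ATTEMPTS_MADE + 1))
  [ "${ATTEMPTS_MADE}" -gt "${FAIL_FIRST_N}" ]
}

# Stand-in for killing every HBase process before the next attempt.
kill_hbase() {
  echo "attempt ${ATTEMPTS_MADE} failed; killing all of HBase"
}

for attempt in $(seq 1 "${MAX_ATTEMPTS}"); do
  if [ "${attempt}" -lt "${MAX_ATTEMPTS}" ]; then
    # A failure inside an 'if' condition does not trigger errexit, so
    # we can clean up and go around the loop again.
    if start_hbase; then
      break
    fi
    kill_hbase
  else
    # Final attempt: run the command bare so that errexit aborts the
    # script if it fails.
    start_hbase
  fi
done

echo "HBase started after ${ATTEMPTS_MADE} attempt(s)"
```

With the default FAIL_FIRST_N=2, the first two attempts fail and are cleaned up, and the third succeeds. Setting FAIL_FIRST_N=3 makes the final bare call fail, so the script exits non-zero without reaching the last echo.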

There is some code duplication here, but because Bash gives us only a
line number on failure, not a stack trace, I chose not to use functions
to handle reuse. We don't really have functions anywhere else at the
moment, either.


It's pretty difficult to trigger a real "HBase fails to start"
situation. I tested my changes by faking HBase failures, both when
starting the Master and first RegionServer and when starting the
subsequent RegionServers.

Multiple private builds have passed.

Change-Id: Ib1d055a8a9098ce24e2f31b969501b6e090eab19
Reviewed-by: Michael Brown <>
Tested-by: Internal Jenkins
A fe/src/test/resources/regionservers
M testdata/bin/
2 files changed, 36 insertions(+), 12 deletions(-)

  Michael Brown: Looks good to me, approved
  Internal Jenkins: Verified

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib1d055a8a9098ce24e2f31b969501b6e090eab19
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Brown <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: Michael Brown <>
