impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Taras Bobrovytsky (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster
Date Mon, 31 Oct 2016 23:50:52 GMT
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster

Patch Set 7:

File bin/

PS7, Line 17: This script is a setup script
How about simply "this is a setup script"

PS7, Line 21: The script expects a setup Impala development environment.
How about:
This script should be executed from a machine that has the Impala development environment
set up.

PS7, Line 22: appropriately
It's not clear what this means. Maybe expand on this a little?

PS7, Line 47: import logging
import statements should be above from x import y

PS7, Line 111: ,

PS7, Line 116: that,

PS7, Line 127: not
having a not here makes it harder to reason about, how about:
options.gateway if options.gateway else cm_host

PS7, Line 130: password=options.cm_pass)
Is this indent correct according to the style guide? I thought it was supposed to be double
the regular indent (so 8 spaces)?

PS7, Line 204: Only pick the first cluster configured
This comment does not really add anything, it's obvious what the code is doing. Maybe explain
WHY we are getting only the first?

PS7, Line 209: for service in cluster.get_all_services():
Just an observation: it looks like we don't check that all services we need are present. For
example if IMPALA is not in the list, this will not get caught in this method. If Impala is
not present, there will be an exception on line 275.

Maybe instead of iterating way, we can first construct a dictionary with service.type as key
and service as value. Then pull out one service after another to populate result. If a service
is missing it will be caught earlier this way.

What do you think?

Line 281:         # See
It looks like the issue is resolved, is this still necessary?

PS7, Line 284: self.options.snapshot_file is not None:
How about removing the not and switching the if and else clause?

PS7, Line 338: "database_name"
Why would it ever be called "database_name"? Where is this constant coming from?

PS7, Line 432: wmay

PS7, Line 451: # "--skip_local_tests",
why is this commented out?
File testdata/bin/

Line 163:   set -e
Don't we already set -eou at the top of the script? Either remove or add a comment why we
need to set -e again.

Line 492:     run-step "Loading nested data" load-nested.log \
The command is very similar in both cases. The only difference is --cm-host. How about factoring
that out?
File testdata/bin/

Line 411:     if os.getenv("REMOTE_LOAD"):
base_load_dir = os.getenv("REMOTE_LOAD") or os.getenv("IMPALA_HOME")
File testdata/bin/

Line 301:   parser.add_argument("-s", "--source-db", default="tpch_parquet")
don't we also need cm-host parameter here?
File testdata/bin/

Line 20: set -e
we are already setting -euo pipefail below

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <>
Gerrit-Reviewer: David Knupp <>
Gerrit-Reviewer: Harrison Sheinblatt <>
Gerrit-Reviewer: Martin Grund <>
Gerrit-Reviewer: Michael Brown <>
Gerrit-Reviewer: Taras Bobrovytsky <>
Gerrit-HasComments: Yes

View raw message