impala-reviews mailing list archives

From "Dimitris Tsirogiannis (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu
Date Wed, 14 Sep 2016 20:47:13 GMT
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-3739: Enable stress tests on Kudu

Patch Set 1:

Commit Message:

PS1, Line 13: D
> ds
File testdata/bin/

PS1, Line 50: with
> IIRC this syntax breaks on py 2.4, which we shouldn't be using for these te
Hm, I've seen other scripts (e.g.  already using the same syntax. Maybe Michael
has a recommendation here.
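
For reference, a minimal sketch of the two forms being discussed. The `with` statement needs Python 2.5 (via `from __future__ import with_statement`) or 2.6 and later; the try/finally form also runs on older interpreters. The helper names and the temporary file are hypothetical and are not taken from the patch.

```python
import os
import tempfile


def read_with_statement(path):
    # The context manager closes the file even if read() raises.
    with open(path) as f:
        return f.read()


def read_try_finally(path):
    # Equivalent cleanup without the `with` statement.
    f = open(path)
    try:
        return f.read()
    finally:
        f.close()


# Small demo on a throwaway file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"select 1")
os.close(fd)
same = read_with_statement(demo_path) == read_try_finally(demo_path)
os.remove(demo_path)
```

If py 2.4 really has to be supported, the second form is the safer choice; otherwise the `with` form is shorter.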

PS1, Line 96: 'tpch', 'tpcds', 'TPCDS', 'TPCH'
> are both cases necessary?
I just added it for usability in case someone decides to specify the workload in upper case.
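
One possible alternative, sketched under assumptions: let argparse normalize the value before it checks the choices, so only the lower-case names need to be listed. The flag name `--workload` and the default are hypothetical; the script in the patch may name them differently.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # argparse applies `type` before validating against `choices`,
    # so "TPCDS" and "tpcds" are both accepted and stored as "tpcds".
    parser.add_argument("-w", "--workload", type=str.lower,
                        choices=["tpch", "tpcds"], default="tpch",
                        help="Workload to load (case-insensitive).")
    return parser


args = build_parser().parse_args(["--workload", "TPCDS"])
workload = args.workload
```

This keeps the usability benefit while the rest of the script only has to handle one spelling.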

PS1, Line 100:   parser.add_argument("-b", "--buckets", default="9",
             :       help="Number of buckets to partition Kudu tables (only for hash-based).")
> Seems fine for now, but maybe we could have #buckets as a multiple of the #
Left a TODO for now, so we can revisit later depending on how we want to test this.
File testdata/datasets/tpcds/tpcds_kudu_template.sql:

Line 1: ---- Template SQL statements to create and load TPCDS tables in
> can you explain a bit about how you picked the PKs? While we probably need 
Good points. In general, I followed the spec in setting the PK columns. Added a TODO to have
two different variables for buckets, one for fact tables and one for dimension tables.

PS1, Line 2: KUDU.
> prev line
File testdata/datasets/tpch/tpch_kudu_template.sql:

Line 1: ---- Template SQL statements to create and load TPCH tables in
> remove the tpch tables in tpch_schema_template.sql?
Added a TODO to do this in a follow up patch.

PS1, Line 2: KUDU
> prev line
File tests/stress/

PS1, Line 900: engine=''
> I wasn't sure what engine meant until I looked at the usage. I'm wondering 
Yeah, I over-generalized this one. Changed it to something more explicit. Done

PS1, Line 1382:   if not args.tpcds_db and not args.tpch_db and not args.random_db \
              :       and not args.tpch_nested_db and not args.tpch_kudu_db \
              :       and not args.tpcds_kudu_db and not args.query_file_path:
              :     raise Exception("At least one of --tpcds-db, --tpch-db, --tpch-kudu-db,"
              :         "--tpcds-kudu-db, --tpch-nested-db, --random-db, --query-file-path is required")
> Hmm cumbersome... Maybe someone with more python experience knows a better 
Hm, maybe Michael has a suggestion here.
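
One option, as a rough sketch only: keep the option names in a list and test them with `any()`, so adding another database flag is a one-line change. The attribute names below are taken from the quoted condition; the constant `QUERY_SOURCE_ARGS` and the helper `check_query_source` are hypothetical names, not part of the patch.

```python
import argparse

# Attribute names as they appear on the parsed args in the quoted check.
QUERY_SOURCE_ARGS = ["tpcds_db", "tpch_db", "tpch_kudu_db", "tpcds_kudu_db",
                     "tpch_nested_db", "random_db", "query_file_path"]


def check_query_source(args):
    # Pass if at least one of the listed options has a truthy value.
    if not any(getattr(args, name, None) for name in QUERY_SOURCE_ARGS):
        flags = ", ".join("--" + name.replace("_", "-")
                          for name in QUERY_SOURCE_ARGS)
        raise Exception("At least one of %s is required" % flags)


# Passes silently because one source is set.
check_query_source(argparse.Namespace(tpch_kudu_db="tpch_kudu"))
```

The error message is then generated from the same list, so the flags in the message cannot drift from the flags in the check.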

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-HasComments: Yes