impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] Preview: IMPALA-4467: Add support for CRUD operations in stress test
Date Tue, 15 Nov 2016 19:47:57 GMT
Alex Behm has posted comments on this change.

Change subject: Preview: IMPALA-4467: Add support for CRUD operations in stress test
......................................................................


Patch Set 1:

(16 comments)

Looks like it's going in the right direction.

Most comments are questions.

Might be good to outline the flow of a new Kudu stress run so it's easier to follow for reviewers.
Maybe in a top-level comment inside concurrent_select.py

http://gerrit.cloudera.org:8080/#/c/5093/1/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

Line 680:     # Query to run before worst_case_sql to get the worst case performance
What's worst_case_sql? Can you give a little more flavor to the comment? I'm not really sure
why this setup query is needed. Also consider that the setup query itself may fail/crash which
might complicate things.


Line 688:     self.options = {}
needs a comment


Line 691:     self.modifies_table = False
let's introduce a stmt_type enum that can have values like SELECT/INSERT/UPSERT/DELETE/COMPUTE_STATS
etc.


Line 694:     self.MIN_INT = 10
var names are not very telling, what are these numbers?


Line 832:       raise
seems like we intentionally did not raise before?


Line 942:           LOG.debug("Result set empty for query with id %s",
Another possible use of stmt_type


Line 1239:     try:
we can use the stmt_type here to only run explain for those stmts that support it (i.e. anything
except compute stats).


Line 1356: def clean_up_database(cursor):
reset_database?

"clean up" sounds like DROP DATABASE CASCADE


Line 1357:   LOG.info("Cleaning up {0} database".format(cursor.db_name))
needs comment


Line 1360:     if not table.name.endswith("_original"):
seems simpler to move the modified tables into a new database, then you don't need to check
"_original" everywhere


Line 1371:       if len(table.primary_keys) < 1:
add comment: only run for Kudu tables which must have a primary key


Line 1373:       # For now we will not handle the case in a special way where several columns
are a
Why? Seems easy enough to do.


Line 1376:       if primary_key.exact_type not in (TinyInt, SmallInt, BigInt):
Why this restriction?


Line 1383:       if table.name + "_original" in set(table.name for table in tables):
What's the motivation for this? We might be able to achieve the same thing in a different
way. Having a query represent two queries seems somewhat abstraction breaking.

Also, for UPSERT it's ok of the keys already exist. I think wen to explicitly also test the
path where PKs already exist.


Line 1433:       drop_query.sql = "DROP STATS {0}".format(table.name)
why drop? we're not checking the results anyway


Line 1438:       compute_query.name = "compute_stats_{0}".format(table.name)
can you add the mt_dop option as part of the name?


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message