impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Brown (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Date Tue, 19 Sep 2017 23:07:15 GMT
Michael Brown has posted comments on this change.

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8102/1//COMMIT_MSG
Commit Message:

PS1, Line 22: CDH-59396
This looks like internal Cloudera information. Please excise all internal Cloudera information
from the commit message and diff, as we have non-Cloudera community members not privy to that
information.


PS1, Line 25: --- tpcds-q39.test / MULTIPLE RESULT SET not recognized by test framework /
MARKED XFAIL.
            : --- tpcds-q47.test / RESULT MISMATCH in LSD of DECIMAL values / ADDED ROUND(2)
TO 8th COLUMN OF WITH TABLE, TAKE ACTUAL RESULT AS EXPECTED.
            : --- tpcds-q49.test / RESULT MISMATCH in LSD of DECIMAL values / MARKED XFAIL,
IMPALA-5945
            : --- tpcds-q57.test / RESULT MISMATCH, excess scale in DECIMAL values / FIXED,
ADDED TRUNCATE(2) AROUND 6th COLUMN.
These are good comments, but they are not line-wrapped. They could also stand to have some
vertical white space. Could you please make this section cosmetically cleaner?


PS1, Line 32: --- tpcds-q63.test / RESULT MISMATCH, excess scale in DECIMAL values / ADDED
CAST(DECIMAL(7, 2)) TO 3rd COLUMN
Comments like this are useful. It would be good for you to note this in tpcds-q63.test directly
in a comment. Please do the same for all annotations.


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q12.test
File testdata/workloads/tpcds/queries/tpcds-q12.test:

PS1, Line 4:       ,i_item_desc 
           :       ,i_category 
           :       ,i_class 
Gerrit highlights trailing white space in files. Click on this file in the WebUI to see what
I mean.

It's best not to have trailing white space since it makes diffing harder to read over time.
Can you please remove the trailing white space from all files? I'm sure there are Vim, sed,
or perl tricks that you know that could do it quickly.


PS1, Line 12: 	web_sales
            :     	,item 
            :     	,date_dim
I believe these characters (visible in Gerrit WebUI when you look at the file in its diff
viewer) indicate the presence of tab characters in the file. I expect it's better to replace
these with spaces; I'm not aware that we use tabs anywhere in the code base. For all files,
can you replace with spaces and make sure the alignment is clean?


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q23.test
File testdata/workloads/tpcds/queries/tpcds-q23.test:

PS1, Line 51:  limit 100;
            :  
            : with frequent_ss_items as
The stress test gets its bank of queries to run from files such as this, but it's not clear
to me how the stress test handles multiple queries and results gathering. This matters both
when collecting runtime info and when running the stress test proper. Can you please investigate
how the stress test handles having two queries here? You can talk to Matt Mulder or me about
this.


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q24.test
File testdata/workloads/tpcds/queries/tpcds-q24.test:

PS1, Line 48: having sum(netpaid) > (select 0.05*avg(netpaid)
            :                                  from ssales)
            : ;
            : 
            : with ssales as
Highlighting another place where there are 2 queries and the stress test's behavior isn't
clear.


PS1, Line 104: --
Is this a results delimiter? If yes, why doesn't q23 have a results delimiter?


http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q39.test
File testdata/workloads/tpcds/queries/tpcds-q39.test:

PS1, Line 27: ;
            : with inv as
Another multi-query file, like previously.


http://gerrit.cloudera.org:8080/#/c/8102/1/tests/query_test/test_tpcds_queries.py
File tests/query_test/test_tpcds_queries.py:

PS1, Line 87:   @pytest.mark.xfail(reason="IMPALA-5226")
            :   def test_tpcds_q10(self, vector):
            :     self.run_test_case(self.get_workload() + '-q10', vector)
While this works for functional end-to-end tests, we need a way to tell the stress test to
skip this (and other) queries so as not to include it when collecting runtime info or as part
of a stress run. You can talk to Matt Mulder or me about this.


-- 
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Wood <twood@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message