impala-reviews mailing list archives

From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] Loads all TPC-DS tables
Date Tue, 23 May 2017 19:22:13 GMT
Michael Ho has posted comments on this change.

Change subject: Loads all TPC-DS tables
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6877/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java:

Line 1728:     assertEquals(24, resp.rows.size());
> Use symbol instead of magic constant.
Done


http://gerrit.cloudera.org:8080/#/c/6877/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test:

> This file is very long.  Any useful way it can be split up?
I agree. Maybe it makes sense to split it up by query range (0-10 in one file, 11-20 in one
file, etc.) when we add more queries to this file. I will refrain from doing so in this change
to keep it easy to review.


Line 46: |     predicates: dt.d_moy = 12, (dt.d_date_sk >= 2451149 AND dt.d_date_sk <=
2451179 OR dt.d_date_sk >= 2451514 AND dt.d_date_sk <= 2451544 OR dt.d_date_sk >=
2451880 AND dt.d_date_sk <= 2451910 OR dt.d_date_sk >= 2452245 AND dt.d_date_sk <=
2452275 OR dt.d_date_sk >= 2452610 AND dt.d_date_sk <= 2452640)
> These date constants seem brittle (regardless of representation.)
FWIW, you may notice that I didn't change any query in this test file. The changes in the
outputs are mostly due to the change in the size of the store_sales tables, which leads the
planner to return different plans.

On the other hand, the official TPC-DS Q3 doesn't really have these partition filters. They
were probably added before we had dynamic partition pruning, so I have removed them now.
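
For readers wondering what the date constants in the quoted predicate encode, here is a
minimal sketch, assuming that d_date_sk is a Julian Day Number (as it appears to be in the
standard TPC-DS date_dim data). The helper name date_sk_to_date and the RANGES list are
hypothetical and exist only for this illustration; they are not part of the Impala test
infrastructure.

```python
from datetime import date

# Assumption: d_date_sk is a Julian Day Number. Python's proleptic Gregorian
# ordinal (date(1, 1, 1).toordinal() == 1) differs from it by a fixed offset.
JDN_TO_ORDINAL_OFFSET = 1721425


def date_sk_to_date(date_sk: int) -> date:
    """Convert a d_date_sk value to a calendar date (hypothetical helper)."""
    return date.fromordinal(date_sk - JDN_TO_ORDINAL_OFFSET)


# The five d_date_sk ranges copied from the quoted predicate.
RANGES = [
    (2451149, 2451179),
    (2451514, 2451544),
    (2451880, 2451910),
    (2452245, 2452275),
    (2452610, 2452640),
]

for lo, hi in RANGES:
    # Print each range next to the calendar dates it maps to.
    print(lo, hi, date_sk_to_date(lo), date_sk_to_date(hi))
```

Under that assumption, each range covers one December (1998 through 2002), which is
consistent with the d_moy = 12 predicate that accompanies them.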


http://gerrit.cloudera.org:8080/#/c/6877/3/testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
File testdata/workloads/functional-query/queries/QueryTest/seq-writer.test:

Line 101: where ss_sold_date_sk between 2451170 and 2451200;
> Why/when do these values change?
To keep the test run time reasonable. This change loads far more partitions than before for
the store_sales tables, so using the previous range would result in a rather long run time.


-- 
To view, visit http://gerrit.cloudera.org:8080/6877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5277245fd20827c9c09ce5c1a7a37266ca476b9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Greg Rahn <grahn@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Tim Wood <twood@cloudera.com>
Gerrit-HasComments: Yes
