impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5612: join inversion should factor in parallelism
Date Tue, 11 Jul 2017 20:57:47 GMT
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7351

to look at the new patch set (#4).

Change subject: IMPALA-5612: join inversion should factor in parallelism
......................................................................

IMPALA-5612: join inversion should factor in parallelism

The join inversion optimisation did not factor in the degree of
parallelism that the join executed with after inversion. In some cases
this led to bad decisions, e.g. executing a join on a single node
instead of 20 nodes.

This patch adds a more sophisticated cost model that factors degree
of parallelism into the join inversion decision.

The behaviour is unchanged if inversion does not change the degree of
parallelism.
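
To illustrate the idea only: the sketch below is not the code in
Planner.java. The class name, method names, and cost weights are
hypothetical, and it assumes that a join runs with the parallelism of
its left (probe) input. It compares an estimated per-node cost before
and after inversion.

```java
// Hypothetical sketch of a parallelism-aware join inversion check.
// Not Impala's implementation: all names and weights here are assumptions.
final class JoinInversionSketch {
    // Assumed relative per-row costs: inserting a row into the hash table
    // (build side) is treated as costlier than probing with a row.
    static final double BUILD_WEIGHT = 1.0;
    static final double PROBE_WEIGHT = 0.5;

    // Estimated work per node when the join runs on numNodes nodes.
    static double perNodeCost(long probeRows, long buildRows, int numNodes) {
        if (numNodes < 1) {
            throw new IllegalArgumentException("numNodes must be >= 1");
        }
        return (PROBE_WEIGHT * probeRows + BUILD_WEIGHT * buildRows) / numNodes;
    }

    // Returns true if swapping the inputs lowers the estimated per-node cost.
    // Assumption: the join executes with the node count of its left input,
    // so inverting changes the parallelism from lhsNodes to rhsNodes.
    static boolean shouldInvert(long lhsRows, int lhsNodes,
                                long rhsRows, int rhsNodes) {
        double current = perNodeCost(lhsRows, rhsRows, lhsNodes);
        double inverted = perNodeCost(rhsRows, lhsRows, rhsNodes);
        return inverted < current;
    }

    public static void main(String[] args) {
        // Left input spread over 20 nodes, larger right input on 1 node.
        System.out.println(shouldInvert(1_000_000L, 20, 2_000_000L, 1));
        // Same node count on both sides: only the row counts matter.
        System.out.println(shouldInvert(1_000L, 5, 2_000L, 5));
    }
}
```

With these assumed weights, equal node counts on both sides reduce the
check to a plain row-count comparison (invert only when the left input
is smaller). In the first call in main, a row-count-only rule would
invert and move the join onto a single node, whereas the per-node cost
keeps it on 20 nodes.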

Perf:
Ran cluster TPC-H and TPC-DS benchmarks. Average changes were small
(< 3%), with a mix of improvements and regressions. We were satisfied
that the regressions were cases where the planner "got lucky" previously.
E.g. on TPC-H Q2 a join was flipped to put lineitem on the left as a
result of inaccurate cardinality estimates.

Mostafa also ran a TPC-DS benchmark where the dimension tables were
loaded with num_nodes=1 to minimise the number of files. We saw some
huge speedups there on the unmodified queries, e.g. TPC-DS Q10 went from
291s to 32.25s. The worst percentage regression was Q50, which went
from 1.61s to 2.4s (about 49%), and the worst absolute regression was
Q72, which went from 694s to 874s (about 26%).

Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e
---
M fe/src/main/java/org/apache/impala/planner/Planner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
8 files changed, 954 insertions(+), 854 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/7351/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
