From "Impala Public Jenkins (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5612: join inversion should factor in parallelism
Date Tue, 22 Aug 2017 19:42:06 GMT
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5612: join inversion should factor in parallelism

IMPALA-5612: join inversion should factor in parallelism

The join inversion optimisation did not factor in the degree of
parallelism with which the join would execute after inversion. In some
cases this led to bad decisions, e.g. executing a join on a single node
instead of 20 nodes.

This patch adds a more sophisticated cost model that factors degree
of parallelism into the join inversion decision.
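
As an illustration only, the idea can be sketched as below. This is a
hypothetical toy model, not the cost model added by this patch: the class
name, method names, the BUILD_ROW_WEIGHT constant and the cost formula are
all assumptions made for the example. It estimates a per-node cost for each
join orientation and inverts only when the inverted orientation is cheaper,
so losing parallelism can outweigh a smaller build side.

```java
/** Hypothetical sketch; NOT the cost model implemented in the Impala planner. */
final class JoinInversionSketch {
  // Assumption: handling a build-side row costs more than a probe-side row.
  static final double BUILD_ROW_WEIGHT = 2.0;

  /** Estimated work per node for a hash join spread evenly over numNodes. */
  static double perNodeCost(long probeRows, long buildRows, int numNodes) {
    if (numNodes < 1) throw new IllegalArgumentException("numNodes must be >= 1");
    return (probeRows + BUILD_ROW_WEIGHT * buildRows) / numNodes;
  }

  /**
   * Returns true if swapping the join inputs lowers the estimated per-node cost.
   * lhsRows is the current probe side, rhsRows the current build side.
   */
  static boolean shouldInvert(long lhsRows, long rhsRows,
      int nodesNow, int nodesIfInverted) {
    double costNow = perNodeCost(lhsRows, rhsRows, nodesNow);
    double costInverted = perNodeCost(rhsRows, lhsRows, nodesIfInverted);
    return costInverted < costNow;
  }

  public static void main(String[] args) {
    // Same parallelism either way: the smaller input becomes the build side.
    System.out.println(shouldInvert(1_000, 5_000, 20, 20)); // true
    // Inversion would drop from 20 nodes to 1: keep the current orientation.
    System.out.println(shouldInvert(1_000, 5_000, 20, 1));  // false
  }
}
```

In this toy model, when both orientations run on the same number of nodes
the decision reduces to comparing the two input cardinalities, so only
plans whose degree of parallelism changes can be affected.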

The behaviour is unchanged if inversion does not change the degree of
parallelism.

Ran cluster TPC-H and TPC-DS benchmarks. Average changes were small:
< 3%. Saw a mix of improvements and regressions. We were satisfied
that the regressions were cases where the planner "got lucky" previously.
E.g. on TPC-H Q2 a join was flipped to put lineitem on the left as a
result of inaccurate cardinality estimates.

Mostafa also ran a TPC-DS benchmark where the dimension tables were
loaded with num_nodes=1 to minimise the number of files. We saw some
huge speedups there on the unmodified queries, e.g. TPCDS-Q10 went from
291s to 32.25s. The worst percentage regression was Q50, which went
from 1.61s to 2.4s, and the worst absolute regression was Q72, which
went from 694s to 874s (25%).

Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e
Reviewed-by: Tim Armstrong <>
Tested-by: Impala Public Jenkins
M fe/src/main/java/org/apache/impala/planner/
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
8 files changed, 938 insertions(+), 832 deletions(-)

  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

