impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-2932: Extend DistributedPlanner to account for hash table build cost
Date Tue, 23 Aug 2016 18:53:27 GMT
Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4098

Change subject: IMPALA-2932: Extend DistributedPlanner to account for hash table build cost
......................................................................

IMPALA-2932: Extend DistributedPlanner to account for hash table build cost

When deciding between a broadcast or repartition join, Impala calculates
the cost of each join as the total amount of data that is sent over the
network. This ignores some relevant costs, and can lead to bad plans.

One such relevant cost is the work to create the hash table used in the
join. This patch accounts for this by adding the amount of data inserted
into the hash table (the size of the right side of the join) to the
previous cost.

This generally increases the estimated cost of broadcast joins relative
to repartitioning joins, as the broadcast join must build the hash table
on each node the data was broadcast to, so its effect will be to make
repartitioning joins more likely to be chosen, especially in large
clusters.

This patch has not yet been performance tested.

Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
---
M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
8 files changed, 148 insertions(+), 125 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4098/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>

Mime
View raw message