impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.
Date Wed, 15 Feb 2017 06:10:14 GMT
Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.

IMPALA-4916: Fix maintenance of set of item sets in DisjointSet.

The bug: The DisjointSet maintains a set of unique item sets
using a HashSet<Set<T>>. The problem is that we modified
the Set<T> elements after inserting them into the HashSet.
This caused the removal of elements from the HashSet to fail.
Removal is required for maintaining a consistent DisjointSet.
The removal could even fail for the same Set<T> instance because
the hashCode() changed from when the Set<T> was originally
inserted to when the removal was attempted due to mutation
of the Set<T>.
An inconsistent DisjointSet can lead to incorrect equivalence
classes, which can lead to missing, redundant and even
non-executable predicates. Incorrect results and crashes are

For most queries, an inconsistent DisjointSet does not alter
the equivalence classes, and even fewer queries have incorrect
In fact, many of our existing planner tests trigger this bug,
but only 3 of them lead to an incorrect value transfer graph,
and all 3 had correct plans.

The fix: Use an IdentityHashMap to store the set of item sets.
It does not rely on the hashCode() and equals() of the stored
elements, so the same object can be added and later removed,
even when mutated in the meantime.

- Added a Preconditions check in DisjointSet that asserts
  correct removal of an item set. Many of our existing tests
  hit the check before this fix.
- Added a new unit test for DisjointSet which triggers
  the bug.
- Augmented DisjointSet.checkConsistency() to check for
  inconsistency in the set of item sets.
- Added validation of the value-transfer graph in
  single-node planner tests.
- A private core/hdfs run succeeded.

Change-Id: I609c8795c09becd78815605ea8e82e2f99e82212
M fe/src/main/java/org/apache/impala/analysis/
M fe/src/main/java/org/apache/impala/common/
M fe/src/main/java/org/apache/impala/util/
M fe/src/test/java/org/apache/impala/planner/
M fe/src/test/java/org/apache/impala/util/
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
6 files changed, 65 insertions(+), 19 deletions(-)

  git pull ssh:// refs/changes/80/5980/2
To view, visit
To unsubscribe, visit

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I609c8795c09becd78815605ea8e82e2f99e82212
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Marcel Kornacker <>

View raw message