impala-reviews mailing list archives

From "Joe McDonnell (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries
Date Tue, 02 May 2017 23:35:44 GMT
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries

Patch Set 4:


A couple quick observations
File be/src/exec/

PS4, Line 1454: );
The front end orders conjuncts by selectivity and cost. When we pull them out and attach them
to column materialization, that order is not preserved. If the conjunct is evaluated using
the dictionary, this should be fine. If the conjunct is not evaluated from the dictionary,
then losing the order might result in a more expensive evaluation.

To put numbers on it:
Suppose there are two conjuncts, A and B. A is expensive (cost = 10) and very selective (eliminates
99% of rows). B is cheap (cost = 1) and moderately selective (eliminates 50% of rows). The front end
might put B first: B runs on every row and eliminates 50% of the rows, so A is called only 50% of
the time to eliminate the rest. This has an expected per-row cost of 1 + 0.50 * 10 = 6, which is
cheaper than calling A 100% of the time (A first costs 10 + 0.01 * 1 = 10.01).
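
The arithmetic above can be sketched as a small helper. This is only an illustration: the
ConjunctEstimate struct and ExpectedCostPerRow function are hypothetical names, not Impala
code, and the model assumes the conjuncts' selectivities are independent and that evaluation
short-circuits on the first conjunct that rejects a row.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one conjunct (not an Impala type).
struct ConjunctEstimate {
  double cost;        // relative cost of evaluating the conjunct on one row
  double eliminated;  // fraction of its input rows the conjunct rejects, in [0, 1]
};

// Expected per-row cost of evaluating the conjuncts in the given order with
// short-circuiting. Assumes independent selectivities (an assumption of this
// sketch, not something the planner guarantees).
double ExpectedCostPerRow(const std::vector<ConjunctEstimate>& ordered) {
  double surviving = 1.0;  // fraction of rows that reach the next conjunct
  double total = 0.0;
  for (const ConjunctEstimate& c : ordered) {
    assert(c.eliminated >= 0.0 && c.eliminated <= 1.0);
    total += surviving * c.cost;
    surviving *= (1.0 - c.eliminated);
  }
  return total;
}
```

With the numbers from the example, passing B then A ({1, 0.50}, {10, 0.99}) gives 6, and
passing A then B gives about 10.01.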

We can reorder the materialization of the columns at runtime using knowledge of which columns
are dictionary encoded and which aren't.
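
One possible shape for that reordering, as a rough sketch only: move the columns whose
conjuncts can be answered from the dictionary to the front, and keep the front end's
relative order within each group. The ColumnPlan struct, its dict_encoded flag, and
OrderColumnsForMaterialization are hypothetical names for illustration, and the
"dictionary-encoded first" policy is an assumption rather than something this change
implements.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical per-column record (not an Impala type).
struct ColumnPlan {
  std::string name;
  bool dict_encoded;  // assumed to be known at runtime for the current row group
};

// Moves dictionary-encoded columns ahead of the others. std::stable_partition
// preserves the incoming (front-end) order inside each of the two groups.
void OrderColumnsForMaterialization(std::vector<ColumnPlan>* columns) {
  std::stable_partition(columns->begin(), columns->end(),
                        [](const ColumnPlan& c) { return c.dict_encoded; });
}
```

For an input order of a (plain), b (dictionary), c (plain), d (dictionary), this yields
b, d, a, c.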

PS4, Line 1467: endif
It should be possible to do this up in HdfsScanNode. As an example, see extractKuduConjuncts
in KuduScanNode. This pulls out conjuncts that will be evaluated by Kudu.


Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <>
Gerrit-Reviewer: Joe McDonnell <>
Gerrit-HasComments: Yes
