impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Volker (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Date Wed, 13 Dec 2017 20:39:00 GMT
Lars Volker has posted comments on this change. ( )

Change subject: IMPALA-4993: extend dictionary filtering to collections

Patch Set 4:


This looks good to me, I only added minor clean-up comments and will do a final pass on the
next PS.
File be/src/exec/hdfs-scan-node-base.h:
PS4, Line 90:   /// sequence-based file
This looks like a rebase artifact. I often look at diffs between patchsets (and I think others
do, too) so if possible, it helps to do a final rebase at the end instead of rebaseing in
the middle. Sometimes rebases are inevitable, in which case it's fine of course.
File be/src/exec/
PS4, Line 90: std::
We usually drop the std:: prefix in cc files.
PS4, Line 96: std::
same here.
File fe/src/main/java/org/apache/impala/planner/
PS1, Line 187:   // tuple. Uses a linked hash map for consistent display in explain.
> min-max pruning is discussed in the header, but dictionary pruning is not. 
PS1, Line 624:         addNotEmptyCollections(collectionConjuncts);
> A bit unclear to me as well. However, if I understand this literally, we on
Thanks for adding the comments. This way round it makes more sense. It's still not obvious
to me why in the prior code and before the check for slotIds.size() != 1, tupleIds.size()
== 1 had to be true. There seems to be some guarantee that conjuncts in the scanners work
on a single tuple only, which surprises me.
PS1, Line 656:     // Check to see if the conjunct evaluates to true when the slot is NULL
> Done
nit: conjuncts still lacks the _ :)
File fe/src/main/java/org/apache/impala/planner/
PS4, Line 111: the
Duplicate word
PS4, Line 1099:         output.append(detailPrefix + "parquet statistics predicates: " +
This one compiles the string differently than the others, probably I did this myself. While
you're here and if you don't mind, you could convert it to String.format().
PS4, Line 1116:         List<Integer> totalIdxList = Lists.newArrayList();
This is more of a guess, but can't you pass entry.getValue() to newArrayList?
File testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test:
PS1, Line 265: # - number of projections and their nesting depth
> I think there's some benefit to explicitly listing the areas tested/to-test
I understand your point, let's keep it here.
PS1, Line 275: 1
> there are two files in the table. one of them is dictionary encoded for int
That makes sense. Should we add a quick comment for the next person to come by ("Only one
of the files in this table is dictionary encoded")?

To view, visit
To unsubscribe, visit

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
Gerrit-Change-Number: 8775
Gerrit-PatchSet: 4
Gerrit-Owner: Vuk Ercegovac <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Vuk Ercegovac <>
Gerrit-Comment-Date: Wed, 13 Dec 2017 20:39:00 +0000
Gerrit-HasComments: Yes

  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message