impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3548: Prune runtime filters based on query options in the FE
Date Tue, 17 Oct 2017 22:06:31 GMT
Alex Behm has posted comments on this change. ( )

Change subject: IMPALA-3548: Prune runtime filters based on query options in the FE

Patch Set 10:

File be/src/service/
PS10, Line 455:     jbyteArray thrift_in_query_options) {
Not sure what 'thrift_in_query_options' means. How about using 'tquery_options' instead? Ideally,
we would not need to pass in the existing query options.
PS10, Line 457:   DeserializeThriftMsg(env, thrift_in_query_options, &options);
Needs error handling, e.g. using THROW_IF_ERROR_RET
PS10, Line 458:   const char *o = env->GetStringUTFChars(options_str, NULL);
* You need to release the string UTF chars, take a look at JniUtfCharGuard in jni-util.h

* style: const char* o
PS10, Line 461:   TParseQueryOptionsResult result;
I think it's simpler to convert all Status to an exception. That way we don't need the TParseQueryOptionsResult
at all, and the error handling is consistent (always throws an exception in case of error).
File fe/src/main/java/org/apache/impala/planner/
PS10, Line 119:     // Create runtime filters. 'singleNodePlan' now points to the root of
the distributed
Comment is wrong. 'singleNodePlan' definitely does not point to the root of the distributed

Use rootFragment.getPlanRoot() instead
File fe/src/main/java/org/apache/impala/planner/
PS10, Line 354:     public void addTarget(RuntimeFilterTarget target) { targets_.add(target);
add newline before
PS10, Line 397:     Preconditions.checkState(maxNumFilters >= 0);
Why is 0 not a valid value?
PS10, Line 423:       filter.computeNdvEstimate();
My understanding is that IMPALA-3450 has been fixed and we can remove this code.
PS10, Line 537:    * The assigned filters are the ones for which 'scanNode' can be used a
can be used as a destination node
PS10, Line 539:    * 1. If DISABLE_ROW_RUNTIME_FILTERING query option is set, a filter is
only assigned to
If the DISABLE_ROW_RUNTIME_FILTERING query option is set ...
PS10, Line 541:    * 2. If RUNTIME_FILTER_MODE query option is set to LOCAL, a filter is only
assigned to
If the RUNTIME_FILTER_MODE query option is set ...
PS10, Line 542:    *    'scanNode' if the filter is local to the scan node.
This doesn't really explain what the LOCAL option means. How about:

... a filter is only assigned to 'scanNode' if the filter is produced within the same fragment
that contains the scan node.
PS10, Line 563:       if (!isSingleNodeExec && runtimeFilterMode == TRuntimeFilterMode.LOCAL
Why the !isSingleNodeExec condition? I think that !isLocalTarget implies !isSingleNodeExec.
PS10, Line 576:   static private boolean IsLocalTarget(RuntimeFilter filter, ScanNode targetNode)
PS10, Line 583:   static private boolean IsBoundByPartitionColumns(Analyzer analyzer, Expr
File fe/src/main/java/org/apache/impala/service/
PS10, Line 92:   public native static byte[] NativeParseQueryOptions(String optionsStr,
to be clearer call the first arg csvQueryOptions
PS10, Line 93:  
Can we get rid of the second argument and use PlannerTestBase.mergeQueryOptions() instead?
PS10, Line 282:   public static TParseQueryOptionsResult ParseQueryOptions(String optionsStr,
Public function needs a comment.

API feels a little weird. This function can throw but also return an error in the TParseQueryOptionsResult

I suggest converting all errors into exceptions and returning a TQueryOptions or alternative
just modifying the given TQueryOptions in-place via PlannerTestBase.mergeQueryOptions(). Might
need to factor that out into a utility function.
File fe/src/test/java/org/apache/impala/planner/
PS10, Line 300:   public void testRuntimeFilterPruning() {
File fe/src/test/java/org/apache/impala/testutil/
PS10, Line 288:         parseQueryOptions(currentTestCase);
Why do we need to call this here and in L338? Isn't the call in L338 enough?
File testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-pruning.test:
PS10, Line 2: select count(*) from functional.alltypes a
When using join hints it is recommended to also use straight_join. Also please use the preferred
comment-style hint syntax, i.e.:

select /* +straight_join */ count(*) from functional.alltypes a
  join /* +broadcast */ functional.alltypes b ...

Please apply everywhere in this file.

I know we are already inconsistent in our existing tests, but that should not stop us from
improving new tests :)
PS10, Line 85: select count(*) from functional.alltypes a
Can we make it more obvious which runtime filters are the most selective? The selection is
not intuitive to follow.
PS10, Line 89: ---- QUERYOPTIONS
This is really useful! Thanks for adding it!
PS10, Line 91: ---- PLAN
To reduce the number of test lines, let's only keep those sections that are really needed.
In this case I think the DISTRIBUTEDPLAN should be enough.
PS10, Line 167: # DISABLE_ROW_RUNTIME_FILTERING is set: only partition column filters are
What's the effect of setting MAX_NUM_RUNTIME_FILTERS=3 in this test? Does it select the 3
partition-pruning filters as expected?
PS10, Line 252: # RUNTIME_FILTER_MODE is set to LOCAL: only local filters are applied
Same question as above. What happens when we set MAX_NUM_RUNTIME_FILTERS=4. Does it behave
as expected?
PS10, Line 418: # RUNTIME_FILTER_MODE is OFF: no filters are applied
Is this the same as MAX_NUM_RUNTIME_FILTERS=0?

To view, visit
To unsubscribe, visit

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0f0b200e02442edcad8df3979f652d66c6e52eb
Gerrit-Change-Number: 7564
Gerrit-PatchSet: 10
Gerrit-Owner: Attila Jeges <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Attila Jeges <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Philip Zeyliger <>
Gerrit-Reviewer: Thomas Tauber-Marshall <>
Gerrit-Comment-Date: Tue, 17 Oct 2017 22:06:31 +0000
Gerrit-HasComments: Yes

  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message