impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Volker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Date Tue, 14 Nov 2017 00:10:08 GMT
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8480/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8480/1//COMMIT_MSG@12
PS1, Line 12: A nested value is defined to
            : be on a path of one or more nested types that is rooted at a
            : table column.
I don't understand what that sentence means. Can you try to clarify the distinction between
nested value and nested type?


http://gerrit.cloudera.org:8080/#/c/8480/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8480/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@435
PS1, Line 435:   // Checks if slot refers to an array "pos" pseudo-column.
Can you add a comment explaining why checking for getColumn() == null is not sufficient?


http://gerrit.cloudera.org:8080/#/c/8480/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@441
PS1, Line 441: isMapStruct
I think it would be clearer to add a isArrayStruct() method to CollectionStructType to emphasize
that that's what we're checking.


http://gerrit.cloudera.org:8080/#/c/8480/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@564
PS1, Line 564:  
nit: the surrounding code seems to omit this space.


http://gerrit.cloudera.org:8080/#/c/8480/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@567
PS1, Line 567:       for (Expr pred: entry.getValue()) {
             :         if (pred instanceof BinaryPredicate) {
             :           tryComputeBinaryMinMaxPredicate(analyzer, (BinaryPredicate) pred);
             :         } else if (pred instanceof InPredicate) {
             :           tryComputeInListMinMaxPredicate(analyzer, (InPredicate) pred);
             :         }
             :       }
This looks like a duplication of the above loop. Adding additional predicates in the future
may require changing both loops. Have you considered factoring it into it's own method?


http://gerrit.cloudera.org:8080/#/c/8480/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1109
PS1, Line 1109: slot.getColumn() == null
Is this another check for a pos slot?


http://gerrit.cloudera.org:8080/#/c/8480/1/testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/8480/1/testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test@141
PS1, Line 141: Basics test
I'm not sure I understand what Basics means. Can you elaborate? I think we often order tests
by ascending complexity so that the simpler ones fail before the complex ones.


http://gerrit.cloudera.org:8080/#/c/8480/1/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/8480/1/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@460
PS1, Line 460: ====
Does this remove the trailing newline?



-- 
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac <vercegovac@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Nov 2017 00:10:08 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message