impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries
Date Thu, 25 May 2017 17:56:52 GMT
Michael Ho has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
......................................................................


Patch Set 9:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

PS9, Line 567: skippable_map
skippable_map != nullptr


PS9, Line 586: (*skippable_map)[i] == true
(*skippable_map)[i]


PS9, Line 586: 0 
huh ?


PS9, Line 590: bool
skip_val


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS9, Line 153: Again
No need for "Again".


PS9, Line 159:   /// This function is capable of doing codegen for either of the above functions;
simply
             :   /// pass a null value instead of a bool array pointer.
If 'skippable_map' is not NULL, this will generate the equivalent of EvalConjunctsPreEval().
Otherwise, this generates EvalConjuncts(). Also describe the content of skippable_map.

Alternately, we can keep this function private and expose two interfaces CodegenEvalConjuncts()
and CodegenEvalConjunctsPreEval(). The former always passes nullptr for 'skippable_map' argument.


PS9, Line 163: skippable_map
Will 'skippable_conjuncts' be a more appropriate name ?


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/hdfs-parquet-scanner-ir.cc
File be/src/exec/hdfs-parquet-scanner-ir.cc:

PS9, Line 88: LOG(INFO) << "tuple idx: " << tuple_index;
debugging output ?


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS9, Line 1050: std::map<SlotId, std::vector<int>>&
DictionaryFilterPositionMap


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

PS9, Line 167: typedef std::map<SlotId, std::vector<int>> DictFilterConjunctsPositionMap;
Comment what this map stands for ?

Slot Id -> index of the dictionary filter conjuncts in conjunct_ctxs_ ?


PS9, Line 366: Mapping to original position in conjuncts
Mapping from SlotId to ...


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

PS9, Line 206: IS_FILTERED
Comment what IS_FILTERED stands for.


PS9, Line 240: (filters == nullptr) ^ IS_FILTERED
Is there a reason to not use DCHECK((filters == nullptr) != IS_FILTERED) ?


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

PS9, Line 390: return false;
Is this hard coded for testing ? If not, please document why it's always false.


http://gerrit.cloudera.org:8080/#/c/6726/9/be/src/util/bitmap.h
File be/src/util/bitmap.h:

PS9, Line 86: (mask & buffer_[word_index])
(mask & buffer[word_index]) == 0


PS9, Line 99: buffer_.size()
num_words_


-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <zamsden@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zamsden@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message