impala-reviews mailing list archives

From "Zach Amsden (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries
Date Thu, 08 Jun 2017 20:26:20 GMT
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
......................................................................


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 420:               LIKELY(dictionary_results_.num_bits() > 0)) {
> I think the predicate evaluation on 40,000 values is probably cheap enough 
We can certainly try it.  I was worried the pre-computation might be expensive if we have,
say, string manipulation in predicates, as opposed to inexpensive, simple comparisons.  Still,
even if we have the same number of predicate evaluations, they end up going through the unoptimized
EvalConjuncts() path, as opposed to the codegen'd path.

As for IS_FILTERED, that is set when the column reader is created.  IS_DICT_ENCODED is determined
per page.  We are left with no way to remove the dictionary_results_.num_bits() check on a per-row
basis, since we can't unset IS_FILTERED, and IS_DICT_ENCODED may be true even if the encoding did
not cover all values.

I'll try precomputing all the values and see what happens.
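
For illustration only, here is a minimal standalone sketch of the precomputation idea, not the code in the patch.  DictResults, Precompute(), Get() and CountPassingRows() are hypothetical names, and std::function stands in for the real predicate machinery; only the num_bits() > 0 guard mirrors the line under review.  The predicate runs once per dictionary entry, and each row then needs only a bit lookup by dictionary index.  If no results were precomputed, the row falls back to evaluating the predicate directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for dictionary_results_: one bit per dictionary entry.
class DictResults {
 public:
  // Evaluates the predicate once per dictionary entry and records the outcome.
  void Precompute(const std::vector<std::string>& dict,
                  const std::function<bool(const std::string&)>& pred) {
    bits_.assign(dict.size(), false);
    for (std::size_t i = 0; i < dict.size(); ++i) bits_[i] = pred(dict[i]);
  }

  // Zero bits means nothing was precomputed for the current dictionary.
  std::size_t num_bits() const { return bits_.size(); }

  bool Get(std::uint32_t idx) const {
    assert(idx < bits_.size());
    return bits_[idx];
  }

 private:
  std::vector<bool> bits_;
};

// Counts the rows of one dictionary-encoded page that pass the predicate.
// 'codes' holds the per-row dictionary indexes.
std::size_t CountPassingRows(
    const DictResults& results, const std::vector<std::string>& dict,
    const std::vector<std::uint32_t>& codes,
    const std::function<bool(const std::string&)>& pred) {
  std::size_t passed = 0;
  for (std::uint32_t code : codes) {
    // Per-row guard: use the precomputed bit when available, otherwise
    // evaluate the predicate on the decoded value.
    bool keep = results.num_bits() > 0 ? results.Get(code) : pred(dict[code]);
    if (keep) ++passed;
  }
  return passed;
}
```

With a 40,000-entry dictionary, this sketch pays for 40,000 predicate evaluations up front regardless of how many rows reference each entry, which is the trade-off discussed above.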


-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <zamsden@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zamsden@cloudera.com>
Gerrit-HasComments: Yes
