impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter
Date Fri, 13 Oct 2017 20:34:05 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8170 )

Change subject: IMPALA-5789: Add always_false flag in bloom filter
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8170/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8170/4//COMMIT_MSG@15
PS4, Line 15: 
Did you do any perf runs? It would be good to verify that the extra flag checking doesn't
affect perf (I suspect it doesn't).

It would also be good to confirm that testdata/workloads/targeted-perf/queries/primitive_empty_build_join_1.test
gets faster (I think it should!).


http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-parquet-scanner.cc@403
PS4, Line 403: std::
Shouldn't need std:: - it's imported in common/names.h


http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-scanner.cc@107
PS4, Line 107:     if (!BaseSequenceScanner::FileFormatIsSequenceBased(
Might be more readable to factor subexpression into variable e.g. is_sequence_based.


http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py
File tests/custom_cluster/test_always_false_filter.py:

http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py@53
PS4, Line 53:   seq_table_suffixes = ['_avro', '_rc', '_seq']
We'd normally create a test matrix based on these. Is the idea here to avoid restarting the
cluster for each file format? If so it would be good to leave a comment so that readers understand
why.


http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py@82
PS4, Line 82:   def test_skip_file(self, cursor):
Does this need to be a custom cluster test? I.e. does it need a special minicluster to execute.
It's best to make things query tests if possible since starting a cluster is slow and the
tests aren't parallelisable.



-- 
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Fri, 13 Oct 2017 20:34:05 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message