impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5347: Parquet scanner microoptimizations
Date Tue, 23 May 2017 00:48:19 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5347: Parquet scanner microoptimizations
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6950/4/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 979: Status HdfsParquetScanner::ResetScratchBatch() {
> Why not move this into ScratchTupleBatch, i.e. pass in the template tuple t
ScratchTupleBatch would then have to call out to HdfsScanNode::InitTuple(). I can do a larger
restructure, e.g. moving InitTuple() into Tuple or similar, if you think that will make things
clearer. I think it's probably an improvement; I'm just checking that it makes sense to you
before doing it.


Line 983:   if (template_tuple_ == nullptr && tuple_byte_size_ <= CACHE_LINE_SIZE) {
> Not sure I completely understand the CACHE_LINE_SIZE check. We are zeroing 
Augmented the comment.

There's some cut-over point beyond which the old code is faster. E.g. if the tuple has 1000
slots, it's probably better to zero out the 125 bytes of null indicators row by row instead of
zeroing out all 1024 rows in full, each of which is multiple KB.

I think this optimisation doesn't matter too much for tuples with more than a handful of slots,
since the cost of materialization is high compared to the cost of zeroing things.


-- 
To view, visit http://gerrit.cloudera.org:8080/6950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49ec523a65542fdbabd53fbcc4a8901d769e5cd5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: anujphadke <aphadke@cloudera.com>
Gerrit-HasComments: Yes
