impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Joe McDonnell (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Date Thu, 28 Sep 2017 16:55:24 GMT
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/data-sink.h@126
PS3, Line 126:   /// This memtracker tracks all the allocations done by parquet dictonary
encoder.
             :   boost::scoped_ptr<MemTracker> dict_mem_tracker_;
Can we move this down to HdfsParquetTableWriter?


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/exec-node.h@325
PS3, Line 325:   /// Tracks all the memory allocation by parquet dictonary decoders
             :   boost::scoped_ptr<MemTracker> dict_mem_tracker_;
Can this move down to HdfsParquetScanner? Other ExecNodes don't need this.


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@113
PS3, Line 113:  
Remove trailing spaces. This also applies to lines that don't have any code (there are a couple
others in this file). These are easily visible in the Gerrit UI.

If using emacs, the following will highlight trailing whitespace:
(setq-default show-trailing-whitespace t)


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@164
PS3, Line 164: mtrackp
For class fields, the convention is to end the field name with an underscore. Also, look at
other code that uses MemTrackers and see what variable name that code uses. It is good to
harmonize these things across the codebase.


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412
PS3, Line 412: ConsumeBytes(sizeof(value));
The parquet DictionaryPageHeader contains a num_values field. Look where we call CreateDictionaryDecoder
and you'll see that we reference to it on the next line. I think we should consider doing
a single ConsumeBytes using num_values * size of type.



-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 3
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 28 Sep 2017 16:55:24 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message