impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bikramjeet Vig (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Date Wed, 04 Oct 2017 22:29:59 GMT
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
......................................................................


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h@485
PS6, Line 485:   /// Tracks all the memory allocation used by dictionary in DictDecoder.
             :   struct DictMemTrack {
             :     std::unique_ptr<MemTracker> mem_tracker;
             : 
             :     DictMemTrack(MemTracker *parent_memtrack):mem_tracker(new MemTracker(-1,
             :         "Decoder parquet dict", parent_memtrack, false)) { };
             : 
             :     MemTracker*  GetMemTracker() {
             :       return  mem_tracker.get();
             :     }
             : 
             :     ~DictMemTrack() {
             :       mem_tracker->CloseAndUnregisterFromParent();
             :     }
             :   };
I dont think we need to add a new struct for this, you can just call CloseAndUnregisterFromParent()
on the dict_mem_tracker in HdfsParquetScanner::Close

Similarly for the writer


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@113
PS6, Line 113:  // Bytes used for memory tracking by dictionary in DictEncoder is decremented.
             :   virtual void Cleanup() {
             :     if (dict_encoder_base_ != nullptr) {
             :       dict_encoder_base_->Cleanup();
             :     }
             :   }
you can move this to the BaseColumnWriter::Close() method


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@767
PS6, Line 767:   // Below code will decrement the bytes used for memory tracking by dictionary
in
             :   // DictEncoder class for each ColumnWriter.
             :   for (int i = 0; i < columns_.size(); i++) {
             :      if (columns_[i] != nullptr) {
             :        columns_[i]->Cleanup();
             :      }
             :   }
similarly you can move this to the HdfsParquetTableWriter::Close() method


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@138
PS6, Line 138: SizeofDict
I think we can use a more precise name here, SizeOfDict might be confused with returning the
number of elements in the Dictionary.
Something on the lines of DictByteSize?


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@438
PS6, Line 438: bytes_alloc += sizeof(value);
can you switch to what joe suggested earlier, that is, instead of bytes_alloc, using something
like the function SizeofDict() that you wrote to update ConsumeBytes at the end.
Refer Joe's first comment.



-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 6
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Wed, 04 Oct 2017 22:29:59 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message