impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pranay Singh (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Date Thu, 05 Oct 2017 01:58:59 GMT
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
......................................................................


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h@485
PS6, Line 485:   /// Tracks all the memory allocation used by dictionary in DictDecoder.
             :   struct DictMemTrack {
             :     std::unique_ptr<MemTracker> mem_tracker;
             : 
             :     DictMemTrack(MemTracker *parent_memtrack):mem_tracker(new MemTracker(-1,
             :         "Decoder parquet dict", parent_memtrack, false)) { };
             : 
             :     MemTracker*  GetMemTracker() {
             :       return  mem_tracker.get();
             :     }
             : 
             :     ~DictMemTrack() {
             :       mem_tracker->CloseAndUnregisterFromParent();
             :     }
             :   };
> I dont think we need to add a new struct for this, you can just call CloseA
I had added a new structure to have the creation /destruction of memtracker in one place

and also the CloseAndUnregisterFromParent() has to be called after
releasing all bytes owned by MemTracker in the DictDecoder class (which happens in it's destructor),
since the destructor of member variables are called later than the owner making a separate
struct/class for DictMemTrack  solved the problem.


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@113
PS6, Line 113:  // Bytes used for memory tracking by dictionary in DictEncoder is decremented.
             :   virtual void Cleanup() {
             :     if (dict_encoder_base_ != nullptr) {
             :       dict_encoder_base_->Cleanup();
             :     }
             :   }
> you can move this to the BaseColumnWriter::Close() method
This can't be done because I'm not tracking the bytes at the BaseColumnWriter


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@767
PS6, Line 767:   // Below code will decrement the bytes used for memory tracking by dictionary
in
             :   // DictEncoder class for each ColumnWriter.
             :   for (int i = 0; i < columns_.size(); i++) {
             :      if (columns_[i] != nullptr) {
             :        columns_[i]->Cleanup();
             :      }
             :   }
> similarly you can move this to the HdfsParquetTableWriter::Close() method
I'll try moving the functionality in Close() for HdfsParquetTableWriter/Reader


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@138
PS6, Line 138: SizeofDict
> I think we can use a more precise name here, SizeOfDict might be confused w
I've changed it to be DictByteSize


http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@438
PS6, Line 438: bytes_alloc += sizeof(value);
> can you switch to what joe suggested earlier, that is, instead of bytes_all
This looks fine what is the issue with this?



-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 6
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 05 Oct 2017 01:58:59 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message