impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Date Fri, 03 Nov 2017 23:07:31 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
......................................................................


Patch Set 15:

(8 comments)

Pretty close, have some formatting nits (nothing horrible, just trying to keep the codebase
consistent) and one perf concern.

http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@54
PS15, Line 54:     : dict_encoded_size_(0), pool_(NULL), dict_bytes_cnt_(0), dict_mem_tracker_(NULL)
{}
Ok to leave for now, but we've generally been moving towards using the recent C++ extension
that allows initialising member variables to constants at their declaration.
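
For illustration, a minimal sketch of what that could look like using C++11 default member
initializers. The class name DictEncoderSketch and the member types are assumptions made for
this example (only the member names come from the quoted line), and MemPool and MemTracker
are forward-declared stand-ins rather than the real Impala classes.

```cpp
// Forward-declared stand-ins; the real classes live in the Impala backend.
class MemPool;
class MemTracker;

// Hypothetical class that only mirrors the four members from the quoted constructor.
class DictEncoderSketch {
 public:
  // No initializer list is needed: each member is initialized at its declaration.
  DictEncoderSketch() = default;

  int dict_encoded_size() const { return dict_encoded_size_; }
  int dict_bytes_cnt() const { return dict_bytes_cnt_; }
  MemPool* pool() const { return pool_; }
  MemTracker* dict_mem_tracker() const { return dict_mem_tracker_; }

 private:
  // Member types are assumed for the sketch.
  int dict_encoded_size_ = 0;
  MemPool* pool_ = nullptr;
  int dict_bytes_cnt_ = 0;
  MemTracker* dict_mem_tracker_ = nullptr;
};
```

With this form, any constructor added later picks up the same defaults without repeating them.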


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@59
PS15, Line 59: DCHECK
DCHECK_EQ(dict_bytes_cnt_, 0);


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@77
PS15, Line 77:   void ClearIndices() {
We generally use the more concise one-line formatting for very short single-statement functions
like this one.
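
A small sketch of the one-line form, for reference. The body of ClearIndices() is not shown
in the quoted line, so the buffered_indices_ vector and the surrounding IndexBufferSketch
class are assumptions made only to keep the example self-contained.

```cpp
#include <vector>

// Hypothetical holder class; only the ClearIndices() name comes from the review.
class IndexBufferSketch {
 public:
  void Add(int index) { buffered_indices_.push_back(index); }

  // Very short single-statement function written on one line.
  void ClearIndices() { buffered_indices_.clear(); }

  int num_indices() const { return static_cast<int>(buffered_indices_.size()); }

 private:
  std::vector<int> buffered_indices_;  // assumed member, for illustration
};
```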


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@123
PS15, Line 123:  
formatting nit: * goes on the left with the type name.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@168
PS15, Line 168: inline
nit: inline isn't necessary. It isn't harmful but can be confusing to have unnecessary modifiers.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@269
PS15, Line 269:  *
nit: * should be on left.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@330
PS15, Line 330: inline
nit: unnecessary inline


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@374
PS15, Line 374:   ConsumeBytes(sizeof(node));
I'm concerned that calling MemTracker::Consume for every node added to the table could hurt
performance in some workloads since it will end up incrementing the memory consumption counter
in the root process-wide MemTracker up to 40k times per dictionary.

(In contrast, MemPool::Allocate() is generally fine since it allocates memory in chunks).

How about we track the memory for some number of nodes at a time? E.g.

  const int NODE_MEM_TRACKING_GRANULARITY = 4096;
  ...
  if (nodes % NODE_MEM_TRACKING_GRANULARITY == 0) {
    ConsumeBytes(sizeof(node) * NODE_MEM_TRACKING_GRANULARITY);
  }
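
A slightly fuller, self-contained sketch of the batching idea above. Everything here is
hypothetical: FakeMemTracker is a stand-in that only counts calls and bytes (it is not
Impala's MemTracker), and BatchedNodeAccounting, AddNode() and ReleaseAll() are names
invented for the example. It reserves memory for a whole batch of nodes when the first node
of each batch is added, so the tracker is touched once per 4096 nodes instead of once per
node.

```cpp
#include <cstdint>

// Stand-in tracker that records total bytes and how often it was called.
class FakeMemTracker {
 public:
  void Consume(int64_t bytes) { consumption_ += bytes; ++num_calls_; }
  void Release(int64_t bytes) { consumption_ -= bytes; ++num_calls_; }
  int64_t consumption() const { return consumption_; }
  int num_calls() const { return num_calls_; }

 private:
  int64_t consumption_ = 0;
  int num_calls_ = 0;
};

// Hypothetical helper that accounts for node memory one batch at a time.
class BatchedNodeAccounting {
 public:
  static constexpr int kNodeMemTrackingGranularity = 4096;

  BatchedNodeAccounting(FakeMemTracker* tracker, int64_t node_size)
    : tracker_(tracker), node_size_(node_size) {}

  // Call once per node added. Only the first node of each batch hits the tracker.
  void AddNode() {
    if (num_nodes_ % kNodeMemTrackingGranularity == 0) {
      const int64_t batch_bytes = node_size_ * kNodeMemTrackingGranularity;
      tracker_->Consume(batch_bytes);
      tracked_bytes_ += batch_bytes;
    }
    ++num_nodes_;
  }

  // Gives back everything that was reserved, e.g. when the dictionary is closed.
  void ReleaseAll() {
    tracker_->Release(tracked_bytes_);
    tracked_bytes_ = 0;
    num_nodes_ = 0;
  }

  int64_t tracked_bytes() const { return tracked_bytes_; }

 private:
  FakeMemTracker* tracker_;
  int64_t node_size_;
  int64_t num_nodes_ = 0;
  int64_t tracked_bytes_ = 0;
};
```

The trade-off in this sketch is that tracked memory can exceed actual node memory by up to
one batch, in exchange for far fewer updates to the shared counter.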



-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 15
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:07:31 +0000
Gerrit-HasComments: Yes
