impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Henry Robinson (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5773: Correctly account for memory used in data stream receiver queue
Date Sat, 12 Aug 2017 01:49:51 GMT
Henry Robinson has submitted this change and it was merged.

Change subject: IMPALA-5773: Correctly account for memory used in data stream receiver queue
......................................................................


IMPALA-5773: Correctly account for memory used in data stream receiver queue

DataStreamRecvrs keep one or more queues of batches received to provide
some buffering. Each queue has a fixed byte size capacity. The estimate
of the contribution of a new RowBatch to that queue was using the
compressed size of the TRowBatch it would be deserialized from, which is
the wrong value (since the batch is uncompressed after deserialization).

* Add RowBatch::Get[Des|S]erializedSize(const TRowBatch&) to RowBatch
* Fix the estimate to use the uncompressed size.
* Add a DataStreamReceiver child profile to the exchg node so that the
  peak memory used by the receiver can be monitored easily.

Confirmed that the following query:

select count(distinct concat(cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120))) from lineitem;

succeeds with a mem-limit of 800Mb. Before this patch it would fail in a
one-node cluster as the datastream recvr would buffer more batches than
the memory limit would allow.

Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
Reviewed-on: http://gerrit.cloudera.org:8080/7646
Tested-by: Impala Public Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
12 files changed, 54 insertions(+), 48 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Henry Robinson: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message