impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Toolchain-CR] IMPALA-3582: TBinaryProtocol buffers consume unbounded memory
Date Tue, 24 May 2016 23:59:36 GMT
Tim Armstrong has uploaded a new patch set (#5).

Change subject: IMPALA-3582: TBinaryProtocol buffers consume unbounded memory
......................................................................

IMPALA-3582: TBinaryProtocol buffers consume unbounded memory

This patch removes the builtin string buffer in TBinaryProtocol. It was
used to buffer large strings being read via thrift. The buffer is never
shrunk, so each thrift connection ends up consuming memory equal to the
largest string that was received by the connection. This can result in a
lot of memory, given a large number of connections and large strings.a

This patch is equivalent to THRIFT-2021, but also deletes the unused
member variables.

The string buffer is actually unnecessary, since the buffer contents are
always copied into the string, so the buffer simply adds an extra copy.
This may not be a big win, since string::resize() implicitly does a
memset().

The fix is to resize the string then write directly to the string's memory.
The C++11 standard guarantees that a std::string's memory is contiguous.

Testing:
Did a local build and ran a local stress test with heap profiling.
Inspected heap profile to make sure that we did not have large
allocations from TBinaryProtocol::readStringBody() sitting
around when Impala is idle.

Change-Id: I58bdc899e604bde2f5b2d5360f79537ce25715ee
---
M buildall.sh
A source/thrift/thrift-0.9.0-patches/0008-IMPALA-3582-string-buffer-memory.patch
2 files changed, 70 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/01/3201/5
-- 
To view, visit http://gerrit.cloudera.org:8080/3201
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58bdc899e604bde2f5b2d5360f79537ce25715ee
Gerrit-PatchSet: 5
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hxu@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message