impala-reviews mailing list archives

From "Joe McDonnell (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] Fix parquet table writer dictionary leak
Date Tue, 28 Feb 2017 21:53:41 GMT
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6181

Change subject: Fix parquet table writer dictionary leak
......................................................................

Fix parquet table writer dictionary leak

Currently, HdfsTableSink adds each OutputPartition to the RuntimeState object
pool, which is freed only at the end of the query. However, for clustered
inserts into a partitioned table, the OutputPartitions are used one at a time,
so each one can be freed as soon as writing to its partition is done.

The HdfsParquetTableWriter's ColumnWriters are also added to this object pool.
They account for a significant amount of memory because they contain the
dictionaries for Parquet encoding.

This change makes HdfsParquetTableWriter's ColumnWriters use unique_ptrs so
that they are cleaned up when the HdfsParquetTableWriter is deleted. It also
explicitly cleans up the OutputPartition rather than leaving it to the object
pool.
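
The ownership difference described above can be illustrated with a minimal,
standalone sketch. It is not the Impala code: ColumnWriterSketch, PoolSketch,
PooledWriterSketch, OwningWriterSketch, and both helper functions are
hypothetical names invented for illustration, and the "dictionary" is just a
vector of strings. It only assumes that the pool frees its objects when the
pool itself is destroyed, as the description of the RuntimeState object pool
suggests.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column writer that holds an encoding dictionary.
struct ColumnWriterSketch {
  static int live_count;  // number of instances currently alive
  std::vector<std::string> dictionary;
  ColumnWriterSketch() { ++live_count; }
  ~ColumnWriterSketch() { --live_count; }
  ColumnWriterSketch(const ColumnWriterSketch&) = delete;
  ColumnWriterSketch& operator=(const ColumnWriterSketch&) = delete;
};
int ColumnWriterSketch::live_count = 0;

// Hypothetical query-lifetime pool: objects live until the pool is destroyed.
class PoolSketch {
 public:
  ColumnWriterSketch* Add(std::unique_ptr<ColumnWriterSketch> w) {
    owned_.push_back(std::move(w));
    return owned_.back().get();
  }
 private:
  std::vector<std::unique_ptr<ColumnWriterSketch>> owned_;
};

// Old pattern (sketch): the writer borrows column writers owned by the pool.
class PooledWriterSketch {
 public:
  PooledWriterSketch(PoolSketch* pool, int num_columns) {
    for (int i = 0; i < num_columns; ++i) {
      columns_.push_back(pool->Add(
          std::unique_ptr<ColumnWriterSketch>(new ColumnWriterSketch())));
    }
  }
 private:
  std::vector<ColumnWriterSketch*> columns_;  // not owned
};

// New pattern (sketch): the writer owns its column writers via unique_ptr,
// so they are destroyed together with the writer.
class OwningWriterSketch {
 public:
  explicit OwningWriterSketch(int num_columns) {
    for (int i = 0; i < num_columns; ++i) {
      columns_.push_back(
          std::unique_ptr<ColumnWriterSketch>(new ColumnWriterSketch()));
    }
  }
 private:
  std::vector<std::unique_ptr<ColumnWriterSketch>> columns_;
};

// Writes partitions one at a time and returns how many column writers are
// still alive after the last partition's writer has gone out of scope.
int LiveWritersAfterPooled(int num_partitions, int num_columns) {
  assert(num_partitions >= 0 && num_columns >= 0);
  PoolSketch pool;  // stands in for the query-lifetime pool
  for (int p = 0; p < num_partitions; ++p) {
    PooledWriterSketch writer(&pool, num_columns);
  }
  return ColumnWriterSketch::live_count;  // pool has not been destroyed yet
}

int LiveWritersAfterOwning(int num_partitions, int num_columns) {
  assert(num_partitions >= 0 && num_columns >= 0);
  for (int p = 0; p < num_partitions; ++p) {
    OwningWriterSketch writer(num_columns);  // freed at end of each iteration
  }
  return ColumnWriterSketch::live_count;
}
```

With 4 partitions and 3 columns, LiveWritersAfterPooled(4, 3) reports 12
column writers still alive after the last partition, while
LiveWritersAfterOwning(4, 3) reports 0, since each writer releases its
column writers when it is destroyed.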

Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
3 files changed, 14 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/6181/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonnell@cloudera.com>
