impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Impala Public Jenkins (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4899: Fix parquet table writer dictionary leak
Date Fri, 03 Mar 2017 01:20:05 GMT
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4899: Fix parquet table writer dictionary leak
......................................................................


IMPALA-4899: Fix parquet table writer dictionary leak

Currently, in HdfsTableSink, OutputPartitions are added to the RuntimeState
object pool to be freed at the end of the query. However, for clustered inserts
into a partitioned table, the OutputPartitions are only used one at a time.
They can be immediately freed once done writing to that partition.

In addition, the HdfsParquetTableWriter's ColumnWriters are also added to
this object pool. These constitute a significant amount of memory, as they
contain the dictionaries for Parquet encoding.

This change makes HdfsParquetTableWriter's ColumnWriters use unique_ptrs so
that they are cleaned up when the HdfsParquetTableWriter is deleted. It also
uses a unique_ptr on the PartitionPair for the OutputPartition.

The table writers maintain a pointer to the OutputPartition. This remains a
raw pointer. This is safe, because OutputPartition has a scoped_ptr to the
table writer. The table writer will never outlive the OutputPartition.

Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f
Reviewed-on: http://gerrit.cloudera.org:8080/6181
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.h
5 files changed, 26 insertions(+), 14 deletions(-)

Approvals:
  Marcel Kornacker: Looks good to me, approved
  Impala Public Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message