impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Skye Wanderman-Milne (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3311: fix string data coming out of aggs in subplans
Date Thu, 12 May 2016 20:55:41 GMT
Skye Wanderman-Milne has uploaded a new patch set (#5).

Change subject: IMPALA-3311: fix string data coming out of aggs in subplans
......................................................................

IMPALA-3311: fix string data coming out of aggs in subplans

The problem: varlen data (e.g. strings) produced by aggregations is
freed by FreeLocalAllocations() after passing up the output
batch. This works for streaming operators or blocking operators that
copy their input, but results in memory corruption when the output
reaches non-copying blocking operators, e.g. SubplanNode and
NestedLoopJoinNode.

The fix: this patch makes the PartitionedAggregationNode copy out
produced string data if the node is in a subplan. Otherwise it calls
MarkNeedsToReturn() on the output batch. Marking the batch would work
in the subplan case as well, but would likely be less efficient since
it would result in many small batches coming out of the subplan.

The patch includes a test case. However, this test only exposes the
problem with an ASAN build and the --disable_mem_pools flag, which we
don't currently have automated testing for.

Change-Id: Iada891504c261ba54f4eb8c9d7e4e5223668d7b9
---
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exprs/agg-fn-evaluator.h
M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
A testdata/workloads/perf-regression/queries/subplan_aggregation.test
5 files changed, 89 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/29/2929/5
-- 
To view, visit http://gerrit.cloudera.org:8080/2929
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iada891504c261ba54f4eb8c9d7e4e5223668d7b9
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <skye@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Skye Wanderman-Milne <skye@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message