impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator
Date Sat, 27 Aug 2016 00:48:07 GMT
Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4143

Change subject: IMPALA-3610: Account for memory used by filters in the coordinator
......................................................................

IMPALA-3610: Account for memory used by filters in the coordinator

Before this patch, Impala would not account for the memory used to
aggregate runtime filters together in the coordinator. Impala's memory
could therefore be silently overcommitted.

This patch accounts for aggregated filter memory in a new filter
memtracker that is attached to the coordinator's query_mem_tracker(). If
the query memory limit is exceeded when a filter update arrives, that
update is discarded. If the filter is from a partitioned join, the
entire filter can therefore be discarded immediately (to alleviate
memory pressure) and a dummy 'always true' filter is sent to backends to
unblock them.

If the filter is from a broadcast join, no aggregation is done, so there
is no tracking. The Thrift input and output filter data structures are
not tracked (as we generally don't track RPC objects, but plan to in the
future). The filter payload is moved from the input request structure to
the output broadcast structure without copying.

Memory that is added to a memtracker must always be released. To do
this, we need to signal to the coordinator that it is finished, and that
there is no point trying to process any future updates that might arrive
concurrently. This patch adds Coordinator::Done() which is called from
QueryExecState::Done(), and which releases memory from all in-process
runtime filters.

Finally, this patch increases the upper limit for runtime filters to
512MB. This allows testing on very large datasets. The default maximum
is still 16MB, per RUNTIME_FILTER_MAX_SIZE.

Testing: Added a new test that triggers the OOM condition on the
coordinator. All existing runtime filter tests pass.

Change-Id: I3ff5f2fcc55fac763cdb6c802acd9d02d3ee9877
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test
10 files changed, 206 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/4143/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3ff5f2fcc55fac763cdb6c802acd9d02d3ee9877
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>

Mime
View raw message