impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool
Date Tue, 04 Apr 2017 21:05:06 GMT
Tim Armstrong has uploaded a new patch set (#16).

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool

IMPALA-3203: Part 2: per-core free lists in buffer pool

Add per-core lists of clean pages and free pages to enable allocation
of buffers without contention on shared locks in the common case.
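
As a rough illustration of the idea, not the code in this patch, a per-core free list can be sketched as one lock-protected list per core, where an allocating thread tries its own core's list before looking at the others. The names PerCoreFreeLists, AddFreeBuffer, PopFreeBuffer and the Buffer struct are hypothetical, and the caller is assumed to pass in its current core index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a buffer handle; not Impala's BufferHandle.
struct Buffer {
  uint8_t* data;
  size_t len;
};

// One free list per core, each guarded by its own lock, so threads running
// on different cores rarely touch the same lock.
class PerCoreFreeLists {
 public:
  explicit PerCoreFreeLists(int num_cores)
    : lists_(static_cast<size_t>(num_cores)) {
    assert(num_cores > 0);
  }

  // Return a free buffer to the list that belongs to 'core'.
  void AddFreeBuffer(int core, Buffer buf) {
    PerCoreList& list = lists_[static_cast<size_t>(core)];
    std::lock_guard<std::mutex> lock(list.mu);
    list.buffers.push_back(buf);
  }

  // Try the caller's own core first, then scan the remaining cores.
  // Returns false if no core has a free buffer.
  bool PopFreeBuffer(int core, Buffer* out) {
    const size_t n = lists_.size();
    for (size_t i = 0; i < n; ++i) {
      PerCoreList& list = lists_[(static_cast<size_t>(core) + i) % n];
      std::lock_guard<std::mutex> lock(list.mu);
      if (list.buffers.empty()) continue;
      *out = list.buffers.back();
      list.buffers.pop_back();
      return true;
    }
    return false;
  }

 private:
  struct PerCoreList {
    std::mutex mu;
    std::vector<Buffer> buffers;
  };
  std::vector<PerCoreList> lists_;
};
```

In the common case only the local list's lock is taken. The scan over other cores is the slow path and is where cross-core contention can still occur.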

This is implemented with an additional layer of abstraction in
"BufferAllocator", which tracks all memory (free buffers and clean
pages) that is not in use but has not been released to the OS.
The old BufferAllocator is renamed to SystemAllocator.

See "Spilled Page Mgmt" and "MMap Allocator & Scalable Free Lists" in for a high-level summary of how this fits into
the buffer pool design.

The core of the new code is BufferAllocator::AllocateInternal(),
which progresses through several strategies for allocating memory.
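
The general shape of "progress through several strategies" can be sketched as an ordered list of fallbacks that is walked until one succeeds. This is only an illustration: AllocStrategy, AllocResult and AllocateWithFallback are hypothetical names, and the actual strategies and their order in AllocateInternal() are not spelled out in this message.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A strategy tries to produce 'len' bytes and returns nullptr on failure.
using AllocStrategy = std::function<uint8_t*(int64_t len)>;

// Hypothetical result type: the memory plus the name of the strategy
// that produced it. 'strategy' is empty if every strategy failed.
struct AllocResult {
  uint8_t* data = nullptr;
  std::string strategy;
};

// Try each named strategy in order and stop at the first one that succeeds.
AllocResult AllocateWithFallback(
    const std::vector<std::pair<std::string, AllocStrategy>>& strategies,
    int64_t len) {
  assert(len > 0);
  AllocResult result;
  for (const auto& s : strategies) {
    uint8_t* data = s.second(len);
    if (data != nullptr) {
      result.data = data;
      result.strategy = s.first;
      return result;
    }
  }
  return result;
}
```

Ordering the list from cheapest to most expensive keeps the common case fast, and later entries only run when the earlier ones come up empty.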

Misc changes:
* Enforce an upper limit on buffer size to reduce the number of free lists.
* Add additional allocation counters.
* Slightly reorganise the MemTracker GC functions to use lambdas and
  clarify the order in which they should be called. Also add a target
  memory value so that they don't need to free *all* of the memory in
  the system.
* Fix an accounting bug in the buffer pool where it didn't
  evict dirty pages before reclaiming a clean page.
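
To make the GC change above concrete, here is a minimal sketch of lambda-based GC functions that run in registration order and stop once a target is reached. SimpleTracker and its methods are hypothetical and are not the MemTracker interface; the only idea taken from this message is that each GC function is told how much to free rather than being asked to free everything.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical tracker used only for illustration.
class SimpleTracker {
 public:
  // A GC function is asked to free up to 'bytes_to_free' bytes if it can.
  using GcFunction = std::function<void(int64_t bytes_to_free)>;

  explicit SimpleTracker(int64_t consumption) : consumption_(consumption) {}

  // GC functions are called in the order in which they were added.
  void AddGcFunction(GcFunction f) { gc_functions_.push_back(std::move(f)); }

  // Called by GC functions to report memory they gave back.
  void Release(int64_t bytes) {
    assert(bytes >= 0 && bytes <= consumption_);
    consumption_ -= bytes;
  }

  int64_t consumption() const { return consumption_; }

  // Run GC functions until consumption is at or below 'max_consumption'.
  // Returns true if the target was reached.
  bool GcMemory(int64_t max_consumption) {
    for (const GcFunction& f : gc_functions_) {
      if (consumption_ <= max_consumption) break;
      f(consumption_ - max_consumption);
    }
    return consumption_ <= max_consumption;
  }

 private:
  int64_t consumption_;
  std::vector<GcFunction> gc_functions_;
};
```

A caller would register the cheapest reclamation first, for example a lambda that drops cached buffers, so that more disruptive GC functions run only if the target has not yet been met.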

We will need to validate the performance of the system under high query
concurrency before this is used as part of query execution. The benchmark
in Part 1 provided some evidence that this approach of a list per core
should scale well to many cores.

Added buffer-allocator-test to test the free list resizing algorithm.

Added a test to buffer-pool-test to exercise the various new memory
reclamation code paths that are now possible. Also run buffer-pool-test
under two different faked-out NUMA setups - one with no NUMA and another
with three NUMA nodes.

buffer-pool-test, suballocator-test, and buffered-tuple-stream-v2-test
provide some further basic coverage. Future system and unit tests will
validate this further before it is used for query execution (see

Ran an initial version of IMPALA-4114, the ported BufferedBlockMgr
tests, against this. The randomised stress test revealed some accounting
bugs, which are now fixed. I'll post those tests as a follow-on patch.

Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
M be/src/benchmarks/
M be/src/common/
M be/src/runtime/
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/
M be/src/runtime/bufferpool/
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool-counters.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/
M be/src/runtime/bufferpool/
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/
M be/src/runtime/bufferpool/free-list.h
M be/src/runtime/bufferpool/
M be/src/runtime/bufferpool/suballocator.h
A be/src/runtime/bufferpool/
A be/src/runtime/bufferpool/system-allocator.h
M be/src/runtime/
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/
M be/src/runtime/
M be/src/runtime/mem-tracker.h
M be/src/runtime/
M be/src/runtime/
M be/src/runtime/tmp-file-mgr.h
A be/src/testutil/cpu-util.h
A be/src/testutil/rand-util.h
M be/src/util/
M be/src/util/cpu-info.h
30 files changed, 1,557 insertions(+), 364 deletions(-)

  git pull ssh:// refs/changes/14/6414/16

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Taras Bobrovytsky <>
Gerrit-Reviewer: Tim Armstrong <>
