impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3203: Part 1: Free list implementation
Date Mon, 20 Mar 2017 22:19:52 GMT
Hello Impala Public Jenkins, Jim Apple, Dan Hecht,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6410

to look at the new patch set (#10).

Change subject: IMPALA-3203: Part 1: Free list implementation
......................................................................

IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

The benchmark also showed that non-inlined constructors and move
operators of BufferHandle were taking significant CPU cycles, so
I inlined those.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/free-lists-benchmark.cc
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/runtime/bufferpool/free-list-test.cc
A be/src/runtime/bufferpool/free-list.h
M be/src/util/benchmark-test.cc
M be/src/util/benchmark.cc
M be/src/util/benchmark.h
12 files changed, 813 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/6410/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6410
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message