impala-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IMPALA-3203) Implement scalable buffer recycling in buffer pool
Date Tue, 21 Mar 2017 14:56:42 GMT

    [ https://issues.apache.org/jira/browse/IMPALA-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934737#comment-15934737
] 

Tim Armstrong commented on IMPALA-3203:
---------------------------------------




IMPALA-3203: Part 1: Free list implementation

We will have a single free list per size class. Free buffers
are stored in a heap, ordered by the memory address of the
buffer, to help reduce address space fragmentation.

A follow-up patch will use the free lists in BufferPool.

Currently TCMalloc has thread-local caches with somewhat similar
purpose. However, these are not suitable for our purposes for
several reasons:
* They are designed for caching small allocations - large allocations
  like most buffers are served from a global page heap protected
  by a global lock.
* We intend to move away from using TCMalloc for buffers: IMPALA-5073
* Thread caches are ineffective for the producer-consumer pattern where
  one thread allocates memory and another thread frees it.
* TCMalloc gives us limited control over how and when memory is
  actually released to the OS.

Testing:
Added unit tests for sanity checks and verification of behaviour
that is trickier to check in integration or system tests.
The cost will be exercised more thoroughly via BufferPool
in Part 2.

Performance:
Includes a benchmark that demonstrates the scalability of
the free lists under concurrency. When measuring pure throughput
of free list operations, having a free list per core is
significantly faster than a shared free list, or allocating
directly from TCMalloc. On 8 cores, if the memory allocated is
actually touched then for 64KB+ buffers, memory access is the
bottleneck rather than lock contention.

The benchmark also showed that non-inlined constructors and move
operators of BufferHandle were taking significant CPU cycles, so
I inlined those.

This suggests that having a free list per core is more than sufficient
(however, we will need to validate this with system concurrency
benchmarks once we switch to using this during query execution).

Change-Id: Ia89acfa4efdecb96d3678443b4748932b4133b9b
Reviewed-on: http://gerrit.cloudera.org:8080/6410
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins

> Implement scalable buffer recycling in buffer pool
> --------------------------------------------------
>
>                 Key: IMPALA-3203
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3203
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>    Affects Versions: Impala 2.6.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Minor
>              Labels: resource-management
>
> In order to scale to many cores, we need to have scalable free lists and clean page lists
in BufferPool so that throughput can continue to scale with the number of cores.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message