impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Date Sat, 03 Dec 2016 00:23:47 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
......................................................................


Patch Set 9:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4715/4//COMMIT_MSG
Commit Message:

Line 14: larger chunks. This helps avoid external fragmentation and is quite
I really really really really don't want "implement a general-purpose memory allocator" to
block the rest of IMPALA-3200 - i.e. switching execution over to the new buffer pool. 

The current solution for this - call BufferedBlockMgr::ConsumeMemory() and allocate memory
from tcmalloc on the side is pretty bad and doesn't carry over to a new buffer pool.

There's the coding effort, but I'd have less confidence in getting the edge cases in a more
complex allocator right and testing the whole state space. This is not a good thing if we're
switching it on at the same time as the rest of the buffer pool code.


http://gerrit.cloudera.org:8080/#/c/4715/5/be/src/bufferpool/suballocator-test.cc
File be/src/bufferpool/suballocator-test.cc:

PS5, Line 290: . T
> That could be a bug in libstdc++. I don't think we can use libc++ yet becau
It looks like the names of the parameters are just misleading. I looked at the C++14 standard:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf

It provides a formula for the distribution that shows that the two parameters are actually
the mu/sigma parameters https://en.wikipedia.org/wiki/Log-normal_distribution, not the mean/standard
deviation of the distribution itself.


http://gerrit.cloudera.org:8080/#/c/4715/8/be/src/bufferpool/suballocator-test.cc
File be/src/bufferpool/suballocator-test.cc:

Line 107:   static void ExpectReservationUnused(ReservationTracker& reservation) {
There was a minor fix here (variable name) that I forgot to amend before pushing.


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

Line 171: /// An allocation made by a Suballocator. Each allocation returned by Suballocator
must
> While it may be a happy case, I think it is the far more common case of uni
But why shouldn't we use it for a different case where it's the right tool?

$ git grep 'RuntimeState\*.*state_;'
Reveals a bunch of places where we store a raw pointer to the RuntimeState that's owned by
PlanFragmentExecutor. Some of these likely form a cycle since many things are reachable from
RuntimeState.

It's possible to use raw pointers for prev/next (I just tried doing it) but it's weird to
pop the pointers out of unique_ptrs to put them in the lists, and we lose a bit of automatic
documentation/checking about the ownership. I posted the diff here: https://gerrit.cloudera.org/#/c/5336/1/be/src/bufferpool/suballocator.cc


http://gerrit.cloudera.org:8080/#/c/4715/8/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

PS8, Line 86: otherwise destructing
            :   /// the returned 'result' will DCHECK
> Does that mean I don't have to worry about calling Free if we compile only 
The "must" is intended to convey that it's a system invariant that allocated memory must be
freed. Thus we should have a DCHECK there to enforce the invariant. I don't really know what
happens in a release build since it violates additional invariants about freeing buffers -
probably it ends up just being a memory leak reflected in the internal accounting.

The motivation is just in general that it's good to have clear system invariants. It does
help detect memory leaks at the point where they happen, but also simplifies Suballocator
since we don't need to write code to clean up after badly-behaved callers.

BufferedBlockMgr used to follow the alternative design for its Blocks - you didn't need to
delete them - but it was the source of a number of hard-to-find memory leaks and other memory
lifetime issues.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message