impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3203: Part 2: per-core free lists in buffer pool
Date Thu, 30 Mar 2017 16:57:26 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool
......................................................................


Patch Set 15:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/6414/14/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

Line 347:     lock_guard<SpinLock> pl(page->buffer_lock);
> 'client_lock' would guard against the page being moved to a different arena
Yup, realized that after writing this comment. oops.


http://gerrit.cloudera.org:8080/#/c/6414/15/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

PS15, Line 59: .
should be more explicit that it might return less memory than 'target_bytes'. Maybe say "Free
as much memory as possible up to 'target_bytes' from this area back to the system allocator.


PS15, Line 64: returns the total amount freed 
hmm, it's kinda confusing that the return value has this subtle difference in meaning depending
on claim_memory. 

It seems the most intuitive return value would be "Returns the number of bytes of system memory
claimed for the caller. Memory returned to the system in excess of that amount is added into
'system_bytes_remaining_'.

Can the claim_memory=false path accommodate something like that? If we can't reconcile it
to have a single return value, then it seems we really have two different abstractions bunched
into one call.


PS15, Line 70: FreeMemory
do you think it makes sense to call this FreeSystemMemory() to make it clearer that this is
about pushing memory back to the system?


PS15, Line 254: ffinally
typo


Line 261:       if (delta == len) break;
since we can't get rid of the other DecreaseSystemBytesRemaining() probably better to move
this back to where it was.


PS15, Line 263: lock_arenas
thought this would become 'final_attempt'


Line 304:   vector<std::unique_lock<SpinLock>> arena_locks;
let's add a comment explaining all of this. Something like:

There are two strategies for scavenging buffers:
1) Fast, opportunistic: Each arena is searched in succession. Although reservations guarantee
that the memory we need is available somewhere, this can fail because we can race with another
thread that returned buffers to an arena that we've already searched and took the buffers
from an arena we haven't yet searched.

2) Slow, guaranteed to succeed: In order to ensure that we can find the memory in a single
pass, we hold on to the locks for arenas we've already examined. That way, if another thread
can't take the memory that we need from an arena that we haven't yet examined (or from 'system_bytes_available_')
because in order to do so, it would have had to return the equivalent amount of memory to
an earlier arena or added it back into 'systems_bytes_reamining_'. The former can't happen
since we're still holding those locks, and the latter is solved by trying DecreaseSystemBytesRemaining()
at the end.


PS15, Line 317:   // It's possible that 'system_bytes_remaining_' was incremented since we
last checked
              :   // it.
maybe this sentence would now be explained in the proposed comment above.


PS15, Line 319: another thread between when that thread released memory to the system and
              :   // incremented 'system_bytes_remaining_'.
I think it'd be good to explicitly say:
... since that happens while holding one of the arena locks.
or similar. though the exception is when AllocateInternal() we add it back in without holding
a lock, which isn't a correctness problem since that memory couldn't be needed by another
client, but makes the statement a little confusing.


PS15, Line 322: false
why false instead of true? I guess it doesn't matter since it should succeed to get all of
it, but 'true' seems more intuitive to me.


http://gerrit.cloudera.org:8080/#/c/6414/15/be/src/runtime/bufferpool/buffer-allocator.h
File be/src/runtime/bufferpool/buffer-allocator.h:

Line 47: /// required memory from the OS).
I think explaining a couple of more things would help:

1) The fact that BufferAllocator relies on the reservations (which are managed by the BufferPool
& client) and the BufferPool's eviction policy, for correctness, to ensure that a client
that has unused reservation can call Allocate() and the BufferAllocator() will always be able
to find the memory to back the buffer somewhere.

2) Some implementation notes explaining that available memory is sort of kept in three forms:
buffers in the free lists, buffers in the clean pages lists, and available memory that can
be called up from the system. This information is kind of in the above paragraphs, but I think
explaining it in that way will help understand the implementation. Probably paragraph two
above can be moved into this section since the class abstraction itself doesn't expose arenas,
right?


PS15, Line 91: void
does the caller care if it couldn't release 'bytes_to_free' bytes?


PS15, Line 133: buffers
memory? since some of it might come from 'system_bytes_remaining_'.


PS15, Line 135: lock_arenas
one thing that was confusing is that the name of this makes it sound like arenas are not locked
when lock_arenas=false. But that's not true -- it's just they aren't concurrently locked.

As mentioned in the cc file, I don't think the caller actually cares how the arena locking
works, right? What this abstraction is really trying to expose is a "fast opportunistic path
that might not succeed", and a "slow but guaranteed to succeed path". So, how about abstracting
it and explaining it that way?


-- 
To view, visit http://gerrit.cloudera.org:8080/6414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message