impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Date Fri, 18 Nov 2016 02:00:13 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
......................................................................


Patch Set 4:

(34 comments)

http://gerrit.cloudera.org:8080/#/c/4715/4//COMMIT_MSG
Commit Message:

PS4, Line 13:  buddy allocation
> Buddy allocators tend to have high internal fragmentation.Did you consider 
The main use case is to allocate power-of-two buffers, so currently this isn't a problem.
Maybe there are some other use cases down the line but for now it seems better to go with
a simple algorithm.


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator-test.cc
File be/src/bufferpool/suballocator-test.cc:

Line 28: 
> why the gap?
Done


Line 44:   virtual void SetUp() {
> override, here and below
Done


PS4, Line 63:  != NULL
> elide
I think this is a matter of taste, but we've generally standardised on explicit null checks.


PS4, Line 68: Random
> 'Randomness' or 'PRNG'; the seed itself is not random
Wikipedia knows what I mean :) https://en.wikipedia.org/wiki/Random_seed


Line 75:   void VerifyMemoryValid(const vector<unique_ptr<Suballocation>>&
allocs);
> static
Done


Line 78:   void FreeAllocations(
> static
Done


PS4, Line 80: unique_ptr<Suballocation>
> auto
Done


PS4, Line 88: default_random_engine
> It might be good to pick one for repeatability reasons
Done


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.cc
File be/src/bufferpool/suballocator.cc:

Line 40:     free_lists_[i] = NULL;
> This is the default. If you elide this loop, this should happen anyway.
Done


Line 59:   const int target_list_idx = Bits::Log2CeilingNonZero64(bytes);
> How do you know bytes is non-zero?
This is now handled with the minimum allocation size logic.


Line 60:   for (int i = target_list_idx; i < ALLOCATION_SIZES; ++i) {
> This can be O(1) with ctz and a bitmap for the freelist emptiness
It's O(1) now just with a higher constant factor assuming an upper bound on allocation sizes.
I'd say we should defer this unless we actually see it become a bottleneck.


Line 88:   int64_t buffer_len = max(min_buffer_len_, BitUtil::RoundUpToPowerOfTwo(bytes));
> const
Done


Line 127:     if (!status.ok()) {
> if (!Suballocation::Create(&nodes[i]).ok()) {...
Done


Line 144:     int64_t child_len = free_node->len_ / 2;
> const
Done


PS4, Line 233: != NULL
> elide
We wouldn't normally elide it.


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

Line 31: enum class ExpandReservations {
> ExpansionStrategy, maybe?
Done


Line 37: };
> This can live inside Suballocator, right?
It can but we end up with Suballocator::ExpansionStrategy everywhere, which is pretty verbose.


PS4, Line 39: buddy
> There are a few different variants of buddy systems. It might be good to sp
Elaborated slightly in this comment.


PS4, Line 43: relatively large allocations
> Yet free_lists_ goes all the way down to one byte? Why not enforce this at 
I think my main concern there would be that the memory overhead of the allocations isn't tracked,
so could overwhelm the actual tracked memory.

I added logic to enforce a minimum allocation size and round up allocations to that amount
(DCHECKing or similar seems a little awkward).


Line 50:       ExpandReservations expand_reservations);
> Can you annotate these params in a comment
Done


Line 54:   /// Allocate bytes from BufferPool. The allocation is NULL if unsuccessful because
> nullptr, here and elsewhere
Done


Line 71:   /// Allocate a buffer of size 'bytes' from the buffer pool and initialize 'result'
> bytes must be less than MAX_ALLOCATION_BYTES
Done


PS4, Line 79: Can fail if malloc() fails
> if new returns nullptr
Reworded to somethign more generic about memory allocation failing.


PS4, Line 109: by
> 'but'
Done


PS4, Line 127: must
> or else leaked memory or some kind of segfault or nasal demons?
Done


PS4, Line 132: Destructor
> Elide
Done


Line 138:  private:
> also disallow_copy_and_assign?
Done


Line 152:   /// The buffer backing the suballocation, if the suballocation is backed by an
entire
> capitalization, here and elsewhere: Suballocation and Allocation
Not sure why we'd capitalise allocation since it's not a class.


Line 153:   /// buffer. Otherwise uninitialized. 'buffer_' is open iff buddy is NULL.
> buddy_
Done


Line 156:   /// If this is a left child, the parent of this and its buddy. The parent's allocation
> It would be helpful to explain the tree structure first.
Added some explanation to the class comment.


Line 159:   std::unique_ptr<Suballocation> parent_;
> Can the parent be a buddy, thus being pointed to by its buddy and one of it
Yes. I think the explanation in the class comment should cover it.


Line 171:   Suballocation* prev_free_;
> so next_free_ is not really unique? I'm troubled by this design pattern. Di
unique_ptr to me means unique ownership, which doesn't preclude a raw pointer existing, so
long as the raw pointer doesn't outlive the unique_ptr.

weak_ptr/shared_ptr only seems appropriate when there are multiple owners and the lifetime
of those owning pointers may be shorter that weak_ptr.


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/common/names.h
File be/src/common/names.h:

Line 124: using std::move;
> This might be overkill. I'd like 'move' to be available as a variable name 
It seems consistent with the other cases here (e.g. I can imagine 'left' being a legitimate
variable name). I feel like within .cc files in the codebase we should mostly avoid using
variable/function names that clash with commonly used standard library function names.

I can remove if you feel strongly.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message