impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jim Apple (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Date Sat, 26 Nov 2016 20:56:52 GMT
Jim Apple has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
......................................................................


Patch Set 5:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/4715/4//COMMIT_MSG
Commit Message:

PS4, Line 13:  buddy allocation
> The main use case is to allocate power-of-two buffers, so currently this is
Though the number of elements in HTs is a power of two, the size of the allocations could
be different if the sizeof a slot changes, yes? For instance, if sizeof(slot) is 24, then
most allocations will waste a good bit of space?


http://gerrit.cloudera.org:8080/#/c/4715/5/be/src/bufferpool/suballocator-test.cc
File be/src/bufferpool/suballocator-test.cc:

Line 72:   /// Verify that the memory for all of the allocations is writable and disjoint
by
Maybe replace "Verify" with "ASSERT", for clarity?


PS5, Line 72: allocations
"Suballocations", here and below


PS5, Line 77: list
vector


PS5, Line 77: clear
call std::vector::clear() after the for loop to prevent UB if a caller uses (*allocs)[i] for
some i.


Line 85:   ObjectPool obj_pool_;
Each of these member variables could use an explanation about why it is here.


Line 101:   ASSERT_OK(pool.RegisterClient("client", &global_reservation_, profile(), &client));
The ASSERTs in this file could generally benefit from more info in the "<<" section
to help diagnose failures from the logs. I have recently seen a backend test error that I
was able to reproduce, but only on a machine image I had limited access to, so I had to ssh
in between other people using it, and more backend test logs would have saved me some time.


Line 111:     ASSERT_OK(allocator.Allocate(ALLOC_SIZE, &allocs.back()));
I generally think of ASSERT as being for failures in which the test should not continue, but
 here I think some further parts of the test are interesting. For instance, the smaller allocations
below are interesting even if this one fails.

I can understand breaking from the loop if this fails, though.

My feeling is that several other ASSERTs should be EXPECTs in this file.


Line 116:   // Attempts to allocate more memory should fail gracefully.
Can we allocate 0 bytes here?


Line 127:   for (unique_ptr<Suballocation>& alloc : allocs) {
call FreeAllocations?


PS5, Line 147: odd
It might also be good to test edge cases: (1 << i) - 1


Line 159:     memset(alloc->data(), 0, alloc->len()); // Check memory is writable.
Add note that ASAN performs this check.


Line 180:   const int num_allocs = 16;
either "NUM_ALLOCS" or "total_mem", above


PS5, Line 184: int64_t curr_alloc_size = TEST_BUFFER_LEN / 8;
             :   while (curr_alloc_size * num_allocs < TOTAL_MEM) {
Could be more readable as a for loop


Line 188:     for (int i = 0; i < num_allocs; ++i) {
Can this be range-based, like the loop on 205?


Line 197:     // allocations should require an extra page.
Can you add a comment explaining why should they never require more than an extra page? That
way, if this breaks, the test code will help the fixer understand why it was originally true
and why it is part of the buddy system contract.


Line 206:     allocator.Free(move(alloc));
call FreeAllocations()


PS5, Line 215:   const int64_t TOTAL_MEM = TEST_BUFFER_LEN * 100;
             :   global_reservation_.InitRootTracker(nullptr, TOTAL_MEM);
             :   BufferPool pool(TEST_BUFFER_LEN, TOTAL_MEM);
             : 
             :   ReservationTracker reservation;
             :   reservation.InitChildTracker(nullptr, &global_reservation_, nullptr,
TOTAL_MEM);
             :   BufferPool::Client client;
             :   ASSERT_OK(pool.RegisterClient("client", &reservation, profile(), &client));
             : 
             :   Suballocator allocator(&pool, &client, TEST_BUFFER_LEN, ExpansionStrategy::NEVER);
This section seems like a good candidate for a member function, as it appears very similarly
elsewhere


Line 263: /// Do some randomised testing of the allocator. Simulate some interesting patterns
with
This tests only correctness, not anything about fragmentation, right?


Line 289:       int64_t remaining_mem_per_alloc = (TOTAL_MEM - allocated_mem) / num_allocs;
const


PS5, Line 290: 1.0
The first parameter below is 2 - that's the mean, right, so the average is 0.2?


PS5, Line 334: 8
I didn't see this in the header file. I'd suggest that this expectation should be in the contract.


Line 336:     for (int64_t offset = 0; offset < alloc->len(); offset += 8) {
std::iota


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.cc
File be/src/bufferpool/suballocator.cc:

Line 144:     DCHECK_EQ(free_node->len_, 1LL << (i + LOG_MIN_ALLOCATION_BYTES));
> Done
Does not look done. Maybe some changes were left out of PS5 by mistake?


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

Line 37: };
> It can but we end up with Suballocator::ExpansionStrategy everywhere, which
Maybe typedef liberally? impala::ExpansionStrategy strikes me as potentially more confusing


Line 50: ///
> Done
Looks missing


PS4, Line 127: llpt
> Done
Looks missing?


Line 171:   int64_t len() const { return len_; }
> unique_ptr to me means unique ownership, which doesn't preclude a raw point
std::unique_ptr to me means that all other references are statically scoped within the unique_ptr's
lifetime, like

    unique_ptr<T> foo;
    Bar(foo.get());

I think of std::weak_ptr partially as having the effect of breaking shared_ptr cycles. As
a code reader, I would find aw pointer less confusing than one unique_ptr and one raw pointer.
Another pattern I was able to get working that I find more understandable is making parent
pointers shared pointers and buddy pointers implicit in an array ordering, as follows:

    struct Duo {
      struct Block {
        shared_ptr<Duo> duo;
        bool idx;
        shared_ptr<Duo> Split() {
          auto result = make_shared<Duo>();
          result->parent = *this;
          result->size = duo->size / 2;
          return result;
        }
      };
      Block parent;
      bool in_use[2];
      int64_t size;
      using T = char*;
      T payload[2];
      static void Release(Block& b) {
        b.duo->in_use[b.idx] = false;
        if (!b.duo->in_use[1 - b.idx] && b.duo->parent.duo != nullptr) {
          Release(b.duo->parent);
        }
        b.duo.reset();
      }
    };


http://gerrit.cloudera.org:8080/#/c/4715/5/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

PS5, Line 94: and the number of different
            :   /// power-of-two allocation sizes
not anymore


Line 106:   int ComputeListIndex(int64_t bytes) const;
Deserves a comment


PS5, Line 202:   /// If this is in a free list, the next element in the list. nullptr if this
is the last
             :   /// element in the free list.
             :   std::unique_ptr<Suballocation> next_free_;
             : 
             :   /// If this is in a free list, the previous element in the list. nullptr
if this is the
             :   /// first element.
             :   Suballocation* prev_free_;
Since only unused Suballocations need these, they can be "intrusive" - you can reuse the first
few bytes of the buffer_ for these pointers.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-impala@apache.org>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message