impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Impala Public Jenkins (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Date Thu, 16 Mar 2017 21:07:59 GMT
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
......................................................................


IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Reviewed-on: http://gerrit.cloudera.org:8080/5811
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/hash-table-test.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
23 files changed, 3,012 insertions(+), 105 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message