impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5169: Add support for async pins in buffer pool
Date Wed, 19 Apr 2017 20:36:49 GMT
Tim Armstrong has uploaded a new patch set (#6).

Change subject: IMPALA-5169: Add support for async pins in buffer pool
......................................................................

IMPALA-5169: Add support for async pins in buffer pool

Makes Pin() issue its reads asynchronously behind the scenes
instead of blocking until the read completes. The blocking now
happens when the client tries to access the buffer via
PageHandle::GetBuffer() or ExtractBuffer().

This is implemented with a new sub-state of "pinned"
where the page has a buffer and consumes reservation
but the buffer does not contain valid data.
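
A minimal sketch of the idea, not the code in this patch: SketchPage,
PageState and the simulated read are hypothetical names invented for
illustration, and std::async stands in for the real I/O machinery.
It only shows the shape of the new sub-state: Pin() returns while the
read is still in flight, and GetBuffer() is where the caller blocks.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical state names; the real BufferPool internals may differ.
enum class PageState { UNPINNED, PINNED_READ_IN_FLIGHT, PINNED_VALID };

class SketchPage {
 public:
  explicit SketchPage(std::vector<uint8_t> spilled) : on_disk_(std::move(spilled)) {}
  // The in-flight read captures 'this', so the page must not be copied or moved.
  SketchPage(const SketchPage&) = delete;
  SketchPage& operator=(const SketchPage&) = delete;

  // Returns immediately. The page now has a buffer (which would count
  // against the reservation), but the buffer contents are not yet valid.
  void Pin() {
    if (state_ != PageState::UNPINNED) return;
    buffer_.resize(on_disk_.size());
    pending_read_ = std::async(std::launch::async, [this] {
      // Simulated disk latency, then the "read" into the buffer.
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
      buffer_ = on_disk_;
    });
    state_ = PageState::PINNED_READ_IN_FLIGHT;
  }

  // Blocks only here, when the caller actually needs the data.
  const std::vector<uint8_t>& GetBuffer() {
    assert(state_ != PageState::UNPINNED);
    if (state_ == PageState::PINNED_READ_IN_FLIGHT) {
      pending_read_.get();  // Wait for the read to finish.
      state_ = PageState::PINNED_VALID;
    }
    return buffer_;
  }

  PageState state() const { return state_; }

 private:
  std::vector<uint8_t> on_disk_;  // Stand-in for the spilled page on disk.
  std::vector<uint8_t> buffer_;
  PageState state_ = PageState::UNPINNED;
  // Declared last so it is destroyed first: the future returned by
  // std::async waits for the read before buffer_ goes away.
  std::future<void> pending_read_;
};
```

In this sketch the state is only touched by the calling thread, and
the future provides the synchronisation for the buffer contents.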

Motivation:
This unlocks various opportunities to overlap read I/Os
with other work:
* Reads to different disks can execute in parallel
* I/O and computation can be overlapped.

This initially benefits BufferedTupleStream::PinStream(),
where many pages are pinned at once. With this change
the reads run asynchronously. This can potentially lead
to large speedups when spilling. E.g. if the pages for a Hash
Join's partition are spread across 10 disks, we could get up to 10x
the read throughput, plus overlap the I/O with hash table build.
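
The overlap described above can be sketched as follows. This is an
illustration under stated assumptions, not Impala code: Page,
SimulatedDiskRead and PinAllThenSum are hypothetical names, and the
sleep stands in for disk latency. All reads are issued up front, as
when many pages are pinned at once, and the caller waits on each page
only when it consumes it.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

using Page = std::vector<uint8_t>;

// Hypothetical stand-in for reading one spilled page from disk.
Page SimulatedDiskRead(int page_index) {
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  return Page(4, static_cast<uint8_t>(page_index));
}

// Issues every read before waiting on any of them, so the reads can
// proceed in parallel and overlap with whatever the caller does next.
int64_t PinAllThenSum(int num_pages) {
  assert(num_pages >= 0);
  std::vector<std::future<Page>> reads;
  reads.reserve(num_pages);
  for (int i = 0; i < num_pages; ++i) {
    reads.push_back(std::async(std::launch::async, SimulatedDiskRead, i));
  }

  // Other work (e.g. building a hash table over pages that have already
  // arrived) could run here while the remaining reads are in flight.

  int64_t total = 0;
  for (auto& read : reads) {
    Page page = read.get();  // Blocks only if this page is not ready yet.
    total += std::accumulate(page.begin(), page.end(), int64_t{0});
  }
  return total;
}
```

How much of the simulated latency actually overlaps depends on thread
scheduling here, and on the disks in a real deployment.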

In future we can use this to do read-ahead over unpinned
BufferedTupleStreams or for unpinned Runs in Sorter, but
this requires changes to the client code to Pin() pages
in advance.

Testing:
* BufferedTupleStreamV2 already exercises this.
* Various BufferPool tests already exercise this.
* Added a basic test to cover edge cases made possible by the
  new state transitions.
* Extended the randomised test to cover this.

Change-Id: Ibdf074c1ac4405d6f08d623ba438a85f7d39fd79
---
M be/src/runtime/buffered-tuple-stream-v2.cc
M be/src/runtime/buffered-tuple-stream-v2.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
8 files changed, 418 insertions(+), 153 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6612/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibdf074c1ac4405d6f08d623ba438a85f7d39fd79
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
