Message-Id: <201705101638.v4AGcBtP024046@ip-10-146-233-104.ec2.internal>
Date: Wed, 10 May 2017 16:38:10 +0000
From: "Tim Armstrong (Code Review)"
To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org
CC: Dan Hecht
Reply-To: tarmstrong@cloudera.com
Subject: [Impala-ASF-CR] IMPALA-5169: Add support for async pins in buffer pool
X-Gerrit-MessageType: newpatchset
X-Gerrit-Change-Id: Ibdf074c1ac4405d6f08d623ba438a85f7d39fd79
X-Gerrit-Commit: 6f3e60665d45a80d2b4b511164696f96c0ce54b8

Tim Armstrong has uploaded a new patch set (#10).

Change subject: IMPALA-5169: Add support for async pins in buffer pool
......................................................................

IMPALA-5169: Add support for async pins in buffer pool

Makes Pin() do async reads behind-the-scenes, instead of blocking until
the read completes. The blocking is done instead when the client tries
to access the buffer via PageHandle::GetBuffer() or ExtractBuffer().

This is implemented with a new sub-state of "pinned" where the page has
a buffer and consumes reservation but the buffer does not contain valid
data.

Motivation:
This unlocks various opportunities to overlap read I/Os with other work:
* Reads to different disks can execute in parallel
* I/O and computation can be overlapped.

This initially benefits BufferedTupleStream::PinStream(), where many
pages are pinned at once. With this change the reads run asynchronously.
This can potentially lead to large speedups when spilling. E.g. if the
pages for a Hash Join's partition are spread across 10 disks, we could
get 10x the read throughput, plus overlap the I/O with hash table build.

In future we can use this to do read-ahead over unpinned
BufferedTupleStreams or for unpinned Runs in Sorter, but this requires
changes to the client code to Pin() pages in advance.

Testing:
* BufferedTupleStreamV2 already exercises this.
* Various BufferPool tests already exercise this.
* Added a basic test to cover edge cases made possible by the new state
  transitions.
* Extended the randomised test to cover this.

Change-Id: Ibdf074c1ac4405d6f08d623ba438a85f7d39fd79
---
M be/src/runtime/buffered-tuple-stream-v2.cc
M be/src/runtime/buffered-tuple-stream-v2.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
8 files changed, 416 insertions(+), 155 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6612/10
--
To view, visit http://gerrit.cloudera.org:8080/6612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibdf074c1ac4405d6f08d623ba438a85f7d39fd79
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Tim Armstrong
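
Illustrative sketch (not part of the patch): the commit message above
describes Pin() starting a read in the background, with the caller
blocking only when it later asks for the buffer. The standalone C++ below
models that idea with std::async and std::future. It is a minimal sketch
under stated assumptions, not Impala's implementation: SketchPage,
data_pending() and PinAllThenRead() are hypothetical names, the "disk
read" is a short sleep, and reservations, eviction and error handling are
left out.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a spilled page. This is NOT Impala's PageHandle.
class SketchPage {
 public:
  explicit SketchPage(std::string on_disk) : on_disk_(std::move(on_disk)) {}

  // Starts the read in the background and returns immediately. Afterwards the
  // page is "pinned" but its buffer does not yet hold valid data.
  // The page must not be moved while the read is pending (lambda captures this).
  void Pin() {
    if (pinned_) return;
    pinned_ = true;
    pending_read_ = std::async(std::launch::async, [this] {
      // Simulated I/O latency (assumption for the sketch).
      std::this_thread::sleep_for(std::chrono::milliseconds(20));
      return on_disk_;
    });
  }

  // The only place the caller blocks: waits for the read if it has not been
  // collected yet, then returns the now-valid buffer.
  const std::string& GetBuffer() {
    assert(pinned_);
    if (pending_read_.valid()) buffer_ = pending_read_.get();
    return buffer_;
  }

  // True from Pin() until GetBuffer() has collected the read result.
  bool data_pending() const { return pending_read_.valid(); }

 private:
  std::string on_disk_;
  std::string buffer_;
  bool pinned_ = false;
  std::future<std::string> pending_read_;
};

// Pins every page first so all reads are outstanding at once, then consumes
// the buffers in order. This loosely mirrors the PinStream() use case.
std::string PinAllThenRead(const std::vector<std::string>& spilled) {
  std::vector<SketchPage> pages;
  pages.reserve(spilled.size());  // no reallocation, so pages never move
  for (const auto& s : spilled) pages.emplace_back(s);
  for (auto& p : pages) p.Pin();
  std::string out;
  for (auto& p : pages) out += p.GetBuffer();
  return out;
}
```

Because every Pin() call returns before its read finishes, the simulated
reads in PinAllThenRead() overlap rather than run back to back, which is
the effect the Motivation section attributes to pinning many pages at
once.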