Date: Wed, 24 Aug 2016 18:05:17 +0000
From: "Alex Behm (Code Review)"
To: impala-cr@cloudera.com, dev@impala.incubator.apache.org
Reply-To: alex.behm@cloudera.com
Subject: [Impala-ASF-CR] IMPALA-3905: Add single-threaded scan node.
Mailing-List: contact dev-help@impala.incubator.apache.org; run by ezmlm
X-Gerrit-MessageType: newchange
X-Gerrit-Change-Id: I98cc7f970e1575dd83875609985e1877ada3d5e0
X-Gerrit-Commit: 7b86030387dedf249723e1e1de5672f141cf042b
User-Agent: Gerrit/2.12.2
archived-at: Wed, 24 Aug 2016 18:05:26 -0000

Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4113

Change subject: IMPALA-3905: Add single-threaded scan node.
......................................................................

IMPALA-3905: Add single-threaded scan node.

Adds a new single-threaded scan node HdfsScanNodeMt that
materializes tuples in the thread calling GetNext(). The new
scan node uses the HdfsScanner::GetNext() interface, which
currently is only implemented for Parquet. As before, I/O is
performed asynchronously via the I/O manager.

The new scan node is enabled if the mt_dop query option is set
to a value greater than 1. Otherwise, the existing multi-threaded
scan node is used.

The changes are mostly a refactoring of the existing
multi-threaded scan node to separate out the common code between
the existing multi-threaded scan node and the new single-threaded
one.

Summary of changes:
- Move code from hdfs-scan-node.h/cc into a new
  hdfs-scan-node-base.h/cc.
- Add new single-threaded scan node in hdfs-scan-node-mt.h/cc.
- Both scan nodes inherit from HdfsScanNodeBase.
- Rework the allocation of template tuples such that the memory
  is drawn from a new mem pool in the scanners, and each scanner
  clones the partition expr contexts. Before, the memory was taken
  from the parent scan node's mem pool, and there was only one
  instance of the partition expr contexts. Access to them was
  protected by a lock, but not in all instances, so their use was
  not always obviously correct. The change in this patch makes
  thread safety obvious and helps move a lock into the
  multi-threaded scan node which would otherwise have to remain
  in the HdfsScanNodeBase class.
- ScannerContext::cancelled() is used for synchronizing threads
  in multi-threaded scans. Changed it to always return false when
  used in the context of the new single-threaded scan node.
- Simplify a couple of loops with C++11 for-each.

There are currently no automated tests that exercise the
single-threaded scan node.

Testing: Passed debug/exhaustive and asan/core builds on HDFS.

Change-Id: I98cc7f970e1575dd83875609985e1877ada3d5e0
---
M be/src/exec/CMakeLists.txt
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/base-sequence-scanner.h
M be/src/exec/exec-node.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-lzo-text-scanner.cc
M be/src/exec/hdfs-lzo-text-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
A be/src/exec/hdfs-scan-node-base.cc
A be/src/exec/hdfs-scan-node-base.h
A be/src/exec/hdfs-scan-node-mt.cc
A be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exprs/expr-context.h
M be/src/runtime/tuple.h
28 files changed, 1,712 insertions(+), 1,384 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/4113/1
--
To view, visit http://gerrit.cloudera.org:8080/4113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I98cc7f970e1575dd83875609985e1877ada3d5e0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm
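
For readers who want a concrete picture of the pull model described in the change message (tuples materialized in the thread that calls GetNext(), with both scan nodes sharing a base class), here is a rough, self-contained sketch. It is not the code from this patch. ToyScanner, ToyScanNodeBase, ToyScanNodeMt, RowBatch, and ScanAll are hypothetical stand-ins invented for illustration, rows are plain ints, and the asynchronous I/O manager is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// All names below are hypothetical stand-ins, not Impala's real classes.
using RowBatch = std::vector<int>;

// Toy scanner with a pull-style GetNext(), loosely modelled on the
// HdfsScanner::GetNext() interface mentioned in the change message.
class ToyScanner {
 public:
  ToyScanner(std::vector<int> rows, std::size_t batch_size)
      : rows_(std::move(rows)), batch_size_(batch_size) {}

  // Fills 'batch' with up to batch_size_ rows and sets *eos once every
  // row of this scanner has been handed out.
  void GetNext(RowBatch* batch, bool* eos) {
    batch->clear();
    while (pos_ < rows_.size() && batch->size() < batch_size_) {
      batch->push_back(rows_[pos_++]);
    }
    *eos = (pos_ == rows_.size());
  }

 private:
  std::vector<int> rows_;
  std::size_t batch_size_;
  std::size_t pos_ = 0;
};

// Common interface that both flavours of scan node would implement.
class ToyScanNodeBase {
 public:
  virtual ~ToyScanNodeBase() = default;
  virtual void GetNext(RowBatch* batch, bool* eos) = 0;
};

// Single-threaded node: no scanner threads and no row batch queue.
// All work happens in the thread that calls GetNext().
class ToyScanNodeMt : public ToyScanNodeBase {
 public:
  explicit ToyScanNodeMt(std::vector<ToyScanner> scanners)
      : scanners_(std::move(scanners)) {}

  void GetNext(RowBatch* batch, bool* eos) override {
    assert(batch != nullptr && eos != nullptr);
    batch->clear();
    while (current_ < scanners_.size()) {
      bool scanner_eos = false;
      scanners_[current_].GetNext(batch, &scanner_eos);
      if (scanner_eos) ++current_;  // Advance to the next scan range.
      if (!batch->empty()) {
        *eos = (current_ == scanners_.size());
        return;
      }
    }
    *eos = true;  // Every scanner is exhausted.
  }

 private:
  std::vector<ToyScanner> scanners_;
  std::size_t current_ = 0;
};

// Drains a scan node the way a parent exec node would: call GetNext()
// repeatedly until end-of-stream is reported.
std::vector<int> ScanAll(ToyScanNodeBase* node) {
  std::vector<int> out;
  RowBatch batch;
  bool eos = false;
  do {
    node->GetNext(&batch, &eos);
    out.insert(out.end(), batch.begin(), batch.end());
  } while (!eos);
  return out;
}
```

In this sketch a caller builds a ToyScanNodeMt from one ToyScanner per scan range and passes it to ScanAll(). Because nothing runs outside the calling thread, the node needs no queue or lock of its own. A multi-threaded counterpart would implement the same ToyScanNodeBase interface but hand batches from scanner threads to the consumer through a synchronized queue.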