Date: Wed, 14 Jun 2017 00:00:00 +0000 (UTC)
From: "Jinfeng Ni (JIRA)"
To: dev@drill.apache.org
Subject: [jira] [Created] (DRILL-5586) UnionAll operator does more than necessary value vector allocation and copy

Jinfeng Ni created DRILL-5586:
---------------------------------

             Summary: UnionAll operator does more than necessary value vector allocation and copy
                 Key: DRILL-5586
                 URL: https://issues.apache.org/jira/browse/DRILL-5586
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Jinfeng Ni

When the inputs to a UnionAll operator are simple field references, rather than expressions involving a function that requires evaluation, the operator should leverage the value vector's transfer API. Doing a transfer avoids allocating a buffer for the value vector in the outgoing batch, as well as the overhead of copying the data from the incoming batch to the outgoing batch.

For example, in the following query:

{code}
select l_orderkey from cp.`tpch/lineitem.parquet` l
union all
select n_nationkey from cp.`tpch/nation.parquet`
{code}

Both the left and right sides of the UnionAll operator are simple field references, and Drill should call the transfer API. However, the current code does buffer allocation and copy for both the left and the right side. Such processing would significantly slow down the UnionAll operator and, in turn, query evaluation.

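To make the cost difference concrete, here is a minimal sketch in plain Java. It is not Drill code: {{ToyVector}}, {{copyInto}} and {{transferInto}} are hypothetical names invented for illustration, and a {{long[]}} stands in for the real buffer. It only shows the general idea that a copy allocates a new buffer and moves every value, while a transfer hands the existing buffer over to the outgoing vector.

```java
import java.util.Arrays;

/** Toy stand-in for a value vector (hypothetical, not Drill's ValueVector). */
final class ToyVector {
    static final long[] EMPTY = new long[0];

    final String name;
    long[] buffer; // backing storage owned by this vector

    ToyVector(String name, long[] buffer) {
        this.name = name;
        this.buffer = buffer;
    }

    /** Copy path: allocates a new buffer for the target and copies every value. */
    static void copyInto(ToyVector from, ToyVector to) {
        to.buffer = Arrays.copyOf(from.buffer, from.buffer.length);
    }

    /** Transfer path: the target takes over the existing buffer; nothing is copied. */
    static void transferInto(ToyVector from, ToyVector to) {
        to.buffer = from.buffer;
        from.buffer = EMPTY; // the source no longer owns the data
    }
}

final class TransferVsCopyDemo {
    public static void main(String[] args) {
        ToyVector incoming = new ToyVector("l_orderkey", new long[] {1L, 2L, 3L});
        long[] original = incoming.buffer;

        ToyVector copied = new ToyVector("out", ToyVector.EMPTY);
        ToyVector.copyInto(incoming, copied);
        // false: the copy lives in a freshly allocated buffer
        System.out.println("copy shares buffer: " + (copied.buffer == original));

        ToyVector transferred = new ToyVector("out", ToyVector.EMPTY);
        ToyVector.transferInto(incoming, transferred);
        // true: the outgoing vector now owns the incoming buffer
        System.out.println("transfer shares buffer: " + (transferred.buffer == original));
    }
}
```

In this toy model the copy path does work proportional to the number of values, while the transfer path is a constant-time ownership handoff, which is the saving the issue is after.
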
DRILL-5521 reverts a change, made in DRILL-5419, to the logic that decides whether to apply transfer based on a SchemaPath equality comparison. Even if we fix that problem, a SchemaPath equality comparison is not a sufficient criterion for deciding whether transfer should be used. Ideally, even when the output field and the incoming field have different names, the UnionAll operator should do {{transfer}} instead of {{copy}}, as long as the expression is a simple field reference.

{code}
select l_orderkey as Key1 from cp.`tpch/lineitem.parquet` l
union all
select n_nationkey as Key2 from cp.`tpch/nation.parquet`
{code}
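
The difference between the two criteria can be sketched as follows. This is an illustration only, not Drill's planner or operator code: {{Expr}}, {{FieldRef}}, {{FunctionCall}} and {{TransferRule}} are hypothetical types, and the name comparison is a simplified stand-in for a SchemaPath equality check.

```java
/** Hypothetical expression model: either a plain field reference or a function call. */
interface Expr {}

final class FieldRef implements Expr {
    final String field;

    FieldRef(String field) {
        this.field = field;
    }
}

final class FunctionCall implements Expr {
    final String function;
    final Expr argument;

    FunctionCall(String function, Expr argument) {
        this.function = function;
        this.argument = argument;
    }
}

final class TransferRule {
    /** Name-based rule: transfer only when input and output names match. */
    static boolean byNameEquality(Expr expr, String outputName) {
        return expr instanceof FieldRef && ((FieldRef) expr).field.equals(outputName);
    }

    /** Rule proposed in this issue: any simple field reference qualifies. */
    static boolean bySimpleReference(Expr expr) {
        return expr instanceof FieldRef;
    }
}

final class TransferRuleDemo {
    public static void main(String[] args) {
        // select l_orderkey as Key1 ...: a plain reference whose output name differs
        Expr aliased = new FieldRef("l_orderkey");
        System.out.println(TransferRule.byNameEquality(aliased, "Key1")); // false
        System.out.println(TransferRule.bySimpleReference(aliased));      // true

        // An expression that needs evaluation still has to be computed into a new vector
        Expr computed = new FunctionCall("abs", new FieldRef("l_orderkey"));
        System.out.println(TransferRule.bySimpleReference(computed));     // false
    }
}
```

Under the name-based rule the aliased query above would fall back to copy even though no evaluation is needed; checking only the shape of the expression avoids that.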