Mailing-List: contact issues-help@drill.apache.org; run by ezmlm
Reply-To: dev@drill.apache.org
Date: Thu, 8 Oct 2015 18:22:26 +0000 (UTC)
From: "Khurram Faraaz (JIRA)"
To: issues@drill.apache.org
Subject: [jira] [Updated] (DRILL-3419) Handle scans optimally when all files are pruned out

     [ https://issues.apache.org/jira/browse/DRILL-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz updated DRILL-3419:
----------------------------------
    Assignee: Jinfeng Ni

> Handle scans optimally when all files are pruned out
> ----------------------------------------------------
>
>                 Key: DRILL-3419
>                 URL: https://issues.apache.org/jira/browse/DRILL-3419
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.1.0
>            Reporter: Khurram Faraaz
>            Assignee: Jinfeng Ni
>             Fix For: 1.3.0
>
>
> Note that in case (1) and case (2) we prune; however, it is not clear whether we prune in case (3), because we see a FILTER in the query plan in case (3).
> CTAS
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE CTAS_ONE_MILN_RWS_PER_GROUP(col1, col2) PARTITION BY (col2) AS select cast(columns[0] as bigint) col1, cast(columns[1] as char(2)) col2 from `millionValGroup.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 1_1       | 21932064                   |
> | 1_0       | 28067936                   |
> +-----------+----------------------------+
> 2 rows selected (73.661 seconds)
> {code}
> case 1)
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where col2 LIKE '%Z%';
> | 00-00    Screen
> 00-01      Project(col1=[$0], col2=[$1])
> 00-02        UnionExchange
> 01-01          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=2, columns=[`col2`, `col1`]]])
> {code}
> case 2)
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where col2 LIKE 'A%';
> | 00-00    Screen
> 00-01      Project(col1=[$0], col2=[$1])
> 00-02        UnionExchange
> 01-01          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_2.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_1.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_2.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_1.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=6, columns=[`col2`, `col1`]]])
> {code}
> case 3) we are NOT pruning here.
> {code}
> explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where col2 LIKE 'Z%';
> | 00-00    Screen
> 00-01      Project(col1=[$1], col2=[$0])
> 00-02        SelectionVectorRemover
> 00-03          Filter(condition=[LIKE($0, 'Z%')])
> 00-04            Project(col2=[$1], col1=[$0])
> 00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_48.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=1, columns=[`col2`, `col1`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)