Date: Tue, 31 Oct 2017 19:13:00 +0000 (UTC)
From: "Vitalii Diravka (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Commented] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order

    [ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227349#comment-16227349 ]

Vitalii Diravka commented on DRILL-5822:
----------------------------------------

[~Paul.Rogers] The issue here is unnecessary sorting of columns for a query that meets the following conditions: it uses a wildcard and an ORDER BY clause, and it is planned into multiple fragments ("alter session set `planner.slice_target`=1;"). The issue is connected to the canonicalization of the schemas of input batches for the Merging Receiver, which was added in DRILL-847.
But this approach is outdated: when RecordBatchLoader loads batches now, a new batch with the same columns (SchemaPaths) in a different order is treated as having the same schema as the previous batch. [All fields from the last batch are kept in a hash map structure|https://github.com/apache/drill/blob/fe79a633a3da8b4f6db50454fde64c30c73233bb/exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchLoader.java#L90] and [when a new batch appears, its columns are just removed from the old map by key|https://github.com/apache/drill/blob/fe79a633a3da8b4f6db50454fde64c30c73233bb/exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchLoader.java#L102], so the schemaChange flag stays false. Then [the schema is built|https://github.com/apache/drill/blob/fe79a633a3da8b4f6db50454fde64c30c73233bb/exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchLoader.java#L138].
The only remaining issue is that RecordBatchLoader permutes the column order in the case above. That is described in DRILL-5828, the JIRA ticket you created, and can be fixed there.

So my changes fix the current issue but do not fully cover the requirements from your comment. Would it be reasonable to make those changes in the context of DRILL-5828?

> The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-5822
>                 URL: https://issues.apache.org/jira/browse/DRILL-5822
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Prasad Nagaraj Subramanya
>            Assignee: Vitalii Diravka
>             Fix For: 1.12.0
>
>
> Column order is not preserved for a star query with sorting when the query is planned into multiple fragments.
> Repro steps:
> 1) {code}alter session set `planner.slice_target`=1;{code}
> 2) ORDER BY clause in the query.
> Scenarios:
> {code}
> 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`;
> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+
> 1 row selected (0.082 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name limit 1;
> +--------------+----------+--------------+------------------------------------------------------+
> | n_nationkey  |  n_name  | n_regionkey  |                      n_comment                       |
> +--------------+----------+--------------+------------------------------------------------------+
> | 0            | ALGERIA  | 0            |  haggle. carefully final deposits detect slyly agai  |
> +--------------+----------+--------------+------------------------------------------------------+
> 1 row selected (0.141 seconds)
> 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1;
> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+
> 1 row selected (0.091 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name limit 1;
> +------------------------------------------------------+----------+--------------+--------------+
> |                      n_comment                       |  n_name  | n_nationkey  | n_regionkey  |
> +------------------------------------------------------+----------+--------------+--------------+
> |  haggle. carefully final deposits detect slyly agai  | ALGERIA  | 0            | 0            |
> +------------------------------------------------------+----------+--------------+--------------+
> 1 row selected (0.201 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
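
For illustration only: the comment above says that RecordBatchLoader keeps the previous batch's fields in a hash map and removes them by key as the new batch arrives, so a batch with the same columns in a different order does not set the schemaChange flag. The following is a minimal sketch of that idea, not Drill's actual code. The class name `SchemaChangeSketch` and the method `schemaChanged` are hypothetical, and columns are modelled as plain strings rather than Drill's field types.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaChangeSketch {

    /**
     * Hypothetical stand-in for a name-keyed schema comparison (an assumption,
     * not Drill's API): reports a change only when the set of column names
     * differs between the previous and the incoming batch.
     */
    public static boolean schemaChanged(List<String> previous, List<String> incoming) {
        // Index the previous batch's columns by name. The position is stored
        // as the value but never compared, which mirrors a lookup by key only.
        Map<String, Integer> oldFields = new HashMap<>();
        for (int i = 0; i < previous.size(); i++) {
            oldFields.put(previous.get(i), i);
        }

        boolean changed = false;
        for (String name : incoming) {
            // A column is treated as already known if its name was present before.
            if (oldFields.remove(name) == null) {
                changed = true;
            }
        }
        // Any name left in the map was dropped by the incoming batch.
        return changed || !oldFields.isEmpty();
    }

    public static void main(String[] args) {
        List<String> first =
            Arrays.asList("n_nationkey", "n_name", "n_regionkey", "n_comment");
        List<String> reordered =
            Arrays.asList("n_comment", "n_name", "n_nationkey", "n_regionkey");

        // Same names in a different order: no change is reported.
        System.out.println(schemaChanged(first, reordered)); // prints "false"
        // A dropped column is reported as a change.
        System.out.println(schemaChanged(first, Arrays.asList("n_name"))); // prints "true"
    }
}
```

Because the comparison is by name only, the two column orders shown in the scenarios above are indistinguishable to it. An order-sensitive check would have to compare positions as well.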