drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
Date Tue, 31 Oct 2017 16:38:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227074#comment-16227074 ]

Paul Rogers commented on DRILL-5822:
------------------------------------

The general rule for the SQL project clause is the following:

* If the list is explicit, as in `SELECT b, c, a`, then columns are returned in that order, even
if the table defines them in the order `a, b, c`.
* If the list is implicit, using a wildcard as in `SELECT *`, then the column order is the one defined
by the table. In the example above, the order would be `a, b, c`.
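The two cases can be sketched in a few lines of Python. This is only an illustration of the rule, not Drill code; `resolve_projection` and its arguments are hypothetical names.

```python
def resolve_projection(select_list, table_columns):
    """Return the output column order for a SELECT list.

    Hypothetical helper for illustration only; not part of Drill.
    """
    if select_list == ["*"]:
        # Wildcard: use the order defined by the table.
        return list(table_columns)
    # Explicit list: use the order written in the query.
    return list(select_list)


table = ["a", "b", "c"]
print(resolve_projection(["b", "c", "a"], table))  # ['b', 'c', 'a']
print(resolve_projection(["*"], table))            # ['a', 'b', 'c']
```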

Since Drill is distributed and schema-on-read, we run into the issue that two tables might
have the same columns, but defined in different orders. For example, `{"a": 10, "b": 20, "c":
30}` and `{"c": 40, "b": 50, "a": 60}`. In this case, there is no "correct" order. Instead,
Drill must:

1. Recognize that the above scenario can occur.
2. Define each merging operator to follow some reconciliation rule.

Here, a "merging" operator is any operator that can see batches from two distinct scans. That
is potentially almost every operator, and at a minimum the receivers.

A good reconciliation rule is that the first schema wins, and all other batches are projected
into that first schema. In our example, `a, b, c` and `c, b, a` are both projected into `a,
b, c`.
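A minimal sketch of the "first schema wins" rule, in Python rather than Drill's Java, treating each batch as a single ordered record for brevity. The class name `FirstSchemaWins` is hypothetical, and raising an error when the column sets differ is an assumption of this sketch, since that case is not covered by the rule as stated here.

```python
class FirstSchemaWins:
    """Hypothetical merging operator: the first batch fixes the column order."""

    def __init__(self):
        self.schema = None  # column order, set by the first batch seen

    def project(self, batch):
        if self.schema is None:
            # The first batch defines the schema for everything that follows.
            self.schema = list(batch.keys())
        if set(batch) != set(self.schema):
            # Assumption: differing column sets are out of scope for this sketch.
            raise ValueError("batch columns do not match the first schema")
        # Re-order the incoming columns into the first schema's order.
        return {name: batch[name] for name in self.schema}


op = FirstSchemaWins()
first = op.project({"a": 10, "b": 20, "c": 30})
second = op.project({"c": 40, "b": 50, "a": 60})
print(list(first))   # ['a', 'b', 'c']
print(list(second))  # ['a', 'b', 'c']
```

With this rule the output order depends on which batch happens to arrive first, so it is consistent within a query but not necessarily across runs.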

The PMC has asked that we not discuss design issues in PR reviews. So, could you please
explain here the approach that this PR takes to solve the problem? Do we agree on the description
above? Or did this PR take a different approach?

> The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-5822
>                 URL: https://issues.apache.org/jira/browse/DRILL-5822
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Prasad Nagaraj Subramanya
>            Assignee: Vitalii Diravka
>             Fix For: 1.12.0
>
>
> Column order is not preserved for a star query with sorting when the query is planned into multiple fragments.
> Repro steps:
> 1) {code}alter session set `planner.slice_target`=1;{code}
> 2) ORDER BY clause in the query.
> Scenarios:
> {code}
> 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`;
> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+
> 1 row selected (0.082 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name limit 1;
> +--------------+----------+--------------+------------------------------------------------------+
> | n_nationkey  |  n_name  | n_regionkey  |                      n_comment                       |
> +--------------+----------+--------------+------------------------------------------------------+
> | 0            | ALGERIA  | 0            |  haggle. carefully final deposits detect slyly agai  |
> +--------------+----------+--------------+------------------------------------------------------+
> 1 row selected (0.141 seconds)
> 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1;
> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+
> 1 row selected (0.091 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name limit 1;
> +------------------------------------------------------+----------+--------------+--------------+
> |                      n_comment                       |  n_name  | n_nationkey  | n_regionkey  |
> +------------------------------------------------------+----------+--------------+--------------+
> |  haggle. carefully final deposits detect slyly agai  | ALGERIA  | 0            | 0            |
> +------------------------------------------------------+----------+--------------+--------------+
> 1 row selected (0.201 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
