drill-issues mailing list archives

From "Jinfeng Ni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1500) Partition filtering might lead to an unnecessary column in the result set.
Date Mon, 12 Jan 2015 22:11:35 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274265#comment-14274265 ]

Jinfeng Ni commented on DRILL-1500:
-----------------------------------

+1.

The patch looks good to me.

The issue seems to have been introduced in the flatten operator work. I'm wondering how we
could prevent this kind of ProjectPrel replacement in any future PrelVisitor. One idea is to
make the ProjectAllowDupPrel constructor private and publicly expose only the copy() method.
We could also add a public static method to explicitly create a new instance of this special
type of ProjectPrel. This might help prevent similar issues in the future.
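
To make the idea concrete, here is a rough sketch of the private-constructor plus static-factory
shape. The classes are simplified stand-ins written only for illustration: the field list, the
create() method name and the constructor arguments are assumptions, not Drill's actual
ProjectPrel or ProjectAllowDupPrel signatures.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for illustration only; NOT Drill's actual ProjectPrel.
class ProjectPrel {
    protected final List<String> fields;

    public ProjectPrel(List<String> fields) {
        this.fields = fields;
    }

    // Visitors rebuild nodes through copy(), so the runtime type is preserved.
    public ProjectPrel copy(List<String> newFields) {
        return new ProjectPrel(newFields);
    }
}

// Hypothetical simplified version of the special project that allows duplicate columns.
final class ProjectAllowDupPrel extends ProjectPrel {
    // Private: code outside this class cannot call `new ProjectAllowDupPrel(...)`.
    private ProjectAllowDupPrel(List<String> fields) {
        super(fields);
    }

    // The only explicit way to create this special type (method name is an assumption).
    public static ProjectAllowDupPrel create(List<String> fields) {
        return new ProjectAllowDupPrel(fields);
    }

    // copy() stays public and returns the same special type.
    @Override
    public ProjectAllowDupPrel copy(List<String> newFields) {
        return new ProjectAllowDupPrel(newFields);
    }
}

class PrelFactorySketch {
    public static void main(String[] args) {
        ProjectPrel top = ProjectAllowDupPrel.create(Arrays.asList("dir0", "dir0"));
        // A visitor that rewrites the node via copy() keeps the special type.
        ProjectPrel rewritten = top.copy(Arrays.asList("dir0", "dir1"));
        System.out.println(rewritten.getClass().getSimpleName());
    }
}
```

With this shape, any place that intentionally creates the special project has to go through
create(), which is easy to search for in review. It does not by itself stop a visitor from
building a plain ProjectPrel in place of the special one, so visitors would still need to
rebuild nodes through copy().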



 

> Partition filtering might lead to an unnecessary column in the result set. 
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-1500
>                 URL: https://issues.apache.org/jira/browse/DRILL-1500
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Jinfeng Ni
>            Assignee: Aman Sinha
>            Priority: Critical
>             Fix For: 0.8.0
>
>         Attachments: 0001-DRILL-1500-Partial-fix-Don-t-overwrite-top-level-Pro.patch
>
>
> When partition filtering is used together with a select * query, Drill might return the partitioning column twice.
> Q1 : 
> {code}
> select * from dfs.`/Users/jni/work/incubator-drill/exec/java-exec/src/test/resources/multilevel/parquet` where dir0=1994 and dir1='Q1' order by dir0 limit 1;
> +-------+------+------+-----------------+------------------------------+-----------+-------------+------------+-----------------+---------------+----------------+--------------+
> | dir00 | dir0 | dir1 | o_clerk         | o_comment                    | o_custkey | o_orderdate | o_orderkey | o_orderpriority | o_orderstatus | o_shippriority | o_totalprice |
> +-------+------+------+-----------------+------------------------------+-----------+-------------+------------+-----------------+---------------+----------------+--------------+
> | 1994  | 1994 | Q1   | Clerk#000000743 | y pending requests integrate | 1292      | 1994-01-20  | 66         | 5-LOW           | F             | 0              | 104190.66    |
> +-------+------+------+-----------------+------------------------------+-----------+-------------+------------+-----------------+---------------+----------------+--------------+
> 1 row selected (2.097 seconds)
> {code}
> We can see that column "dir0" appears twice in the result set (the extra copy is labeled "dir00"). In comparison, here is the same query without partition filtering, with its result:
> Q2:
> {code}
> select * from dfs.`/Users/jni/work/incubator-drill/exec/java-exec/src/test/resources/multilevel/parquet` order by dir0 limit 1;
> +------+------+-----------------+------------------------------+-----------+-------------+------------+-----------------+---------------+----------------+--------------+
> | dir0 | dir1 | o_clerk         | o_comment                    | o_custkey | o_orderdate | o_orderkey | o_orderpriority | o_orderstatus | o_shippriority | o_totalprice |
> +------+------+-----------------+------------------------------+-----------+-------------+------------+-----------------+---------------+----------------+--------------+
> | 1994 | Q1   | Clerk#000000743 | y pending requests integrate | 1292      | 1994-01-20  | 66         | 5-LOW           | F             | 0              | 104190.66    |
> +------+------+-----------------+------------------------------+-----------+-------------+------------+-----------------+---------------+----------------+--------------+
> 1 row selected (0.761 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
