drill-issues mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-886) Wrong results for a query with Right Outer Join on the second (and subsequent) executions
Date Thu, 14 Aug 2014 19:30:21 GMT

     [ https://issues.apache.org/jira/browse/DRILL-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau resolved DRILL-886.
----------------------------------

    Resolution: Fixed

> Wrong results for a query with Right Outer Join on the second (and subsequent) executions
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-886
>                 URL: https://issues.apache.org/jira/browse/DRILL-886
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Aman Sinha
>            Assignee: Chun Chang
>            Priority: Critical
>             Fix For: 0.5.0
>
>
> The following query with a right outer join produces correct results on the first execution
> in a session but wrong results on the second and subsequent executions. A potential cause
> of the problem can be seen in the two EXPLAIN plans: the scan of the nation table differs
> in the columns being projected.
> 0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on n.n_regionkey = r.r_regionkey;
> +-------------+-------------+
> | n_regionkey | r_regionkey |
> +-------------+-------------+
> | 0           | 0           |
> | 0           | 0           |
> | 0           | 0           |
> | 0           | 0           |
> | 0           | 0           |
> | 1           | 1           |
> | 1           | 1           |
> | 1           | 1           |
> | 1           | 1           |
> | 1           | 1           |
> | 2           | 2           |
> | 2           | 2           |
> | 2           | 2           |
> | 2           | 2           |
> | 2           | 2           |
> | 3           | 3           |
> | 3           | 3           |
> | 3           | 3           |
> | 3           | 3           |
> | 3           | 3           |
> | 4           | 4           |
> | 4           | 4           |
> | 4           | 4           |
> | 4           | 4           |
> | 4           | 4           |
> +-------------+-------------+
> 25 rows selected (2.207 seconds)
> 0: jdbc:drill:zk=local> select n.n_regionkey, r.r_regionkey from cp.`tpch/region.parquet` r right join cp.`tpch/nation.parquet` n on n.n_regionkey = r.r_regionkey;
> +-------------+-------------+
> | n_regionkey | r_regionkey |
> +-------------+-------------+
> | 0           | null        |
> | 1           | null        |
> | 1           | null        |
> | 1           | null        |
> | 4           | null        |
> | 0           | null        |
> | 3           | null        |
> | 3           | null        |
> | 2           | null        |
> | 2           | null        |
> | 4           | null        |
> | 4           | null        |
> | 2           | null        |
> | 4           | null        |
> | 0           | null        |
> | 0           | null        |
> | 0           | null        |
> | 1           | null        |
> | 2           | null        |
> | 3           | null        |
> | 4           | null        |
> | 2           | null        |
> | 3           | null        |
> | 3           | null        |
> | 1           | null        |
> +-------------+-------------+
> 25 rows selected (0.514 seconds)
> EXPLAIN plan for the good run: 
> | 00-00    Screen
> 00-01      Project(n_regionkey=[$0], r_regionkey=[$1])
> 00-02        Project(n_regionkey=[$3], r_regionkey=[$1])
> 00-03          HashJoin(condition=[=($3, $1)], joinType=[right])
> 00-05            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
> 00-04            Project(*0=[$0], n_regionkey=[$1])
> 00-06              BroadcastExchange
> 01-01                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=[SchemaPath [`n_regionkey`]]]])
> EXPLAIN plan for the bad run:
> | 00-00    Screen
> 00-01      Project(n_regionkey=[$0], r_regionkey=[$1])
> 00-02        Project(n_regionkey=[$3], r_regionkey=[$1])
> 00-03          HashJoin(condition=[=($2, $1)], joinType=[right])
> 00-05            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
> 00-04            Project(*0=[$0], n_regionkey=[$1])
> 00-06              BroadcastExchange
> 01-01                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=null]])
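
The projection difference between the two plans can be surfaced mechanically. The following is a minimal sketch, not part of the original report: it assumes the plan text has the shape quoted above, and the helper names (`scan_columns`, `projection_diff`) and the regular expression are illustrative only. It extracts the `columns=` value from each Scan line and reports the tables whose projected columns differ between the good and bad runs.

```python
import re

# Captures the table (selectionRoot) and the raw columns= value of each Scan.
# Assumption: columns is either the literal `null` or a bracketed SchemaPath
# list, as in the two plans quoted in this report.
SCAN_RE = re.compile(r"selectionRoot=(\S+?), columns=(null|\[.*?\]\])")


def scan_columns(plan_text):
    """Map each scanned table to its projected column names (None for columns=null)."""
    result = {}
    for root, cols in SCAN_RE.findall(plan_text):
        # Column names are backtick-quoted inside the SchemaPath list.
        result[root] = None if cols == "null" else re.findall(r"`([^`]+)`", cols)
    return result


def projection_diff(good_plan, bad_plan):
    """Return {table: (good_columns, bad_columns)} for tables whose projections differ."""
    good, bad = scan_columns(good_plan), scan_columns(bad_plan)
    return {
        table: (good.get(table), bad.get(table))
        for table in sorted(set(good) | set(bad))
        if good.get(table) != bad.get(table)
    }


# Scan lines copied from the two plans in this report.
GOOD_PLAN = """
00-05  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
01-01  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=[SchemaPath [`n_regionkey`]]]])
"""
BAD_PLAN = """
00-05  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/region.parquet]], selectionRoot=/tpch/region.parquet, columns=[SchemaPath [`r_regionkey`]]]])
01-01  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/nation.parquet]], selectionRoot=/tpch/nation.parquet, columns=null]])
"""

if __name__ == "__main__":
    print(projection_diff(GOOD_PLAN, BAD_PLAN))
```

On the two plans above, only the nation scan is reported: the good run projects `n_regionkey`, while the bad run has `columns=null`. The region scan is identical in both.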



--
This message was sent by Atlassian JIRA
(v6.2#6252)
