drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1401) Wrong result with csv data when projecting columns not part of the Join or Filter
Date Fri, 12 Sep 2014 15:59:33 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131686#comment-14131686 ]

Aman Sinha commented on DRILL-1401:
-----------------------------------

This bug is a regression caused by DRILL-1309, which deals with projection pushdown into
scans. While a more complete fix is being investigated, I have disabled projection pushdown
for text scans only. I tested with the data supplied in this bug, and the query now produces
the correct results.
Note to QA: please also test DRILL-1408, which is a duplicate of this issue.

> Wrong result with csv data when projecting columns not part of the Join or Filter
> ---------------------------------------------------------------------------------
>
>                 Key: DRILL-1401
>                 URL: https://issues.apache.org/jira/browse/DRILL-1401
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.5.0
>            Reporter: Chun Chang
>            Assignee: Aman Sinha
>         Attachments: 0001-DRILL-1401-Disable-projection-pushdown-for-text-form.patch, aggregate_100r.csv
>
>
> #Wed Sep 10 13:35:05 PDT 2014
> git.commit.id.abbrev=686eb9e
> Joins (inner, full outer, left, right) do not work when the join is applied directly to csv
> files. Depending on which columns (first or second) the join condition is applied to, the
> query either returns null values for the projected columns or fails with a
> NumberFormatException. If you create views on the csv files, joins on the views work.
> For example, the following join gives null values:
> 0: jdbc:drill:schema=dfs> select cast(`aggregate_100r.csv`.columns[0] as int), cast(`aggregate_100r.csv`.columns[1]
> as int), cast(`join_100r.csv`.columns[1] as int) from `aggregate_100r.csv` inner join `join_100r.csv`
> on cast(`aggregate_100r.csv`.columns[0] as int) = cast(`join_100r.csv`.columns[0] as int);
> +------------+------------+------------+
> |   EXPR$0   |   EXPR$1   |   EXPR$2   |
> +------------+------------+------------+
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 0          | null       | null       |
> | 1          | null       | null       |
> | 1          | null       | null       |
> | 1          | null       | null       |
> | 1          | null       | null       |
> The following gives a NumberFormatException:
> 0: jdbc:drill:schema=dfs> select cast(`aggregate_100r.csv`.columns[0] as int), cast(`aggregate_100r.csv`.columns[1]
> as int), cast(`join_100r.csv`.columns[1] as int) from `aggregate_100r.csv` inner join `join_100r.csv`
> on cast(`aggregate_100r.csv`.columns[1] as int) = cast(`join_100r.csv`.columns[1] as int);
> Query failed: Failure while running fragment.  [3f67299e-f312-445b-8d6b-74984a820f0c]
> Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0)
> The same join on views works:
> 0: jdbc:drill:schema=dfs> select `aggregate_100r_v`.c0, `aggregate_100r_v`.c1, `join_100r_v`.c1
> from `aggregate_100r_v` inner join `join_100r_v` on `aggregate_100r_v`.c0 = `join_100r_v`.c0;
> +------------+------------+------------+
> |     c0     |     c1     |    c10     |
> +------------+------------+------------+
> | 0          | 0          | 0          |
> | 0          | 0          | 2          |
> | 0          | 0          | 1          |
> | 0          | 0          | 0          |
> | 0          | 0          | 0          |
> | 0          | 0          | 2          |
> | 0          | 0          | 1          |
> | 0          | 0          | 0          |
> | 0          | 1          | 0          |
> | 0          | 1          | 2          |
> | 0          | 1          | 1          |
> | 0          | 1          | 0          |
> | 0          | 2          | 0          |
> | 0          | 2          | 2          |
> | 0          | 2          | 1          |
> | 0          | 2          | 0          |
> | 1          | 1          | 1          |
> | 1          | 1          | 2          |
> | 1          | 1          | 1          |
> | 1          | 1          | 0          |
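
The report does not include the view definitions used in the working query. A minimal sketch of what they might look like, assuming each view simply casts the first two csv columns to int and exposes them as c0 and c1 (the column names that appear in the view query above), and assuming the current schema allows view creation:

```sql
-- Hypothetical view definitions; the actual ones are not shown in the report.
-- Each view casts the first two text columns to int and names them c0 and c1.
create view `aggregate_100r_v` as
select cast(columns[0] as int) as c0,
       cast(columns[1] as int) as c1
from `aggregate_100r.csv`;

create view `join_100r_v` as
select cast(columns[0] as int) as c0,
       cast(columns[1] as int) as c1
from `join_100r.csv`;
```

With views like these, the join condition and the projected columns refer to named view columns rather than directly to columns[n] on the text scan.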



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
