drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Timothy Farkas (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (DRILL-4211) Column aliases not pushed down to JDBC stores in some cases when Drill expects aliased columns to be returned.
Date Sat, 12 Aug 2017 00:40:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122525#comment-16122525
] 

Timothy Farkas edited comment on DRILL-4211 at 8/12/17 12:39 AM:
-----------------------------------------------------------------

After discussing with Paul and Jinfeng, this issue can be fixed by pushing the whole select
portion of the query down into the JDBC operator. And required the select portion of the query
to alias potentially conflicting columns.


was (Author: timothyfarkas):
After discussing with Paul and Juinfeng, this issue can be fixed by pushing the whole select
portion of the query down into the JDBC operator. And required the select portion of the query
to alias potentially conflicting columns.

> Column aliases not pushed down to JDBC stores in some cases when Drill expects aliased
columns to be returned.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-4211
>                 URL: https://issues.apache.org/jira/browse/DRILL-4211
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.3.0, 1.11.0
>         Environment: Postgres db storage
>            Reporter: Robert Hamilton-Smith
>            Assignee: Timothy Farkas
>              Labels: newbie
>             Fix For: 1.12.0
>
>
> When making an sql statement that incorporates a join to a table and then a self join
to that table to get a parent value , Drill brings back inconsistent results. 
> Here is the sql in postgres with correct output:
> {code:sql}
> select trx.categoryguid,
> cat.categoryname, w1.categoryname as parentcat
> from transactions trx
> join categories cat on (cat.CATEGORYGUID = trx.CATEGORYGUID)
> join categories w1 on (cat.categoryparentguid = w1.categoryguid)
> where cat.categoryparentguid IS NOT NULL;
> {code}
> Output:
> ||categoryid||categoryname||parentcategory||
> |id1|restaurants|food&Dining|
> |id1|restaurants|food&Dining|
> |id2|Coffee Shops|food&Dining|
> |id2|Coffee Shops|food&Dining|
> When run in Drill with correct storage prefix:
> {code:sql}
> select trx.categoryguid,
> cat.categoryname, w1.categoryname as parentcat
> from db.schema.transactions trx
> join db.schema.categories cat on (cat.CATEGORYGUID = trx.CATEGORYGUID)
> join db.schema.wpfm_categories w1 on (cat.categoryparentguid = w1.categoryguid)
> where cat.categoryparentguid IS NOT NULL
> {code}
> Results are:
> ||categoryid||categoryname||parentcategory||
> |id1|restaurants|null|
> |id1|restaurants|null|
> |id2|Coffee Shops|null|
> |id2|Coffee Shops|null|
> Physical plan is:
> {code:sql}
> 00-00    Screen : rowType = RecordType(VARCHAR(50) categoryguid, VARCHAR(50) categoryname,
VARCHAR(50) parentcat): rowcount = 100.0, cumulative cost = {110.0 rows, 110.0 cpu, 0.0 io,
0.0 network, 0.0 memory}, id = 64293
> 00-01      Project(categoryguid=[$0], categoryname=[$1], parentcat=[$2]) : rowType =
RecordType(VARCHAR(50) categoryguid, VARCHAR(50) categoryname, VARCHAR(50) parentcat): rowcount
= 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id =
64292
> 00-02        Project(categoryguid=[$9], categoryname=[$41], parentcat=[$47]) : rowType
= RecordType(VARCHAR(50) categoryguid, VARCHAR(50) categoryname, VARCHAR(50) parentcat): rowcount
= 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id =
64291
> 00-03          Jdbc(sql=[SELECT *
> FROM "public"."transactions"
> INNER JOIN (SELECT *
> FROM "public"."categories"
> WHERE "categoryparentguid" IS NOT NULL) AS "t" ON "transactions"."categoryguid" = "t"."categoryguid"
> INNER JOIN "public"."categories" AS "categories0" ON "t"."categoryparentguid" = "categories0"."categoryguid"])
: rowType = RecordType(VARCHAR(255) transactionguid, VARCHAR(255) relatedtransactionguid,
VARCHAR(255) transactioncode, DECIMAL(1, 0) transactionpending, VARCHAR(50) transactionrefobjecttype,
VARCHAR(255) transactionrefobjectguid, VARCHAR(1024) transactionrefobjectvalue, TIMESTAMP(6)
transactiondate, VARCHAR(256) transactiondescription, VARCHAR(50) categoryguid, VARCHAR(3)
transactioncurrency, DECIMAL(15, 3) transactionoldbalance, DECIMAL(13, 3) transactionamount,
DECIMAL(15, 3) transactionnewbalance, VARCHAR(512) transactionnotes, DECIMAL(2, 0) transactioninstrumenttype,
VARCHAR(20) transactioninstrumentsubtype, VARCHAR(20) transactioninstrumentcode, VARCHAR(50)
transactionorigpartyguid, VARCHAR(255) transactionorigaccountguid, VARCHAR(50) transactionrecpartyguid,
VARCHAR(255) transactionrecaccountguid, VARCHAR(256) transactionstatementdesc, DECIMAL(1,
0) transactionsplit, DECIMAL(1, 0) transactionduplicated, DECIMAL(1, 0) transactionrecategorized,
TIMESTAMP(6) transactioncreatedat, TIMESTAMP(6) transactionupdatedat, VARCHAR(50) transactionmatrulerefobjtype,
VARCHAR(50) transactionmatrulerefobjguid, VARCHAR(50) transactionmatrulerefobjvalue, VARCHAR(50)
transactionuserruleguid, DECIMAL(2, 0) transactionsplitorder, TIMESTAMP(6) transactionprocessedat,
TIMESTAMP(6) transactioncategoryassignat, VARCHAR(50) transactionsystemcategoryguid, VARCHAR(50)
transactionorigmandateid, VARCHAR(100) fingerprint, VARCHAR(50) categoryguid0, VARCHAR(50)
categoryparentguid, DECIMAL(3, 0) categorytype, VARCHAR(50) categoryname, VARCHAR(50) categorydescription,
VARCHAR(50) partyguid, VARCHAR(50) categoryguid1, VARCHAR(50) categoryparentguid0, DECIMAL(3,
0) categorytype0, VARCHAR(50) categoryname0, VARCHAR(50) categorydescription0, VARCHAR(50)
partyguid0): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network,
0.0 memory}, id = 64259
> {code}
> I worked around it by creating a view on postgres but not ideal. Thanks in advance.
> First Drill Jira Bug.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message