drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sean Hsuan-Yi Chu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-1325) Better error messages when no equality join is specified
Date Tue, 24 Feb 2015 22:47:06 GMT

     [ https://issues.apache.org/jira/browse/DRILL-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sean Hsuan-Yi Chu updated DRILL-1325:
-------------------------------------
    Attachment: DRILL-1325.1.patch

> Better error messages when no equality join is specified
> --------------------------------------------------------
>
>                 Key: DRILL-1325
>                 URL: https://issues.apache.org/jira/browse/DRILL-1325
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.4.0
>         Environment: MapR 3.1.1
> Drill 0.4.0
> CentOS 6.4
>            Reporter: Hao Zhu
>            Assignee: Sean Hsuan-Yi Chu
>            Priority: Minor
>             Fix For: 0.8.0
>
>         Attachments: DRILL-1325.1.patch
>
>
> Equal hash join works fine, for example:
> {code}
> > explain plan for  select a.*,b.user_port
> . . . . . . . . . . . . . . . . . . . . . . .>  from cp.`employee.json` a, sys.drillbits
b
> . . . . . . . . . . . . . . . . . . . . . . .>  where a.position_id=b.user_port ;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(*=[$0], user_port=[$1])
> 00-02        Project(T1¦¦*=[$0], user_port=[$2])
> 00-03          HashJoin(condition=[=($1, $2)], joinType=[inner])
> 00-05         |
> +------------+------------+
> 1 row selected (0.428 seconds)
> {code}
> However if we change "=" to "<>" , it failed:
> {code}
> > explain plan for  select a.*,b.user_port
> . . . . . . . . . . . . . . . . . . . . . . .>  from cp.`employee.json` a, sys.drillbits
b
> . . . . . . . . . . . . . . . . . . . . . . .>  where a.position_id<>b.user_port
;
> Query failed: Failure while parsing sql. Node [rel#2056:Subset#4.LOGICAL.ANY([]).[]]
could not be implemented; planner state:
> Root: rel#2056:Subset#4.LOGICAL.ANY([]).[]
> Original rel:
> AbstractConverter(subset=[rel#2056:Subset#4.LOGICAL.ANY([]).[]], convention=[LOGICAL],
DrillDistributionTraitDef=[ANY([])], sort=[[]]): rowcount = 1.7976931348623157E308, cumulative
cost = {inf}, id = 2057
>   ProjectRel(subset=[rel#2055:Subset#4.NONE.ANY([]).[]], *=[$0], user_port=[$3]): rowcount
= 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, Infinity cpu, 0.0
io, 0.0 network}, id = 2054
>     FilterRel(subset=[rel#2053:Subset#3.NONE.ANY([]).[]], condition=[<>(CAST($1):INTEGER,
$3)]): rowcount = 8.988465674311579E307, cumulative cost = {8.988465674311579E307 rows, 1.7976931348623157E308
cpu, 0.0 io, 0.0 network}, id = 2052
>       JoinRel(subset=[rel#2051:Subset#2.NONE.ANY([]).[]], condition=[true], joinType=[inner]):
rowcount = 10000.0, cumulative cost = {10000.0 rows, 0.0 cpu, 0.0 io, 0.0 network}, id = 2050
>         EnumerableTableAccessRel(subset=[rel#2048:Subset#0.ENUMERABLE.ANY([]).[]], table=[[cp,
employee.json]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0
network}, id = 2043
>         EnumerableTableAccessRel(subset=[rel#2049:Subset#1.ENUMERABLE.ANY([]).[]], table=[[sys,
drillbits]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network},
id = 2044
> Sets:
> Set#0, type: (DrillRecordRow[*, position_id])
> 	rel#2048:Subset#0.ENUMERABLE.ANY([]).[], best=rel#2043, importance=0.6561
> 		rel#2043:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[cp, employee.json]),
rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 network}
> 		rel#2093:AbstractConverter.ENUMERABLE.ANY([]).[](child=rel#2092:Subset#0.LOGICAL.ANY([]).[],convention=ENUMERABLE,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=463.0, cumulative cost={inf}
> 		rel#2120:AbstractConverter.ENUMERABLE.ANY([]).[](child=rel#2119:Subset#0.NONE.ANY([]).[],convention=ENUMERABLE,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 	rel#2092:Subset#0.LOGICAL.ANY([]).[], best=rel#2126, importance=0.5904900000000001
> 		rel#2094:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2048:Subset#0.ENUMERABLE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=100.0, cumulative cost={inf}
> 		rel#2121:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2119:Subset#0.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2102:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2119:Subset#0.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2118:DrillProjectRel.LOGICAL.ANY([]).[](child=rel#2092:Subset#0.LOGICAL.ANY([]).[],*=$0,position_id=$1),
rowcount=463.0, cumulative cost={926.0 rows, 934.0 cpu, 0.0 io, 0.0 network}
> 		rel#2126:DrillScanRel.LOGICAL.ANY([]).[](table=[cp, employee.json],groupscan=EasyGroupScan
[selectionRoot=/employee.json, columns = null]), rowcount=463.0, cumulative cost={463.0 rows,
926.0 cpu, 0.0 io, 0.0 network}
> 	rel#2119:Subset#0.NONE.ANY([]).[], best=null, importance=0.531441
> 		rel#2122:AbstractConverter.NONE.ANY([]).[](child=rel#2048:Subset#0.ENUMERABLE.ANY([]).[],convention=NONE,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=100.0, cumulative cost={inf}
> 		rel#2123:AbstractConverter.NONE.ANY([]).[](child=rel#2092:Subset#0.LOGICAL.ANY([]).[],convention=NONE,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=463.0, cumulative cost={inf}
> 		rel#2080:ProjectRel.NONE.ANY([]).[](child=rel#2048:Subset#0.ENUMERABLE.ANY([]).[],*=$0,position_id=$1),
rowcount=100.0, cumulative cost={inf}
> Set#1, type: RecordType(VARCHAR(1) host, INTEGER user_port, INTEGER control_port, INTEGER
data_port)
> 	rel#2049:Subset#1.ENUMERABLE.ANY([]).[], best=rel#2044, importance=0.6561
> 		rel#2044:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[sys, drillbits]), rowcount=100.0,
cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 network}
> 		rel#2096:AbstractConverter.ENUMERABLE.ANY([]).[](child=rel#2095:Subset#1.LOGICAL.ANY([]).[],convention=ENUMERABLE,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=20.0, cumulative cost={inf}
> 	rel#2095:Subset#1.LOGICAL.ANY([]).[], best=rel#2125, importance=0.5904900000000001
> 		rel#2097:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2049:Subset#1.ENUMERABLE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=100.0, cumulative cost={inf}
> 		rel#2125:DrillScanRel.LOGICAL.ANY([]).[](table=[sys, drillbits],groupscan=SystemTableScan:
DRILLBITS), rowcount=20.0, cumulative cost={20.0 rows, 80.0 cpu, 0.0 io, 0.0 network}
> Set#2, type: RecordType(ANY *, ANY position_id, VARCHAR(1) host, INTEGER user_port, INTEGER
control_port, INTEGER data_port)
> 	rel#2051:Subset#2.NONE.ANY([]).[], best=null, importance=0.7290000000000001
> 		rel#2050:JoinRel.NONE.ANY([]).[](left=rel#2048:Subset#0.ENUMERABLE.ANY([]).[],right=rel#2049:Subset#1.ENUMERABLE.ANY([]).[],condition=true,joinType=inner),
rowcount=10000.0, cumulative cost={inf}
> 	rel#2099:Subset#2.LOGICAL.ANY([]).[], best=null, importance=0.7290000000000001
> 		rel#2100:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2051:Subset#2.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2098:DrillJoinRel.LOGICAL.ANY([]).[](left=rel#2092:Subset#0.LOGICAL.ANY([]).[],right=rel#2095:Subset#1.LOGICAL.ANY([]).[],condition=true,joinType=inner),
rowcount=463.0, cumulative cost={inf}
> Set#3, type: RecordType(ANY *, ANY position_id, VARCHAR(1) host, INTEGER user_port, INTEGER
control_port, INTEGER data_port)
> 	rel#2053:Subset#3.NONE.ANY([]).[], best=null, importance=0.81
> 		rel#2052:FilterRel.NONE.ANY([]).[](child=rel#2051:Subset#2.NONE.ANY([]).[],condition=<>(CAST($1):INTEGER,
$3)), rowcount=8.988465674311579E307, cumulative cost={inf}
> 		rel#2079:JoinRel.NONE.ANY([]).[](left=rel#2048:Subset#0.ENUMERABLE.ANY([]).[],right=rel#2049:Subset#1.ENUMERABLE.ANY([]).[],condition=<>(CAST($1):INTEGER,
$3),joinType=inner), rowcount=5000.0, cumulative cost={inf}
> 	rel#2068:Subset#3.LOGICAL.ANY([]).[], best=null, importance=0.81
> 		rel#2069:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2053:Subset#3.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2109:DrillFilterRel.LOGICAL.ANY([]).[](child=rel#2099:Subset#2.LOGICAL.ANY([]).[],condition=<>(CAST($1):INTEGER,
$3)), rowcount=8.988465674311579E307, cumulative cost={inf}
> Set#4, type: RecordType(ANY *, INTEGER user_port)
> 	rel#2055:Subset#4.NONE.ANY([]).[], best=null, importance=0.9
> 		rel#2054:ProjectRel.NONE.ANY([]).[](child=rel#2053:Subset#3.NONE.ANY([]).[],*=$0,user_port=$3),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2064:ProjectRel.NONE.ANY([]).[](child=rel#2063:Subset#6.NONE.ANY([]).[],*=$0,user_port=$2),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 	rel#2056:Subset#4.LOGICAL.ANY([]).[], best=null, importance=1.0
> 		rel#2057:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2055:Subset#4.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2067:DrillProjectRel.LOGICAL.ANY([]).[](child=rel#2065:Subset#6.LOGICAL.ANY([]).[],*=$0,user_port=$2),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2070:DrillProjectRel.LOGICAL.ANY([]).[](child=rel#2068:Subset#3.LOGICAL.ANY([]).[],*=$0,user_port=$3),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> Set#5, type: RecordType(ANY *, ANY position_id, INTEGER user_port)
> 	rel#2061:Subset#5.NONE.ANY([]).[], best=null, importance=0.7290000000000001
> 		rel#2058:ProjectRel.NONE.ANY([]).[](child=rel#2051:Subset#2.NONE.ANY([]).[],*=$0,position_id=$1,user_port=$3),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2113:JoinRel.NONE.ANY([]).[](left=rel#2119:Subset#0.NONE.ANY([]).[],right=rel#2084:Subset#8.NONE.ANY([]).[],condition=true,joinType=inner)Error:
exception while executing query: Failure while trying to get next result batch. (state=,code=0),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 	rel#2106:Subset#5.LOGICAL.ANY([]).[], best=null, importance=0.81
> 		rel#2107:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2061:Subset#5.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2116:DrillJoinRel.LOGICAL.ANY([]).[](left=rel#2092:Subset#0.LOGICAL.ANY([]).[],right=rel#2103:Subset#8.LOGICAL.ANY([]).[],condition=true,joinType=inner),
rowcount=463.0, cumulative cost={inf}
> 		rel#2124:DrillProjectRel.LOGICAL.ANY([]).[](child=rel#2099:Subset#2.LOGICAL.ANY([]).[],*=$0,position_id=$1,user_port=$3),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> Set#6, type: RecordType(ANY *, ANY position_id, INTEGER user_port)
> 	rel#2063:Subset#6.NONE.ANY([]).[], best=null, importance=0.81
> 		rel#2062:FilterRel.NONE.ANY([]).[](child=rel#2061:Subset#5.NONE.ANY([]).[],condition=<>(CAST($1):INTEGER,
$2)), rowcount=8.988465674311579E307, cumulative cost={inf}
> 		rel#2073:ProjectRel.NONE.ANY([]).[](child=rel#2053:Subset#3.NONE.ANY([]).[],*=$0,position_id=$1,user_port=$3),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2085:JoinRel.NONE.ANY([]).[](left=rel#2119:Subset#0.NONE.ANY([]).[],right=rel#2084:Subset#8.NONE.ANY([]).[],condition=<>(CAST($1):INTEGER,
$2),joinType=inner), rowcount=1.7976931348623157E308, cumulative cost={inf}
> 	rel#2065:Subset#6.LOGICAL.ANY([]).[], best=null, importance=0.9
> 		rel#2066:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2063:Subset#6.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2105:DrillProjectRel.LOGICAL.ANY([]).[](child=rel#2068:Subset#3.LOGICAL.ANY([]).[],*=$0,position_id=$1,user_port=$3),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2108:DrillFilterRel.LOGICAL.ANY([]).[](child=rel#2106:Subset#5.LOGICAL.ANY([]).[],condition=<>(CAST($1):INTEGER,
$2)), rowcount=8.988465674311579E307, cumulative cost={inf}
> Set#8, type: RecordType(INTEGER user_port)
> 	rel#2084:Subset#8.NONE.ANY([]).[], best=null, importance=0.6561
> 		rel#2081:ProjectRel.NONE.ANY([]).[](child=rel#2049:Subset#1.ENUMERABLE.ANY([]).[],user_port=$1),
rowcount=100.0, cumulative cost={inf}
> 	rel#2103:Subset#8.LOGICAL.ANY([]).[], best=rel#2117, importance=0.7290000000000001
> 		rel#2104:AbstractConverter.LOGICAL.ANY([]).[](child=rel#2084:Subset#8.NONE.ANY([]).[],convention=LOGICAL,DrillDistributionTraitDef=ANY([]),sort=[]),
rowcount=1.7976931348623157E308, cumulative cost={inf}
> 		rel#2117:DrillProjectRel.LOGICAL.ANY([]).[](child=rel#2095:Subset#1.LOGICAL.ANY([]).[],user_port=$1),
rowcount=20.0, cumulative cost={40.0 rows, 84.0 cpu, 0.0 io, 0.0 network}
>  [05104eb3-8320-47c0-b9d2-710840e44a29]
> {code}
> Here are 2 suggestions/asks:
> 1. Do we support non-equal join?  if yes , we should fix this issue.
> 2. If we do not support, the error message should be more clear and simpler to tell something
like "non-equal join is not supported in current version...."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message