drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Khurram Faraaz (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-2796) Select keys from JSON file where <column-name> not in null results in RelOptPlanner.CannotPlanException
Date Tue, 14 Apr 2015 23:42:58 GMT
Khurram Faraaz created DRILL-2796:
-------------------------------------

             Summary: Select keys from JSON file where <column-name> not in null results
in RelOptPlanner.CannotPlanException
                 Key: DRILL-2796
                 URL: https://issues.apache.org/jira/browse/DRILL-2796
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 0.9.0
         Environment:  9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from
HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT
            Reporter: Khurram Faraaz
            Assignee: Jinfeng Ni


Query that has, <column-name> not in (null) in its predicate, results in RelOptPlanner.CannotPlanException.
Tests were run on 4 node cluster and on CentOS.
Data is being selected from a JSON data file.

{code}

0: jdbc:drill:> select * from `mKeyJSN.json`;
+------------+------------+------------+------------+------------+------------+------------+------------+
|    key1    |    key2    |    key3    |    key4    |    key5    |    key6    |    key7  
 |    key8    |
+------------+------------+------------+------------+------------+------------+------------+------------+
| 1234       | null       | null       | null       | null       | null       | null     
 | null       |
| null       | 1245685    | null       | null       | null       | null       | null     
 | null       |
| null       | null       | hello world! | null       | null       | null       | null   
   | null       |
| null       | null       | null       | true       | null       | null       | null     
 | null       |
| null       | null       | null       | null       | 2000-03-10 | null       | null     
 | null       |
| null       | null       | null       | null       | null       | 2012-01-21 15:19:12.123
| null       | null       |
| null       | null       | null       | null       | null       | null       | 21:34:32.321
| null       |
| null       | null       | null       | null       | null       | null       | null     
 | 9789.99    |
+------------+------------+------------+------------+------------+------------+------------+------------+
8 rows selected (0.1 seconds)

store.json.all_text_mode was set to false

0: jdbc:drill:> select * from sys.options where name like '%json.all_text_mode%';
+------------+------------+------------+------------+------------+------------+------------+
|    name    |    kind    |    type    |  num_val   | string_val |  bool_val  | float_val
 |
+------------+------------+------------+------------+------------+------------+------------+
| store.json.all_text_mode | BOOLEAN    | SYSTEM     | null       | null       | false   
  | null       |
+------------+------------+------------+------------+------------+------------+------------+
1 row selected (0.135 seconds)

Failing query

0: jdbc:drill:> select key1,key2,key3,key4,key5,key6,key7,key8 from `mKeyJSN.json` where
key5 not in (null);
Query failed: RelOptPlanner.CannotPlanException: Node [rel#14924:Subset#7.LOGICAL.ANY([]).[]]
could not be implemented; planner state:

Root: rel#14924:Subset#7.LOGICAL.ANY([]).[]
Original rel:
AbstractConverter(subset=[rel#14924:Subset#7.LOGICAL.ANY([]).[]], convention=[LOGICAL], DrillDistributionTraitDef=[ANY([])],
sort=[[]]): rowcount = 1.7976931348623157E308, cumulative cost = {inf}, id = 14925
  ProjectRel(subset=[rel#14923:Subset#7.NONE.ANY([]).[]], key1=[$2], key2=[$3], key3=[$4],
key4=[$5], key5=[$1], key6=[$6], key7=[$7], key8=[$8]): rowcount = 1.7976931348623157E308,
cumulative cost = {1.7976931348623157E308 rows, Infinity cpu, 0.0 io, 0.0 network, 0.0 memory},
id = 14922
    FilterRel(subset=[rel#14921:Subset#6.NONE.ANY([]).[]], condition=[AND(NOT(IS TRUE($11)),
IS NOT NULL($9))]): rowcount = 4.0448095534402104E307, cumulative cost = {4.0448095534402104E307
rows, 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14920
      JoinRel(subset=[rel#14919:Subset#5.NONE.ANY([]).[]], condition=[=($9, $10)], joinType=[left]):
rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, 0.0 cpu,
0.0 io, 0.0 network, 0.0 memory}, id = 14918
        ProjectRel(subset=[rel#14912:Subset#1.NONE.ANY([]).[]], $f0=[$0], $f1=[$1], $f2=[$2],
$f3=[$3], $f4=[$4], $f5=[$5], $f6=[$6], $f7=[$7], $f8=[$8], $f9=[$1]): rowcount = 100.0, cumulative
cost = {100.0 rows, 1000.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14911
          EnumerableTableAccessRel(subset=[rel#14910:Subset#0.ENUMERABLE.ANY([]).[]], table=[[dfs,
tmp, mKeyJSN.json]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io,
0.0 network, 0.0 memory}, id = 14888
        AggregateRel(subset=[rel#14917:Subset#4.NONE.ANY([]).[]], group=[{0}], agg#0=[MIN($1)]):
rowcount = 1.7976931348623158E307, cumulative cost = {1.7976931348623158E307 rows, 0.0 cpu,
0.0 io, 0.0 network, 0.0 memory}, id = 14916
          ProjectRel(subset=[rel#14915:Subset#3.NONE.ANY([]).[]], $f0=[$0], $f1=[true]): rowcount
= 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, Infinity cpu, 0.0
io, 0.0 network, 0.0 memory}, id = 14914
            ValuesRel(subset=[rel#14913:Subset#2.NONE.ANY([]).[]], tuples=[[{ null }]]): rowcount
= 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14890

content from JSON data file

{"key1":1234}
{"key2":1245685}
{"key3":"hello world!"}
{"key4":true}
{"key5":"2000-03-10"}
{"key6":"2012-01-21 15:19:12.123"}
{"key7":"21:34:32.321"}
{"key8":9789.99}

{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message