drill-dev mailing list archives

From "Khurram Faraaz (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-4513) Query returns null when star in project and filter over non null values
Date Wed, 16 Mar 2016 07:21:33 GMT
Khurram Faraaz created DRILL-4513:
-------------------------------------

             Summary: Query returns null when star in project and filter over non null values
                 Key: DRILL-4513
                 URL: https://issues.apache.org/jira/browse/DRILL-4513
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - JSON
    Affects Versions: 1.4.0
         Environment: 4 node cluster CentOS
            Reporter: Khurram Faraaz


When a query has a star (*) in the projection and column c1 in the JSON file holds many nulls
followed by a non-null value, the query does not return correct results.

To reproduce, use a JSON file with 4096 records that each have the field
{c1: null}

followed by a 4097th record that has
{c1: "Hello World"}
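
A minimal sketch for generating such a file, for anyone who wants to reproduce this. The file name tooManyNulls.json comes from the queries in this report; the one-object-per-line layout, the quoted "c1" key (needed for valid JSON), and the function names repro_records and write_repro_file are assumptions made for illustration, not details of the original test file.

```python
import json


def repro_records(null_count=4096, value="Hello World"):
    """Yield null_count records with c1 = null, then one record with c1 = value."""
    for _ in range(null_count):
        yield {"c1": None}
    yield {"c1": value}


def write_repro_file(path, null_count=4096, value="Hello World"):
    """Write the records to path, one JSON object per line; return the record count."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in repro_records(null_count, value):
            # json.dumps renders Python None as JSON null.
            handle.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Calling write_repro_file("tooManyNulls.json") should produce 4097 records; the file then has to be copied to whatever location the dfs.tmp workspace points to.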

Drill version
http://yum.qa.lab/drill-opensource.ebf/mapr-drill-1.4.0.201603151147-1.noarch.rpm

{noformat}
0: jdbc:drill:schema=dfs.tmp> select * from sys.version;
+---------+-----------+----------------+-------------+-------------+------------+
| version | commit_id | commit_message | commit_time | build_email | build_time |
+---------+-----------+----------------+-------------+-------------+------------+
| 1.4.0 | 99ec15919cd73f5e7b976f6193af45c0e46ffd7a | DRILL-4479: For empty fields under all_text_mode enabled (a) use varchar for the default columns and (b) ensure we create fields corresponding to all columns. | 14.03.2016 @ 20:52:13 UTC | Unknown | 15.03.2016 @ 18:47:50 UTC |
+---------+-----------+----------------+-------------+-------------+------------+
1 row selected (0.375 seconds)
0: jdbc:drill:schema=dfs.tmp> alter system set `store.json.all_text_mode`=true;
+-------+------------------------------------+
|  ok   |              summary               |
+-------+------------------------------------+
| true  | store.json.all_text_mode updated.  |
+-------+------------------------------------+
1 row selected (0.136 seconds)
0: jdbc:drill:schema=dfs.tmp> select * from `tooManyNulls.json` t where t.c1 = 'Hello World';
Error: SYSTEM ERROR: NumberFormatException: Hello World

Fragment 0:0

[Error Id: f9022f22-12cd-46c7-b471-470a50506ba2 on centos-04.qa.lab:31010] (state=,code=0)
0: jdbc:drill:schema=dfs.tmp> select * from `tooManyNulls.json` t where t.c1 IS NOT NULL;
+-------+
|   *   |
+-------+
| null  |
+-------+
1 row selected (0.239 seconds)
0: jdbc:drill:schema=dfs.tmp> select t.c1 from `tooManyNulls.json` t where t.c1 = 'Hello World';
+--------------+
|      c1      |
+--------------+
| Hello World  |
+--------------+
1 row selected (0.282 seconds)
0: jdbc:drill:schema=dfs.tmp> select t.c1 from `tooManyNulls.json` t where t.c1 IS NOT NULL;
+--------------+
|      c1      |
+--------------+
| Hello World  |
+--------------+
1 row selected (0.259 seconds)
{noformat}

Another observation: the query below returns null when run on its own. However, when the same
query is used as a subquery, the outer query returns the correct result.

{noformat}
0: jdbc:drill:schema=dfs.tmp> select * from `tooManyNulls.json` t where t.c1 IS NOT NULL;
+-------+
|   *   |
+-------+
| null  |
+-------+
1 row selected (0.203 seconds)
0: jdbc:drill:schema=dfs.tmp> select tmp.c1 from (select * from `tooManyNulls.json` t where t.c1 IS NOT NULL) tmp;
+--------------+
|      c1      |
+--------------+
| Hello World  |
+--------------+
1 row selected (0.25 seconds)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
