drill-dev mailing list archives

From "Boaz Ben-Zvi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-4961) Schema change error due to a missing column in a Json file
Date Tue, 25 Oct 2016 02:18:58 GMT
Boaz Ben-Zvi created DRILL-4961:
-----------------------------------

             Summary: Schema change error due to a missing column in a Json file
                 Key: DRILL-4961
                 URL: https://issues.apache.org/jira/browse/DRILL-4961
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.8.0
            Reporter: Boaz Ben-Zvi


A column that is missing from a batch defaults to a hard-coded nullable INT (see, e.g., line 128
in ExpressionTreeMaterializer.java). This causes a schema conflict when the same column has a
different type (e.g., VARCHAR) in another batch.
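
The conflict described above can be illustrated with a small toy model. This is only a sketch in Python and is not Drill code: the names `infer_schema`, `conflicting_columns` and `NULLABLE_INT`, the sample records, and the simplified type inference are all assumptions made for illustration. The only behavior taken from the report is that a column absent from a batch is typed as nullable INT.

```python
from typing import Dict, List, Tuple

# Toy stand-in for a (type, mode) pair; hypothetical, not Drill's API.
ColumnType = Tuple[str, str]
NULLABLE_INT: ColumnType = ("INT", "OPTIONAL")  # default described in the report


def infer_schema(records: List[dict], projected: List[str]) -> Dict[str, ColumnType]:
    """Infer a per-batch schema; columns never seen in the batch get nullable INT."""
    schema: Dict[str, ColumnType] = {}
    for column in projected:
        values = [r[column] for r in records if column in r]
        if not values:
            # The column is missing from every record of this batch.
            schema[column] = NULLABLE_INT
        elif all(isinstance(v, str) for v in values):
            schema[column] = ("VARCHAR", "OPTIONAL")
        else:
            # Simplification: this sketch only distinguishes strings.
            schema[column] = ("UNKNOWN", "OPTIONAL")
    return schema


def conflicting_columns(a: Dict[str, ColumnType], b: Dict[str, ColumnType]) -> List[str]:
    """Return the columns whose types differ between two batch schemas."""
    return [c for c in a if c in b and a[c] != b[c]]


# Hypothetical sample data: the second batch has no "last_name" at all.
batch_full = [{"first_name": "Steve", "last_name": "Eurich"}]
batch_missing = [{"first_name": "Mary"}]
projected = ["first_name", "last_name"]

schema_full = infer_schema(batch_full, projected)
schema_missing = infer_schema(batch_missing, projected)
conflicts = conflicting_columns(schema_full, schema_missing)
print(conflicts)  # ['last_name']
```

In this model the two batches agree on first_name but disagree on last_name (VARCHAR in one, INT in the other), which is the kind of mismatch a downstream operator merging the two batches would have to reject.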

To reproduce (the same test also led to DRILL-4960, which may be related): run a parallel
aggregation over two small JSON files (e.g., two copies of contrib/storage-mongo/src/test/resources/emp.json)
where one of the files has an entire column removed (e.g., "last_name").
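
One way to prepare such a pair of files is sketched below. It is an assumption-laden helper, not part of the original test: the sample records, the file names emp1.json and emp2.json, and the temporary output directory are hypothetical, and the records are generated rather than copied from emp.json. The resulting directory would still need to be placed where the Drill workspace used by the query can see it.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample records standing in for the contents of emp.json.
records = [
    {"employee_id": 1, "first_name": "Steve", "last_name": "Eurich"},
    {"employee_id": 2, "first_name": "Mary", "last_name": "Pierson"},
]


def write_json_records(path: Path, rows: list) -> None:
    """Write one JSON object per line."""
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def make_test_dir(base: Path, dropped_column: str = "last_name") -> Path:
    """Create an 'emp' directory with two files; the second lacks one column."""
    emp_dir = base / "emp"
    emp_dir.mkdir(parents=True, exist_ok=True)
    write_json_records(emp_dir / "emp1.json", records)
    stripped = [{k: v for k, v in r.items() if k != dropped_column} for r in records]
    write_json_records(emp_dir / "emp2.json", stripped)
    return emp_dir


emp_dir = make_test_dir(Path(tempfile.mkdtemp()))
print(emp_dir)
```

Setting planner.slice_target to 1, as in the session below, is what makes the query run in parallel even over such small inputs.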

0: jdbc:drill:zk=local> alter session set planner.slice_target = 1;
+-------+--------------------------------+
|  ok   |            summary             |
+-------+--------------------------------+
| true  | planner.slice_target updated.  |
+-------+--------------------------------+
1 row selected (0.091 seconds)
0: jdbc:drill:zk=local> select first_name, last_name from `drill/data/emp` group by first_name, last_name;
Error: SYSTEM ERROR: SchemaChangeException: Incoming batches for merging receiver have different schemas!

Fragment 1:0

[Error Id: 1315ddc5-5c31-404f-917b-c7a082d016cf on 10.250.57.63:31010] (state=,code=0)

The query above used a streaming aggregation. With hash aggregation, the same problem
manifests differently:

0: jdbc:drill:zk=local> alter session set `planner.enable_streamagg` = false;
+-------+------------------------------------+
|  ok   |              summary               |
+-------+------------------------------------+
| true  | planner.enable_streamagg updated.  |
+-------+------------------------------------+
1 row selected (0.083 seconds)
0: jdbc:drill:zk=local> select first_name, last_name from `drill/data/emp` group by first_name, last_name;
Error: SYSTEM ERROR: IllegalStateException: Failure while reading vector.  Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.drill.exec.vector.NullableVarCharVector, field= last_name(VARCHAR:OPTIONAL)[$bits$(UINT1:REQUIRED), last_name(VARCHAR:OPTIONAL)[$offsets$(UINT4:REQUIRED)]]


Fragment 2:0

[Error Id: 58daaaa0-3bfe-4197-b4bd-44f9d7604d77 on 10.250.57.63:31010] (state=,code=0)
 


