drill-issues mailing list archives

From "Jason Altekruse (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (DRILL-2745) Query returns IOB Exception when JSON data with empty arrays is input to flatten function
Date Mon, 06 Jul 2015 23:18:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615860#comment-14615860 ]

Jason Altekruse edited comment on DRILL-2745 at 7/6/15 11:17 PM:
-----------------------------------------------------------------

This is currently expected behavior. As the error message reports, Drill doesn't support lists
of different types. This also applies to nested lists; the error is being produced because
of this column in the record. I have clipped the list, but it still illustrates the issue.
We cannot have different scalar types in the individual nested lists; in this case you have
one list with numbers and the next with strings.
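
As a rough illustration of the restriction (plain Python, not Drill code; `scalar_kinds` is a
hypothetical helper and the record is shortened further), the sketch below collects the scalar
types found anywhere in the nested lists. More than one kind is the situation described above.

```python
import json

# Shortened version of the record in this issue:
# one inner list of numbers, the next of strings.
record = json.loads('{"outkey": [[1000000, 10000000, -1], ["a", "b", "c"]]}')

def scalar_kinds(value):
    """Return the set of scalar type names found in a (possibly nested) list."""
    if isinstance(value, list):
        kinds = set()
        for item in value:
            kinds |= scalar_kinds(item)
        return kinds
    return {type(value).__name__}

kinds = scalar_kinds(record["outkey"])
print(sorted(kinds))   # ['int', 'str']
mixed = len(kinds) > 1  # True: the nested lists do not share one scalar type
```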

This input also includes a null in a list, which is not supported in Drill. All of these problems
should be able to be worked around if you turn on `store.json.all_text_mode` for the JSON
reader. In all_text_mode every scalar is read as a string, so the nested lists no longer mix
types, and null values in lists are turned into a string containing the word "null". If there
is still an issue after turning on all_text_mode, please file a new JIRA, as these issues are
unrelated to flatten.
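
The option is normally enabled per session with
ALTER SESSION SET `store.json.all_text_mode` = true; before running the query. The effect on the
data can be modelled roughly as below. This is a Python sketch of the idea, not Drill's reader;
`to_all_text` is a hypothetical name, and how non-string scalars are spelled is an assumption.

```python
import json

def to_all_text(value):
    """Rough model of all_text_mode: every scalar becomes a string."""
    if isinstance(value, list):
        return [to_all_text(item) for item in value]
    if isinstance(value, dict):
        return {key: to_all_text(item) for key, item in value.items()}
    if value is None:
        return "null"  # a null inside a list becomes the word "null"
    # Assumption: non-string scalars keep their JSON spelling.
    return value if isinstance(value, str) else json.dumps(value)

print(to_all_text([[1000000, -1], ["a", "b"], [None], []]))
# [['1000000', '-1'], ['a', 'b'], ['null'], []]
```

After the conversion every inner list holds only strings, which matches the shape of the
`outkey` output quoted in the issue description.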

{code}
{ "outkey":[[1000000,10000000,2000000,999999,1,0,-1,100000],["a","b","c","d","e","p","o","f","m","q","d","s","v"]]
}
{code}




was (Author: jaltekruse):
This is currently expected behavior, as the error message reports, Drill doesn't support lists
of different types. This also applies to nested lists, the error is being produced because
of this column in the record. I have clipped the list but it still illustrates the issue.
We cannot have different scalar types in the individual nested lists, in this case you have
one list with numbers and the next with strings.

This input also includes a null in a list which is not supported in Drill. All of these problems
should be able to be worked around if you turn on `store.json.all_text_mode` for the json
reader. This is because null values in lists will be turned into a string containing the word
"null" when in all_text_mode. If there is still an issue after turning on all_text_mode please
file a new JIRA as these issues are unrelated to flatten.

{code}
{ "outkey":[[1000000,10000000,2000000,999999,1,0,-1,100000],["a","b","c","d","e","p","o","f","m","q","d","s","v"]]
{code}



> Query returns IOB Exception when JSON data with empty arrays is input to flatten function
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-2745
>                 URL: https://issues.apache.org/jira/browse/DRILL-2745
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 0.9.0
>         Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early
from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT 
>            Reporter: Khurram Faraaz
>            Assignee: Jason Altekruse
>             Fix For: 1.2.0
>
>
> An IOB exception is returned when a JSON file that has many empty arrays and arrays with
> different types of data is passed to the flatten function.
> Tested on 4 node cluster on CentOS
> {code}
> 0: jdbc:drill:> select flatten(outkey) from `nestedJArry.json` ;
> Query failed: RemoteRpcException: Failure while running fragment., index: 176, length:
4 (expected: range(0, 176)) [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010
]
> [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> 0: jdbc:drill:> select outkey from `nestedJArry.json`;
> +------------+
> |   outkey   |
> +------------+
> | [["1000000","10000000","2000000","999999","1","0","-1","100000"],["a","b","c","d","e","p","o","f","m","q","d","s","v"],["2012-04-01","1998-02-20","2011-08-05","1992-01-01"],["10:30:29.123","12:29:21.999"],["sdfklgjsdlkjfghlsidhfgopiuesrtoipuertoiurtyoiurotuiydkfjlbn,bfn;waokefpqowertoipuwergklnjdfbpdsiofgoigiuewqrqiugkjehgjksdhbvkjshdfkjsdfbnlkfbkljrghljrelkhbdlkfjbgkdfjbgkndfbnkldfgklbhjdflkghjlnkoiurty984756897345609782-3458745uiyoheirluht7895e6y"],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],["null"],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],["test
string","hello world!","just do it!","houston we have a problem"],["1","2","3","4","5","6","7","8","9","0"]]
|
> +------------+
> 1 row selected (0.088 seconds)
> Stack trace from drillbit.log
> 2015-04-09 23:54:41,965 [2ad8eebd-adb6-6f7e-469e-4bb8ca276984:frag:0:0] WARN  o.a.d.e.w.fragment.FragmentExecutor
- Error while initializing or executing fragment
> java.lang.IndexOutOfBoundsException: index: 176, length: 4 (expected: range(0, 176))
>         at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:187) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.chk(DrillBuf.java:209) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at io.netty.buffer.DrillBuf.setInt(DrillBuf.java:513) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
>         at org.apache.drill.exec.vector.UInt4Vector$Mutator.set(UInt4Vector.java:363)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.RepeatedVarCharVector.splitAndTransferTo(RepeatedVarCharVector.java:173)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.vector.RepeatedVarCharVector$TransferImpl.splitAndTransfer(RepeatedVarCharVector.java:200)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.test.generated.FlattenerGen1107.flattenRecords(FlattenTemplate.java:106)
~[na:na]
>         at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork(FlattenRecordBatch.java:156)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:68)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:96)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:58)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:163)
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
[drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
