drill-dev mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-2334) Text record reader should fail gracefully when encountering bad records
Date Fri, 27 Feb 2015 03:00:13 GMT
Aman Sinha created DRILL-2334:
---------------------------------

             Summary: Text record reader should fail gracefully when encountering bad records
                 Key: DRILL-2334
                 URL: https://issues.apache.org/jira/browse/DRILL-2334
             Project: Apache Drill
          Issue Type: Improvement
          Components: Storage - Text & CSV
            Reporter: Aman Sinha
            Assignee: Hanifi Gunes


The attached file contains one bad record. Running a simple count(*) query on this file
fails with an IndexOutOfBoundsException (IOBE) and/or a possible schema change exception.

The hex dump of the file shows a long run of zero bytes (the '*' below stands for the repeated lines of zeros):
{code}
00001c0 3a 35 35 2e 35 30 35 35 30 00 00 00 00 00 00 00
00001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
02a01c0 00 00 00 00 00 00 00 00 00 35 35 35 0a 35 35 35
{code}
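
The zero-filled span in the dump above is the kind of input that could be caught before it reaches the reader. The following is a minimal sketch, not Drill code: `BadRecordScanner` and `firstBadRecord` are hypothetical names, and the sketch assumes records are newline-delimited and that a NUL byte or an over-long record marks a record as bad.

```java
/** Hypothetical pre-check, not part of Drill: flags records that look corrupt. */
final class BadRecordScanner {

    /**
     * Returns the 1-based number of the first newline-delimited record that
     * contains a NUL byte or is longer than maxRecordBytes, or -1 if no
     * record looks suspicious.
     */
    static long firstBadRecord(byte[] data, int maxRecordBytes) {
        long recordNo = 1;
        long length = 0;
        for (byte b : data) {
            if (b == '\n') {
                // End of record: start counting the next one.
                recordNo++;
                length = 0;
                continue;
            }
            length++;
            if (b == 0 || length > maxRecordBytes) {
                return recordNo;
            }
        }
        return -1;
    }
}
```

A caller might pass in the result of `java.nio.file.Files.readAllBytes(path)` with a limit of 65536, the buffer range reported in the exception; a streaming variant would be preferable for large files.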

{code}
0: jdbc:drill:zk=local> select count(*) from `badRecords2.dat`;
+------------+
|   EXPR$0   |
+------------+
Query failed: RemoteRpcException: Failure while running fragment., You tried to do a batch data read operation when you were in a state of STOP.  You can only do this type of operation when you are in a state of OK or OK_NEW_SCHEMA.
{code}

The log file also shows a related IOBE:

{code}
18:49:00.003 [2b1024e4-5639-b4ec-392e-8d5879c3d4db:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - Failed to read the batch. Stopping...
java.lang.IndexOutOfBoundsException: index: 374, length: 2752540 (expected: range(0, 65536))
        at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1143) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
        at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:272) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
        at io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:390) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
        at io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:25) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
        at io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:651) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
        at org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:481) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at org.apache.drill.exec.vector.RepeatedVarCharVector$Mutator.addSafe(RepeatedVarCharVector.java:451) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at org.apache.drill.exec.store.text.DrillTextRecordReader.next(DrillTextRecordReader.java:172) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:165) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
{code}
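
As an illustration of what "fail gracefully" could mean for the trace above, the sketch below checks a field's length against the remaining buffer capacity before copying and reports the offending record instead of surfacing a raw IndexOutOfBoundsException. It is a simplified stand-in under stated assumptions, not Drill's vector or reader code: `FieldBuffer`, `BadRecordException`, and the record-number parameter are all hypothetical.

```java
/** Hypothetical exception type used to report a bad input record. */
final class BadRecordException extends RuntimeException {
    BadRecordException(String message) {
        super(message);
    }
}

/** Hypothetical fixed-capacity buffer standing in for a value vector. */
final class FieldBuffer {
    private final byte[] data;
    private int used;

    FieldBuffer(int capacity) {
        this.data = new byte[capacity];
    }

    /** Number of bytes written so far. */
    int used() {
        return used;
    }

    /**
     * Appends one field, or throws a descriptive BadRecordException
     * (leaving the buffer unchanged) if the field does not fit.
     */
    void append(byte[] field, long recordNo) {
        if (field.length > data.length - used) {
            throw new BadRecordException(String.format(
                "record %d: field of %d bytes at offset %d exceeds buffer capacity %d",
                recordNo, field.length, used, data.length));
        }
        System.arraycopy(field, 0, data, used, field.length);
        used += field.length;
    }
}
```

With a 65536-byte buffer already holding 374 bytes, appending a 2752540-byte field (the numbers from the log) would raise a `BadRecordException` naming the record, which a caller could then turn into a user-facing error or a skipped-record count.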
