Date: Fri, 27 Feb 2015 03:00:13 +0000 (UTC)
From: "Aman Sinha (JIRA)"
To: dev@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Created] (DRILL-2334) Text record reader should fail gracefully when encountering bad records

Aman Sinha created DRILL-2334:
---------------------------------

             Summary: Text record reader should fail gracefully when encountering bad records
                 Key: DRILL-2334
                 URL: https://issues.apache.org/jira/browse/DRILL-2334
             Project: Apache Drill
          Issue Type: Improvement
          Components: Storage - Text & CSV
            Reporter: Aman Sinha
            Assignee: Hanifi Gunes

The attached file has 1 bad record. Running a simple count(*) query on this file fails with an IndexOutOfBoundsException (IOBE) and/or a possible schema change exception.
The hex dump of the file shows a long run of zero bytes (the '*' below stands for further lines of zeros):
{code}
00001c0 3a 35 35 2e 35 30 35 35 30 00 00 00 00 00 00 00
00001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
02a01c0 00 00 00 00 00 00 00 00 00 35 35 35 0a 35 35 35
{code}

{code}
0: jdbc:drill:zk=local> select count(*) from `badRecords2.dat`;
+------------+
|   EXPR$0   |
+------------+
Query failed: RemoteRpcException: Failure while running fragment., You tried to do a batch data read operation when you were in a state of STOP. You can only do this type of operation when you are in a state of OK or OK_NEW_SCHEMA.
{code}

The log file also shows an IOBE related to this:
{code}
18:49:00.003 [2b1024e4-5639-b4ec-392e-8d5879c3d4db:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - Failed to read the batch. Stopping...
java.lang.IndexOutOfBoundsException: index: 374, length: 2752540 (expected: range(0, 65536))
	at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1143) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
	at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:272) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
	at io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:390) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
	at io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:25) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
	at io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:651) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
	at org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:481) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
	at org.apache.drill.exec.vector.RepeatedVarCharVector$Mutator.addSafe(RepeatedVarCharVector.java:451) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
	at org.apache.drill.exec.store.text.DrillTextRecordReader.next(DrillTextRecordReader.java:172) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
	at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:165) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
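
To make the request for a graceful failure concrete, here is a minimal sketch of what such a check could look like. It is not Drill's actual reader code: the class `BadRecordCheck`, the exception `BadRecordException`, the method `copyField`, and the record number passed in `main` are all hypothetical names and values chosen for illustration. The sketch validates a field's length against the buffer capacity before writing, so an oversized field produces a message that names the record instead of surfacing as a raw IndexOutOfBoundsException. It also works out the length of the zero run from the two hex dump rows, assuming the offsets in the dump are hexadecimal.

```java
// Hypothetical sketch: none of these names come from Drill's code base.
class BadRecordCheck {

    /** Hypothetical exception that identifies the offending record. */
    static class BadRecordException extends RuntimeException {
        BadRecordException(String message) {
            super(message);
        }
    }

    /**
     * Copies one field into a fixed-capacity buffer at the given offset.
     * Oversized fields are rejected with a descriptive error before any write.
     * Returns the offset just past the copied bytes.
     */
    static int copyField(byte[] field, byte[] buffer, int offset, long recordNumber) {
        // Widen to long so that offset + length cannot overflow int.
        long end = (long) offset + field.length;
        if (offset < 0 || end > buffer.length) {
            throw new BadRecordException(
                "Record " + recordNumber + ": field of " + field.length
                + " bytes at offset " + offset
                + " does not fit in a buffer of " + buffer.length + " bytes");
        }
        System.arraycopy(field, 0, buffer, offset, field.length);
        return (int) end;
    }

    /** Length of the zero run implied by the two hex dump rows in the report. */
    static long zeroRunLength() {
        // Assumption: the dump offsets are hexadecimal.
        long firstZero = 0x00001c0L + 9;     // 9 non-zero bytes precede the zeros on that row
        long firstNonZero = 0x02a01c0L + 9;  // 9 zero bytes precede "35 35 35" on that row
        return firstNonZero - firstZero;
    }

    public static void main(String[] args) {
        System.out.println("zero run length: " + zeroRunLength());
        byte[] buffer = new byte[65536];
        try {
            // Index, length, and capacity are taken from the logged exception;
            // the record number 1 is a placeholder.
            copyField(new byte[2752540], buffer, 374, 1);
        } catch (BadRecordException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under the hexadecimal-offset assumption the zero run is 0x2a0000 = 2,752,512 bytes, which is 28 bytes less than the length of 2752540 reported in the exception. That is consistent with the zeros having been read as part of one oversized field, although the report itself does not confirm this.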