Date: Fri, 3 Nov 2017 22:08:00 +0000 (UTC)
From: "Paul Rogers (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Created] (DRILL-5929) Misleading error for text file with blank line delimiter

Paul Rogers created DRILL-5929:
----------------------------------

             Summary: Misleading error for text file with blank line delimiter
                 Key: DRILL-5929
                 URL: https://issues.apache.org/jira/browse/DRILL-5929
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.11.0
            Reporter: Paul Rogers
            Priority: Minor

Consider the following functional test query:

{code}
select * from table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\'))
{code}

For some reason (yet to be determined), when running this from Java, the line delimiter ended up empty. This causes the following line to fail with an {{ArrayIndexOutOfBoundsException}}:

{code}
class TextInput ...

  public final byte nextChar() throws IOException {
    if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0
{code}

We then translate the exception:

{code}
class TextReader ...

  public final boolean parseNext() throws IOException {
    ...
    } catch (Exception ex) {
      try {
        throw handleException(ex);
    ...

  private TextParsingException handleException(Exception ex) throws IOException {
    ...
    if (ex instanceof ArrayIndexOutOfBoundsException) {
      // Not clear this exception is still thrown...
      ex = UserException
          .dataReadError(ex)
          .message(
              "Drill failed to read your text file. Drill supports up to %d columns in a text file. Your file appears to have more than that.",
              MAXIMUM_NUMBER_COLUMNS)
          .build(logger);
    }
{code}

That is, due to a missing delimiter, we get an index out of bounds exception, which we translate to an error about having too many fields. But the file itself has only a handful of fields. Thus, the error is completely wrong.

Then, we compound the error:

{code}
  private TextParsingException handleException(Exception ex) throws IOException {
    ...
    throw new TextParsingException(context, message, ex);

class CompliantTextReader ...

  public boolean next() {
    ...
    } catch (IOException | TextParsingException e) {
      throw UserException.dataReadError(e)
          .addContext("Failure while reading file %s. Happened at or shortly before byte position %d.",
              split.getPath(), reader.getPos())
          .build(logger);
{code}

That is, our {{ArrayIndexOutOfBoundsException}} became a user exception that became a text parsing exception that became a data read error. But this is not a data read error. It is an error in Drill's own validation logic.

It is also not clear that we should be wrapping user exceptions in other errors that we then wrap in other user exceptions.
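
One possible direction, sketched here under assumptions rather than taken from Drill's code: reject an empty line delimiter once, before any bytes are read, so that the comparison against {{lineSeparator[0]}} can never index an empty array and the failure is reported as a configuration problem instead of a "too many columns" data read error. The names {{LineDelimiterCheck}}, {{InvalidDelimiterException}}, {{validatedLineSeparator}} and {{startsLineSeparator}} are hypothetical and exist only for this illustration. A real fix would presumably report the problem through Drill's own {{UserException}} machinery.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; these are not Drill's actual classes. */
class LineDelimiterCheck {

  /** Hypothetical stand-in for a configuration error, kept distinct from a data read error. */
  public static class InvalidDelimiterException extends IllegalArgumentException {
    public InvalidDelimiterException(String message) {
      super(message);
    }
  }

  /** Validates the delimiter up front and returns its bytes; never returns an empty array. */
  public static byte[] validatedLineSeparator(String lineDelimiter) {
    if (lineDelimiter == null || lineDelimiter.isEmpty()) {
      throw new InvalidDelimiterException("lineDelimiter must not be empty");
    }
    return lineDelimiter.getBytes(StandardCharsets.UTF_8);
  }

  /** Mirrors the comparison quoted from TextInput.nextChar(), on a validated separator. */
  public static boolean startsLineSeparator(byte byteChar, byte[] lineSeparator) {
    return byteChar == lineSeparator[0];
  }

  public static void main(String[] args) {
    // A single backslash is accepted and matches itself.
    byte[] sep = validatedLineSeparator("\\");
    System.out.println(startsLineSeparator((byte) '\\', sep));

    // An empty delimiter is rejected before any parsing happens.
    try {
      validatedLineSeparator("");
    } catch (InvalidDelimiterException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a check like this in place, the {{ArrayIndexOutOfBoundsException}} branch in {{handleException()}} would no longer be reached for this case, and the user would see a message that names the delimiter rather than the column count.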