drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5929) Misleading error for text file with blank line delimiter
Date Fri, 03 Nov 2017 22:08:00 GMT
Paul Rogers created DRILL-5929:
----------------------------------

             Summary: Misleading error for text file with blank line delimiter
                 Key: DRILL-5929
                 URL: https://issues.apache.org/jira/browse/DRILL-5929
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.11.0
            Reporter: Paul Rogers
            Priority: Minor


Consider the following functional test query:

{code}
select * from table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\'))
{code}

For some reason (yet to be determined), when running this from Java, the line delimiter ended
up empty. This causes the following line to fail with an {{ArrayIndexOutOfBoundsException}}:

{code}
class TextInput ...
  public final byte nextChar() throws IOException {
    if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0
{code}
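
The failure mode can be reproduced outside Drill. The following is a minimal sketch, not Drill code: {{LineDelimiterCheck}}, {{requireNonEmptyDelimiter}} and {{startsDelimiter}} are hypothetical names used only for illustration. It shows that indexing element 0 of an empty array throws the exception above, and how validating the delimiter before parsing would surface a more accurate message.

```java
import java.nio.charset.StandardCharsets;

final class LineDelimiterCheck {

  // Hypothetical helper, not part of Drill: rejects an empty delimiter
  // before any parsing starts, so the failure names the real problem.
  static byte[] requireNonEmptyDelimiter(String lineDelimiter) {
    if (lineDelimiter == null || lineDelimiter.isEmpty()) {
      throw new IllegalArgumentException(
          "lineDelimiter must contain at least one character");
    }
    return lineDelimiter.getBytes(StandardCharsets.UTF_8);
  }

  // Mirrors the comparison in the snippet above, with the array passed in.
  static boolean startsDelimiter(byte byteChar, byte[] lineSeparator) {
    return byteChar == lineSeparator[0];
  }

  public static void main(String[] args) {
    try {
      // An empty separator array fails on the first comparison.
      startsDelimiter((byte) '\n', new byte[0]);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("Unchecked: " + e.getClass().getSimpleName());
    }
    try {
      requireNonEmptyDelimiter("");
    } catch (IllegalArgumentException e) {
      System.out.println("Validated: " + e.getMessage());
    }
  }
}
```

Where such a check would belong in Drill (option parsing, reader setup, or elsewhere) is left open here.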

We then translate the exception:

{code}
class TextReader ...
  public final boolean parseNext() throws IOException {
...
    } catch (Exception ex) {
      try {
        throw handleException(ex);
...
  private TextParsingException handleException(Exception ex) throws IOException {
...
    if (ex instanceof ArrayIndexOutOfBoundsException) {
      // Not clear this exception is still thrown...

      ex = UserException
          .dataReadError(ex)
          .message(
              "Drill failed to read your text file.  Drill supports up to %d columns in a text file.  Your file appears to have more than that.",
              MAXIMUM_NUMBER_COLUMNS)
          .build(logger);
    }
{code}

That is, due to a missing delimiter, we get an index out of bounds exception, which we translate
to an error about having too many fields. But, the file itself has only a handful of fields.
Thus, the error is completely wrong.

Then, we compound the error:

{code}
  private TextParsingException handleException(Exception ex) throws IOException {
...
    throw new TextParsingException(context, message, ex);

class CompliantTextReader ...
  public boolean next() {
...
    } catch (IOException | TextParsingException e) {
      throw UserException.dataReadError(e)
          .addContext("Failure while reading file %s. Happened at or shortly before byte position %d.",
            split.getPath(), reader.getPos())
          .build(logger);
{code}

That is, our {{ArrayIndexOutOfBoundsException}} became a user exception, which became a text
parsing exception, which became a data read error.

But, this is not a data read error. It is an error in Drill's own validation logic. It is not
clear that we should be wrapping user exceptions in other errors that we then wrap in other
user exceptions.
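
One possible way to avoid the nested wrapping is to check the cause chain for an existing user-facing exception before creating a new one. The following is a rough sketch under stated assumptions, not a proposed patch: {{UserFacingException}} and {{ParsingException}} are hypothetical stand-ins for Drill's {{UserException}} and {{TextParsingException}}, and {{findUserFacing}} and {{toUserFacing}} are invented helper names.

```java
final class ExceptionWrappingSketch {

  // Hypothetical stand-in for a user-facing exception; not Drill's UserException.
  static final class UserFacingException extends RuntimeException {
    UserFacingException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Hypothetical stand-in for an intermediate parser exception.
  static final class ParsingException extends RuntimeException {
    ParsingException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Returns the first user-facing exception in the cause chain, or null.
  static UserFacingException findUserFacing(Throwable t) {
    int depth = 0;
    while (t != null && depth++ < 32) { // bound the walk in case of a cyclic chain
      if (t instanceof UserFacingException) {
        return (UserFacingException) t;
      }
      t = t.getCause();
    }
    return null;
  }

  // Wraps only when no user-facing exception is already present in the chain.
  static UserFacingException toUserFacing(Throwable t, String fallbackMessage) {
    UserFacingException existing = findUserFacing(t);
    return existing != null ? existing : new UserFacingException(fallbackMessage, t);
  }

  public static void main(String[] args) {
    // Rebuild a chain shaped like the one described above.
    Throwable root = new ArrayIndexOutOfBoundsException("Index 0 out of bounds for length 0");
    UserFacingException inner = new UserFacingException("Line delimiter is empty", root);
    ParsingException middle = new ParsingException("Error parsing text", inner);

    UserFacingException result = toUserFacing(middle, "Failure while reading file");
    System.out.println(result.getMessage());
    System.out.println(result == inner);
  }
}
```

This sketch drops the outer layer's context (file path and byte position). A real fix would presumably need to attach that context to the existing exception rather than discard it.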





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
