drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5489) Unprotected array access in RepeatedVarCharOutput ctor
Date Tue, 09 May 2017 01:04:04 GMT
Paul Rogers created DRILL-5489:
----------------------------------

             Summary: Unprotected array access in RepeatedVarCharOutput ctor
                 Key: DRILL-5489
                 URL: https://issues.apache.org/jira/browse/DRILL-5489
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.10.0
            Reporter: Paul Rogers
            Priority: Minor


Suppose a user runs a query of form:
{code}
SELECT columns[70000] FROM `dfs`.`mycsv.csv`
{code}

Internally, this will create a {{PathSegment}} to represent the selected column. This is passed
into the {{RepeatedVarCharOutput}} constructor where it is used to set a flag in an array
of 64K booleans. But, while the code is very diligent of making sure that the column name
is "columns" and that the path segment is an array, it does not check the array value. Instead:

{code}
        for(Integer i : columnIds){
          ...
          fields[i] = true;
        }
{code}

We need to add a bounds check to reject array indexes that are not valid: negative or above
64K.

While we are at it, we might as well fix another bit of bogus code:

{code}
        for(Integer i : columnIds){
          maxField = 0;
          maxField = Math.max(maxField, i);
          ...
        }
{code}

The code to compute maxField simply uses the last value, not the maximum value. This will
be thrown off by a query of the form:

{code}
SELECT columns[20], columns[1] FROM ...
{code}

It may be that the code further up the hierarchy does the checks. But, if so, it should do
the other checks as well. Leaving the checks incomplete is confusing.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message