drill-dev mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5202) Planner misses opportunity to link sort & filter without remover
Date Wed, 18 Jan 2017 16:58:26 GMT
Paul Rogers created DRILL-5202:
----------------------------------

             Summary: Planner misses opportunity to link sort & filter without remover
                 Key: DRILL-5202
                 URL: https://issues.apache.org/jira/browse/DRILL-5202
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Paul Rogers
            Priority: Minor


Consider the following query:

{code}
SELECT * FROM (SELECT * FROM `mock`.`mock.json` ORDER BY col1) d WHERE d.col1 = 'bogus'
{code}

The data source here is a mock: it simply generates a data set of 10,000 rows, each with 10
columns named col1 to col10.

The plan for this query misses an optimization opportunity. (See plan below.)

The current plan is (abbreviated):

* scan
* ...
* sort
* selection vector remover
* project
* filter
* selection vector remover
* project
* screen

Careful inspection shows that this query is very simple. The following steps would work just
as well:

* scan
* ...
* sort
* filter
* project
* screen

That is, the filter can handle an input with a selection vector, so no selection vector remover
(SVR) is needed between the sort and the filter. Also, this is a {{SELECT *}} query, so the extra
projects do no useful work and can be removed where unneeded. The revised plan eliminates
an unnecessary data copy.
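
To illustrate why the filter can sit directly on top of the sort, here is a minimal Python sketch of the selection-vector idea. It is a toy model, not Drill code: the function names ({{sort_to_sv}}, {{filter_sv}}, {{remove_sv}}) and the sample data are made up for illustration, and a plain list of row indices stands in for a selection vector.

```python
def sort_to_sv(values):
    """Sort without moving data: return row indices in sorted order."""
    return sorted(range(len(values)), key=lambda i: values[i])


def filter_sv(values, sv, predicate):
    """Filter through an existing selection vector, yielding a narrower one."""
    return [i for i in sv if predicate(values[i])]


def remove_sv(values, sv):
    """Selection vector remover: copy the selected rows into a dense list."""
    return [values[i] for i in sv]


def is_bogus(v):
    return v == "bogus"


col1 = ["d", "bogus", "a", "bogus", "c"]

# Current plan shape: sort -> SVR -> filter -> SVR (two data copies).
dense = remove_sv(col1, sort_to_sv(col1))
identity_sv = list(range(len(dense)))
current = remove_sv(dense, filter_sv(dense, identity_sv, is_bogus))

# Proposed plan shape: sort -> filter, with the filter reading the sort's
# selection vector directly. The single remove_sv call here only
# materializes the result so the two shapes can be compared.
proposed = remove_sv(col1, filter_sv(col1, sort_to_sv(col1), is_bogus))
```

In this model both shapes return the same rows, but the proposed shape skips the intermediate copy between the sort and the filter.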

Of course, the planner should also have pushed the filter below the sort, but that is DRILL-5200.
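
The pushdown mentioned above can be checked on a small example. The sketch below is plain Python with made-up sample rows, not Drill's planner logic; it only shows that for a row-level predicate such as this equality test, filtering before sorting returns the same rows while sorting fewer of them.

```python
rows = [{"col1": v} for v in ["d", "bogus", "a", "bogus", "c"]]


def is_bogus(row):
    return row["col1"] == "bogus"


def sort_key(row):
    return row["col1"]


# Shape of the query as written: sort every row, then filter.
sort_then_filter = [r for r in sorted(rows, key=sort_key) if is_bogus(r)]

# Shape after pushdown: filter first, then sort only the survivors.
survivors = [r for r in rows if is_bogus(r)]
filter_then_sort = sorted(survivors, key=sort_key)

rows_sorted_before_pushdown = len(rows)
rows_sorted_after_pushdown = len(survivors)
```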

{code} 
 "graph" : [ {
    "pop" : "mock-scan",
    "@id" : 8,
    ...
  }, {
    "pop" : "project",
    "@id" : 7,
    "exprs" : [ {
      "ref" : "`T0¦¦*`",
      "expr" : "`*`"
    }, {
      "ref" : "`col1`",
      "expr" : "`col1`"
    } ],
    "child" : 8,
    ...
  }, {
    "pop" : "external-sort",
    "@id" : 6,
    "child" : 7,
    "orderings" : [ {
      "order" : "ASC",
      "expr" : "`col1`",
      "nullDirection" : "UNSPECIFIED"
    } ],
    ...
  }, {
    "pop" : "selection-vector-remover",
    "@id" : 5,
    "child" : 6,
    ...
  }, {
    "pop" : "project",
    "@id" : 4,
    "exprs" : [ {
      "ref" : "`T0¦¦*`",
      "expr" : "`T0¦¦*`"
    } ],
    "child" : 5,
    ...
  }, {
    "pop" : "filter",
    "@id" : 3,
    "child" : 4,
    ...
  }, {
    "pop" : "selection-vector-remover",
    "@id" : 2,
    ...
  }, {
    "pop" : "project",
    "@id" : 1,
    "exprs" : [ {
      "ref" : "`*`",
      "expr" : "`T0¦¦*`"
    } ],
    "child" : 2,
    ...
  }, {
    "pop" : "screen",
    "@id" : 0,
    ...
  } ]
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
