drill-dev mailing list archives

From "Deneche A. Hakim (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3952) Improve Window Functions performance when not all batches are required to process the current batch
Date Mon, 19 Oct 2015 17:44:05 GMT
Deneche A. Hakim created DRILL-3952:
---------------------------------------

             Summary: Improve Window Functions performance when not all batches are required to process the current batch
                 Key: DRILL-3952
                 URL: https://issues.apache.org/jira/browse/DRILL-3952
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.2.0
            Reporter: Deneche A. Hakim
            Assignee: Deneche A. Hakim
             Fix For: 1.3.0


Currently, the window operator blocks until all batches of the current partition are available.
For some queries this is necessary (e.g. an aggregate with no order-by in the window definition),
but in other cases the window operator can process the current batch and pass it downstream
sooner.

Implementing this should help the window operator use less memory and run faster, especially
in the presence of a limit operator.
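
To illustrate why this matters with a limit, here is a toy Python model, not Drill code. The
names `source`, `row_number_blocking` and `row_number_streaming` are hypothetical; batches are
plain lists, and a limit is imitated with `itertools.islice`. It only counts how many input
batches each variant pulls before the first output batch is available.

```python
from itertools import islice


def source(num_batches, batch_size):
    """Hypothetical upstream operator: yields batches (lists of rows) lazily."""
    for b in range(num_batches):
        yield [f"r{b}_{i}" for i in range(batch_size)]


def number_rows(batch, start):
    """Attach a running row number to each row of a batch."""
    return [(row, start + i + 1) for i, row in enumerate(batch)]


def row_number_blocking(batches, consumed):
    """Buffer every batch of the partition before emitting anything."""
    buffered = []
    for batch in batches:
        consumed.append(len(batch))  # record each batch pulled from upstream
        buffered.append(batch)
    count = 0
    for batch in buffered:
        yield number_rows(batch, count)
        count += len(batch)


def row_number_streaming(batches, consumed):
    """Emit each batch as soon as it arrives; row_number needs no later batch."""
    count = 0
    for batch in batches:
        consumed.append(len(batch))
        yield number_rows(batch, count)
        count += len(batch)


# Imitate a limit that is satisfied by the first output batch.
blocked, streamed = [], []
first_blocking = list(islice(row_number_blocking(source(100, 4), blocked), 1))
first_streaming = list(islice(row_number_streaming(source(100, 4), streamed), 1))

print(len(blocked), len(streamed))
```

Both variants produce the same first batch, but in this model the blocking one pulls and holds
all 100 input batches first, while the streaming one pulls a single batch.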

The purpose of this JIRA is to improve the window operator in the following cases:
- aggregate, when the window definition has an order-by clause, can process the current batch
as soon as it receives the last peer row
- lead can process the current batch as soon as it receives one more batch
- lag can process the current batch immediately
- first_value can process the current batch immediately
- last_value, when the window definition has an order-by clause, can process the current
batch as soon as it receives the last peer row
- row_number, rank and dense_rank can process the current batch immediately
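
The cases above can be summarized as a small lookup, sketched here in Python rather than as
Drill's actual implementation. `Readiness`, `readiness` and `last_peer_received` are
hypothetical names. Two points are assumptions on my part: any function not named in the list
is treated as an aggregate, and last_value without an order-by is assumed to need the whole
partition, as aggregates do. "Last peer row" is read here as the last row sharing the order-by
key of the current batch's final row.

```python
from enum import Enum


class Readiness(Enum):
    IMMEDIATE = "immediate"              # the current batch alone is enough
    NEXT_BATCH = "next_batch"            # one more batch must arrive first
    LAST_PEER_ROW = "last_peer_row"      # wait for the last peer row
    WHOLE_PARTITION = "whole_partition"  # wait for every batch of the partition


IMMEDIATE_FUNCTIONS = {"lag", "first_value", "row_number", "rank", "dense_rank"}


def readiness(function, has_order_by):
    """Return what the window operator must wait for before processing a batch."""
    name = function.lower()
    if name in IMMEDIATE_FUNCTIONS:
        return Readiness.IMMEDIATE
    if name == "lead":
        return Readiness.NEXT_BATCH
    # Assumption: everything else (sum, avg, last_value, ...) behaves like an aggregate.
    if has_order_by:
        return Readiness.LAST_PEER_ROW
    return Readiness.WHOLE_PARTITION


def last_peer_received(current_keys, later_keys, partition_done):
    """Check whether all peers of the current batch's final row have arrived.

    current_keys: order-by key of each row in the current batch, in sorted order.
    later_keys: order-by keys of rows received after the current batch.
    Rows are sorted on the key, so the first different key ends the peer group.
    """
    last_key = current_keys[-1]
    return partition_done or any(key != last_key for key in later_keys)
```

For example, `readiness("sum", True)` gives `Readiness.LAST_PEER_ROW`, and
`last_peer_received([1, 2, 2], [2, 3], False)` is true because the key 3 shows that no more
rows with key 2 can follow.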



