drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "abdelhakim deneche" <adene...@gmail.com>
Subject Re: Review Request 30051: DRILL-1908: new window function implementation
Date Mon, 26 Jan 2015 16:48:17 GMT


> On Jan. 23, 2015, 11:39 p.m., Chris Westin wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java,
line 85
> > <https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line85>
> >
> >     If I'm understanding this correctly, you're saying that you don't begin to process
a batch until you've received all of the last partition that starts in that batch.
> >     
> >     Couldn't this be a problem is the partition is very, very large, possibly extending
to the end of the data stream?

Depending on the window definition we may be able to process and release a batch before we
receive all rows of it's partitions, but this only works for some specific cases. A more general
solution should save the batches to disk until they are ready to be processed.


- abdelhakim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/#review69499
-----------------------------------------------------------


On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30051/
> -----------------------------------------------------------
> 
> (Updated Jan. 21, 2015, 11:05 p.m.)
> 
> 
> Review request for drill.
> 
> 
> Bugs: DRILL-1908
>     https://issues.apache.org/jira/browse/DRILL-1908
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> In order to fix DRILL-1487 a complete rewrite of the StreamingWindowFrameRecordBatch
was needed. This patch adds a new WindowFrameRecordBatch that correctly handles window functions
with or without order by clauses.
> This code still lacks support for frame clauses and may be optimized to reduce unneeded
frame computations.
> 
> 
> Diffs
> -----
> 
>   common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 28424a5

>   common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c 
>   contrib/data/pom.xml 86075f2 
>   contrib/data/window-test-data/pom.xml PRE-CREATION 
>   exec/java-exec/pom.xml 90734a5 
>   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 5288f5d

>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java 17738ee

>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
9b8929f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
a3e7940 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
b4e3fed 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
9588cef 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
PRE-CREATION 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
f1a8bc0 
>   exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
00c20b2 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
f20627d 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
7c04477 
>   exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 6eff6db

> 
> Diff: https://reviews.apache.org/r/30051/diff/
> 
> 
> Testing
> -------
> 
> Unit tests are available to test window functions in mulitple scenarios:
> - b1.p1: single batch with a single partition
> - b1.p2: 2 batches, each containing a different parition
> - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
> - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch and has
rows in 3 batches
> - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an edge case:
the 2nd time innerNext() is called, WindowFrameRecordBatch has enough saved batches to call
it's framer.doWork() without the need to call next(incoming)
> 
> All tests, except the last one, come in 2 variations: with and without "order by" clause
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message