drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5125) Provide option to use generic code for sv remover
Date Thu, 22 Dec 2016 03:02:58 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768855#comment-15768855
] 

ASF GitHub Bot commented on DRILL-5125:
---------------------------------------

GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/704

    DRILL-5125: Provide option to use generic code for sv remover

    Performance tests showed that, for queries with a large number of
    columns, it is faster to use a “generic” implementation of the
    selection vector remover “copier” than to use a generated version.
    
    This PR provides "generic" versions of the SV2 and SV4 copiers
    used by the selection vector remover. The generic forms are
    enabled using a new boot-time config parameter that is disabled
    by default (preserving the traditional generated code.)
    
    The generic form relies on a "virtual function" (really, just a
    plain Java function) defined in the base ValueVector class and
    implemented by each concrete vector: both the pre-defined and
    generated forms. This form "does the right thing" for the copy
    operation so that we don't need to generate code just to handle
    the method dispatch operation (which Java does quite well on its
    own.)
    
    A unit tests validates that the generic form works by runing
    the existing SV remover tests with the generic option turned on.
    
    See the DRILL-5125 for details.
    
    Add test

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-5125

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/704.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #704
    
----
commit ba3a38a403b140149d1605decefae088765ead56
Author: Paul Rogers <progers@maprtech.com>
Date:   2016-12-12T18:06:43Z

    DRILL-5125: Provide option to use generic code for sv remover
    
    Performance tests showed that, for queries with a large number of
    columns, it is faster to use a “generic” implementation of the
    selection vector remover “copier” than to use a generated version.
    
    This PR provides "generic" versions of the SV2 and SV4 copiers
    used by the selection vector remover. The generic forms are
    enabled using a new boot-time config parameter that is disabled
    by default (preserving the traditional generated code.)
    
    The generic form relies on a "virtual function" (really, just a
    plain Java function) defined in the base ValueVector class and
    implemented by each concrete vector: both the pre-defined and
    generated forms. This form "does the right thing" for the copy
    operation so that we don't need to generate code just to handle
    the method dispatch operation (which Java does quite well on its
    own.)
    
    A unit tests validates that the generic form works by runing
    the existing SV remover tests with the generic option turned on.
    
    See the DRILL-5125 for details.
    
    Add test

----


> Provide option to use generic code for sv remover
> -------------------------------------------------
>
>                 Key: DRILL-5125
>                 URL: https://issues.apache.org/jira/browse/DRILL-5125
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> Consider a non-typical Drill query: one with 6000 rows but 243 fields. Consider this
query:
> {code}
> select * from (select *, row_number() over(order by somedate) as rn from dfs.`/some/path/data.json`)
where rn=10
> {code}
> This produces a query with the following structure:
> {code}
> 00-00    Screen
> 00-01      ProjectAllowDup(*=[$0], rn=[$1])
> 00-02        Project(T0¦¦*=[$0], w0$o0=[$2])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($2, 10)])
> 00-05              Window(window#0=[window(partition {} order by [1] rows between UNBOUNDED
PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
> 00-06                SelectionVectorRemover
> 00-07                  Sort(sort0=[$1], dir0=[ASC])
> 00-08                    Project(T0¦¦*=[$0], validitydate=[$1])
> 00-09                      Scan(groupscan=...)
> {code}
> Instrumenting, the code to measure compile time, two “long poles” stood out:
> {code}
> Compile Time for org.apache.drill.exec.test.generated.CopierGen3: 500
> Compile Time for org.apache.drill.exec.test.generated.CopierGen8: 1659
> {code}
> Much of the initial run time of 5578 ms is taken up in compiling two classes (2159 ms).
> The classes themselves are very simple: create member variables for 486 vectors (2 x
column count), and call a method on each to do the copy. The only type-specific work is the
member variable and call to the (non-virtual) CopyFrom or CopyFromSafe methods. The generated
class can easily be replaced by a “generic” class and virtual functions in the vector
classes to choose the correct copy method.
> Clearly, avoiding code gen means avoiding the compile times with a first-run savings.
Here are the last 8 runs (out of 10), with code cached turned off (forcing a compile on each
query run), with and without the generic versions:
> * Original (no code cache): 1832 ms / run
> * Generic (no code cache): 1317 ms / run
> This demonstrates the expected outcome: avoiding compilation of generated code saves
~500 ms per run (or 28%). (Note: the numbers above were obtained on a version of the code
that already had various optimizations described in other JIRA entries.)
> The reason, for generating code is that one would expect that 243 in-line statements
(an unwound loop) to be faster than a loop with 243 iterations. In addition, the generic version
uses an array in place of ~500 variables, and a virtual function call rather than in-line,
type-specific calls. One would expect the unrolled loop to be faster.
> Repeat the exercise, this time with the code cache turned on so that no compile cost
is payed for either code path (because the test excludes the first two runs in which the generated
code is compiled.)
> * Original: 1302 ms / run
> * Generic version: 1040 ms / run
> Contrary to expectations, the loop is faster than the in-line statements. In this instance,
the array/loop/virtual function version is ~260 ms faster (20%).
> The test shows that the code can be simplified, a costly costly code-gen and compile
step can be skipped, and this query will go faster. Plus, since the change removes generated
classes from the code cache, there is more room for the remaining classes, which may improve
the hit rate.
> This ticket offers the performance improvement as an option, described in comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message