drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5070) Code cache compares sources, but method order varies
Date Mon, 28 Nov 2016 00:36:58 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15700579#comment-15700579 ]

Paul Rogers commented on DRILL-5070:
------------------------------------

Analysis of impact: method order is effectively random. If a generated class has n methods,
there are n! possible orderings, so the cache can hold up to n! variations of the class when
one would do. The fix reduces the number of cached variations by the following factors for
various method counts (assuming every variation ends up being cached):

* 2 methods: 1/2
* 3 methods: 1/6
* 4 methods: 1/24

That is, under that assumption, a class with 3 methods stores 1/6 as many variations after
the fix as before.
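
As a rough illustration of the arithmetic above, the following sketch enumerates every ordering
of three method names and counts how many distinct "sources" result with and without sorting.
The class name, the method names, and the use of a joined string as a stand-in for generated
source are all assumptions made for the example; none of this is Drill code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MethodOrderDemo {

    // Collects every ordering of the given names (n! orderings for n distinct names).
    static void permute(List<String> names, int start, List<List<String>> out) {
        if (start == names.size()) {
            out.add(new ArrayList<>(names));
            return;
        }
        for (int i = start; i < names.size(); i++) {
            Collections.swap(names, start, i);
            permute(names, start + 1, out);
            Collections.swap(names, start, i); // restore before trying the next choice
        }
    }

    // Counts distinct "sources" over all orderings. A source here is only the
    // method names joined in order, a hypothetical stand-in for generated code.
    public static int distinctSources(List<String> names, boolean sortFirst) {
        List<List<String>> orderings = new ArrayList<>();
        permute(new ArrayList<>(names), 0, orderings);
        Set<String> sources = new HashSet<>();
        for (List<String> ordering : orderings) {
            if (sortFirst) {
                Collections.sort(ordering); // canonical order, as the fix proposes
            }
            sources.add(String.join(";", ordering));
        }
        return sources.size();
    }

    public static void main(String[] args) {
        // Hypothetical method names, chosen only for illustration.
        List<String> names = Arrays.asList("methodA", "methodB", "methodC");
        System.out.println(distinctSources(names, false)); // prints 6
        System.out.println(distinctSources(names, true));  // prints 1
    }
}
```

With three names the unsorted count is 3! = 6 and the sorted count is 1, which matches the
1/6 figure in the list above.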

Since the cache is cluttered with fewer functionally duplicate classes, more room will be
available to preserve other classes, potentially further reducing the amount of compilation
required.

> Code cache compares sources, but method order varies
> ----------------------------------------------------
>
>                 Key: DRILL-5070
>                 URL: https://issues.apache.org/jira/browse/DRILL-5070
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> The Drill generated code cache compares the sources from two different generation events
> to detect duplicate code. Unfortunately, the code generator emits methods in the order returned
> by {{Class.getDeclaredMethods}}, but this method makes no guarantee about the order of the
> methods.
> This issue appeared when attempting to modify tests to capture generated code for comparison
> to future results. Even a simple generated case from {{ExpressionTest.testBasicExpression()}}
> that generates {{if(true) then 1 else 0 end}} (all constants) produced methods in different
> orders on each test run.
> The fix is simple: in the {{SignatureHolder}} constructor, sort methods by name after
> retrieving them from the class. The sort ensures that method order is deterministic. Fortunately,
> the number of methods is small, so the sort step adds little cost.
> Without this fix, it is likely that the code cache holds many "copies" of the same code:
> equivalent code but with different method orders. After this fix, the cache should hold only
> one copy of each bit of equivalent code.
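
The sort-by-name step described in the issue could look roughly like the sketch below. It is
not the actual {{SignatureHolder}} code: the class {{SortedMethods}}, the nested {{Template}}
class, and both helper methods are hypothetical names introduced for illustration. Only
{{Class.getDeclaredMethods}}, {{Arrays.sort}}, and {{Comparator.comparing}} are real JDK calls.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortedMethods {

    // Hypothetical stand-in for a code-generation template class.
    public static class Template {
        public void setup() {}
        public void eval() {}
        public void cleanup() {}
    }

    // Retrieves the declared methods, then sorts them by name so that the
    // order no longer depends on what getDeclaredMethods() happens to return.
    public static Method[] sortedMethods(Class<?> clazz) {
        Method[] methods = clazz.getDeclaredMethods();
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        return methods;
    }

    // Convenience helper (an assumption for this example): names only.
    public static List<String> sortedMethodNames(Class<?> clazz) {
        List<String> names = new ArrayList<>();
        for (Method m : sortedMethods(clazz)) {
            names.add(m.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedMethodNames(Template.class));
        // prints [cleanup, eval, setup]
    }
}
```

One caveat with this sketch: sorting by name alone leaves overloaded methods (same name,
different parameters) in whatever relative order they arrived in, so a class with overloads
would need a tie-breaker such as the parameter types to be fully deterministic.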



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
