drill-issues mailing list archives

From "Ted Dunning (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3912) Common subexpression elimination
Date Wed, 07 Oct 2015 22:37:26 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947721#comment-14947721 ]

Ted Dunning commented on DRILL-3912:

It sounds like this only deals with common sub-expressions in expressions.

A far more significant optimization would be to deal with common sub-expressions at a larger
scale. A classic case is a common table expression that is referenced several times in one
query. For instance,

with x as (select dir0, id from dfs.tdunning.zoom where id < 12),  
       y as (select id, count(*) cnt from x group by id),
       z as (select count(distinct id) id_count from x)
select dir0, x.id, y.cnt from x , y, z  where x.id = y.id and y.cnt / z.id_count >  3

Without good sub-expression elimination, table zoom will be scanned three times, once for
each reference to x (in y, in z, and in the final select). Last I heard, Drill doesn't
optimize this away.
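
To make the scan count concrete, here is a minimal Python sketch of the idea. It is not Drill's planner or execution model: the Scan, Filter, and Executor names and the sample rows are hypothetical, invented only for illustration. It runs the sub-plan behind x three times, once per reference, and counts how often the underlying table is read with and without reuse of structurally equal sub-plans.

```python
from dataclasses import dataclass


# Hypothetical plan nodes; frozen dataclasses compare and hash by value,
# so two structurally equal sub-plans are treated as the same cache key.
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: object
    max_id: int  # keeps rows with id < max_id


class Executor:
    """Toy executor that optionally reuses results of equal sub-plans."""

    def __init__(self, tables, reuse):
        self.tables = tables
        self.reuse = reuse
        self.cache = {}
        self.scans = 0  # number of times a base table was actually read

    def run(self, node):
        if self.reuse and node in self.cache:
            return self.cache[node]
        if isinstance(node, Scan):
            self.scans += 1
            rows = list(self.tables[node.table])
        elif isinstance(node, Filter):
            rows = [r for r in self.run(node.child) if r["id"] < node.max_id]
        else:
            raise TypeError(f"unknown plan node: {node!r}")
        if self.reuse:
            self.cache[node] = rows
        return rows


def count_scans(reuse):
    # Made-up sample data standing in for dfs.tdunning.zoom.
    tables = {"zoom": [{"dir0": "a", "id": 1}, {"dir0": "b", "id": 30}]}
    executor = Executor(tables, reuse)
    x = Filter(Scan("zoom"), 12)
    for _ in range(3):  # x is referenced by y, by z, and by the final select
        executor.run(x)
    return executor.scans


print(count_scans(reuse=False), count_scans(reuse=True))
```

In this toy model the table is read once per reference unless equal sub-plans are shared. A real engine would also have to decide whether to materialize the shared result or stream it to several consumers, which this sketch ignores.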

> Common subexpression elimination
> --------------------------------
>                 Key: DRILL-3912
>                 URL: https://issues.apache.org/jira/browse/DRILL-3912
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Steven Phillips
>            Assignee: Steven Phillips
> Drill currently will evaluate the full expression tree, even if there are redundant subtrees.
> Many of these redundant evaluations can be eliminated by reusing the results from previously
> evaluated expression trees.
> For example,
> {code}
> select a + 1, (a + 1)* (a - 1) from t
> {code}
> Will compute the entire (a + 1) expression twice. With CSE, it will only be evaluated once.
> The benefit will be reducing the work done when evaluating expressions, as well as reducing
> the amount of code that is generated, which could also lead to better JIT optimization.
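
The expression-level elimination described in the quoted issue can be sketched as follows. This is only an illustration in Python, not Drill's implementation: Drill generates code for expressions rather than interpreting them, and the tuple encoding, the evaluate and project helpers, and the operation counter are all assumptions made for this example. Sub-expressions are keyed by their structure, so the second occurrence of (a + 1) is served from a per-row cache.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# select a + 1, (a + 1) * (a - 1) from t
# An expression is a column name, a numeric literal, or (op, left, right).
A_PLUS_1 = ("+", "a", 1)
EXPRS = [A_PLUS_1, ("*", A_PLUS_1, ("-", "a", 1))]


def evaluate(expr, row, cache, stats):
    """Evaluate expr against row; reuse cached sub-results if cache is not None."""
    if isinstance(expr, str):  # column reference
        return row[expr]
    if isinstance(expr, (int, float)):  # literal
        return expr
    if cache is not None and expr in cache:
        return cache[expr]
    op, left, right = expr
    left_value = evaluate(left, row, cache, stats)
    right_value = evaluate(right, row, cache, stats)
    stats["ops"] += 1  # count operators that were actually executed
    value = OPS[op](left_value, right_value)
    if cache is not None:
        cache[expr] = value
    return value


def project(exprs, row, use_cse):
    """Return (projected values, number of operators executed) for one row."""
    stats = {"ops": 0}
    cache = {} if use_cse else None  # the cache lives for a single row
    values = [evaluate(e, row, cache, stats) for e in exprs]
    return values, stats["ops"]


print(project(EXPRS, {"a": 4}, use_cse=False))
print(project(EXPRS, {"a": 4}, use_cse=True))
```

For a = 4 both calls produce the same values, but the version with the cache executes three operators instead of four. In a code-generating engine the equivalent saving would show up as one fewer generated sub-expression rather than a cache lookup.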

This message was sent by Atlassian JIRA
