spark-issues mailing list archives

From "Liang-Chi Hsieh (JIRA)" <>
Subject [jira] [Created] (SPARK-18395) Evaluate common subexpression like lazy variable with a function approach
Date Thu, 10 Nov 2016 07:52:58 GMT
Liang-Chi Hsieh created SPARK-18395:

             Summary: Evaluate common subexpression like lazy variable with a function approach
                 Key: SPARK-18395
             Project: Spark
          Issue Type: Improvement
          Components: SQL
            Reporter: Liang-Chi Hsieh

As per the discussion in PR 15807, we need to change how subexpression elimination works.

In the current approach, common subexpressions are evaluated whether or not they are actually
used later. For example, in the following generated code, {{subexpr2}} is evaluated even when
only the {{if}} branch runs.

    if (isNull(subexpr)) {
    } else {
      AssertNotNull(subexpr)  // subexpr2
      SomeExpr(AssertNotNull(subexpr)) // SomeExpr(subexpr2)
    }

Besides a possible performance regression, an expression such as {{AssertNotNull}} can throw an
exception. So evaluating {{subexpr2}} when it is not needed can throw an exception unexpectedly.

With this patch, common subexpressions are not evaluated until they are used. We create
a function for each common subexpression, which evaluates it and stores the result in a member
variable. An initialization status variable records whether the subexpression has already been
evaluated.

Thus, when an expression that uses the subexpression is about to be evaluated, we check whether
the subexpression is initialized: if so, we return the stored result directly; if not, we call
the function to evaluate it.
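
The check-then-evaluate scheme described above can be sketched in plain Java. This is only an illustration, not the code Spark actually generates: the class {{LazySubexprSketch}}, the fields {{subexpr2Initialized}} and {{subexpr2Value}}, the counter {{evalCount}}, and the use of {{toUpperCase}} as a stand-in for {{SomeExpr}} are all hypothetical names chosen for this example. It also assumes one object per input row; real generated code would presumably need to reset the status variable for each row.

```java
// Hypothetical sketch of lazy common-subexpression evaluation.
// Not Spark's generated code; all names here are illustrative assumptions.
class LazySubexprSketch {
    // Stand-in for the input value "subexpr"; null models SQL NULL.
    private final String input;

    // Member variables: the initialization status and the cached result.
    private boolean subexpr2Initialized = false;
    private String subexpr2Value;

    // Counts how often the subexpression body really runs (for demonstration).
    int evalCount = 0;

    LazySubexprSketch(String input) {
        this.input = input;
    }

    // Function created for the common subexpression: models AssertNotNull(subexpr).
    // It evaluates the expression and stores the result in member variables.
    private void evalSubexpr2() {
        evalCount++;
        if (input == null) {
            throw new NullPointerException("null value in non-nullable field");
        }
        subexpr2Value = input;
        subexpr2Initialized = true;
    }

    // Accessor used at every use site: evaluate on first use, reuse afterwards.
    private String subexpr2() {
        if (!subexpr2Initialized) {
            evalSubexpr2();
        }
        return subexpr2Value;
    }

    // Models the branch structure from the example in the issue description.
    String project() {
        if (input == null) {
            // subexpr2 is never touched here, so nothing can throw.
            return null;
        } else {
            String a = subexpr2();               // first use: evaluates
            String b = subexpr2().toUpperCase(); // second use: cached; stand-in for SomeExpr
            return a + ":" + b;
        }
    }

    public static void main(String[] args) {
        LazySubexprSketch nullRow = new LazySubexprSketch(null);
        System.out.println(nullRow.project());  // null branch, no exception
        LazySubexprSketch row = new LazySubexprSketch("ab");
        System.out.println(row.project());
        System.out.println(row.evalCount);
    }
}
```

With a null input, {{project}} takes the {{if}} branch and the subexpression function is never called, so no exception is raised. With a non-null input, the subexpression is used twice but its body runs only once.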
