spark-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aokolnychyi <...@git.apache.org>
Subject [GitHub] spark pull request #22857: [SPARK-25860][SQL] Replace Literal(null, _) with ...
Date Sun, 28 Oct 2018 09:56:52 GMT
Github user aokolnychyi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22857#discussion_r228741884
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
---
    @@ -736,3 +736,65 @@ object CombineConcats extends Rule[LogicalPlan] {
           flattenConcats(concat)
       }
     }
    +
    +/**
    + * A rule that replaces `Literal(null, _)` with `FalseLiteral` for further optimizations.
    + *
    + * For example, `Filter(Literal(null, _))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null,
_)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * As a result, many unnecessary computations can be removed in the query optimization
phase.
    + *
    + * Similarly, the same logic can be applied to conditions in [[Join]], predicates in
[[If]],
    + * conditions in [[CaseWhen]].
    + */
    +object ReplaceNullWithFalse extends Rule[LogicalPlan] {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    +    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    +    case p: LogicalPlan => p transformExpressions {
    +      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    +      case CaseWhen(branches, elseValue) =>
    +        val newBranches = branches.map { case (cond, value) =>
    +          replaceNullWithFalse(cond) -> value
    +        }
    +        CaseWhen(newBranches, elseValue)
    +    }
    +  }
    +
    +  /**
    +   * Recursively replaces `Literal(null, _)` with `FalseLiteral`.
    +   *
    +   * Note that `transformExpressionsDown` can not be used here as we must stop as soon
as we hit
    +   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or `Literal(null,
_)`.
    +   */
    +  private def replaceNullWithFalse(e: Expression): Expression = e match {
    +    case cw: CaseWhen if getValues(cw).forall(isNullOrBoolean) =>
    +      val newBranches = cw.branches.map { case (cond, value) =>
    +        replaceNullWithFalse(cond) -> replaceNullWithFalse(value)
    +      }
    +      val newElseValue = cw.elseValue.map(replaceNullWithFalse)
    +      CaseWhen(newBranches, newElseValue)
    +    case If(pred, trueVal, falseVal) if Seq(trueVal, falseVal).forall(isNullOrBoolean)
=>
    --- End diff --
    
    Yep, I shortened this to stay in one line below. I can either rename `pred` to `p`or split
line 783 into multiple.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Mime
View raw message