spark-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From cloud-fan <...@git.apache.org>
Subject [GitHub] spark issue #20670: [SPARK-23405] Add constranits
Date Mon, 26 Feb 2018 11:36:42 GMT
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20670
  
    Good catch! This is a real problem, but the fix looks hacky.
    
    By definition, I think `plan.contraints` should only include constraints that refer to
`plan.output`, as that's the promise a plan can make to its parent. However, join is special
as `Join.condition` can refer to both of the join sides, and we add the constraints to `Join.condition`,
which is kind of we are making a promise to Join's children, not parent. My proposal:
    ```
      lazy val constraints: ExpressionSet = {
        if (conf.constraintPropagationEnabled) {
          allConstraints.filter { c =>
            c.references.nonEmpty && c.references.subsetOf(outputSet) && c.deterministic
          }
        } else {
          ExpressionSet(Set.empty)
        }
      }
    
      lazy val allConstraints = ExpressionSet(validConstraints
              .union(inferAdditionalConstraints(validConstraints))
              .union(constructIsNotNullConstraints(validConstraints)))
    ```
    Then we can call `plan.allConstraints` when inferring contraints for join.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Mime
View raw message