spark-reviews mailing list archives

From hvanhovell <>
Subject [GitHub] spark pull request #15238: [SPARK-17653][SQL] Remove unnecessary distincts i...
Date Tue, 27 Sep 2016 17:49:57 GMT
Github user hvanhovell commented on a diff in the pull request:
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
    @@ -191,25 +191,34 @@ object ExtractFiltersAndInnerJoins extends PredicateHelper {
      * A pattern that collects all adjacent unions and returns their children as a Seq.
    + * If the top union is wrapped in a [[Distinct]], then the [[Distinct]] in the adjacent unions, if
    + * any, will be eliminated.
     object Unions {
    -  def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
    -    case u: Union => Some(collectUnionChildren(mutable.Stack(u), Seq.empty[LogicalPlan]))
    +  def unapply(plan: LogicalPlan): Option[(Seq[LogicalPlan], Boolean)] = plan match {
    +    case u: Union =>
    +      Some(collectUnionChildren(mutable.Stack(u), Seq.empty[LogicalPlan], false), false)
    +    case Distinct(u: Union) =>
    +      Some(collectUnionChildren(mutable.Stack(u), Seq.empty[LogicalPlan], true), true)
         case _ => None
       // Doing a depth-first tree traversal to combine all the union children.
       private def collectUnionChildren(
           plans: mutable.Stack[LogicalPlan],
    -      children: Seq[LogicalPlan]): Seq[LogicalPlan] = {
    +      children: Seq[LogicalPlan],
    +      dedupDistinct: Boolean): Seq[LogicalPlan] = {
    --- End diff --
    Why is this method so complicated? It seems that we can do without the stack. A stack
    only makes sense if you do not want to use recursion (in which case you can use a while
    loop instead). This comment has nothing to do with this PR itself; it is about the code
    as it was to begin with. Could you fix it anyway?
    cc @gatorsmile as well
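
    To make the suggestion concrete, here is a minimal sketch of a stack-free, purely
    recursive traversal. It is not the code from the PR: `Plan`, `Relation`, `Union`, and
    `Distinct` below are hypothetical stand-ins for the Catalyst plan nodes, and
    `FlattenUnions` is an illustrative name. It assumes that `dedupDistinct` is meant to drop
    a `Distinct` that directly wraps a nested union, as the doc comment in the diff describes.

```scala
// Hypothetical stand-ins for Catalyst's logical plan nodes (assumption, not Spark's classes).
sealed trait Plan
final case class Relation(name: String) extends Plan
final case class Union(children: Seq[Plan]) extends Plan
final case class Distinct(child: Plan) extends Plan

object FlattenUnions {
  // Depth-first, left-to-right flattening of adjacent unions using plain recursion,
  // with no explicit stack.
  def collectUnionChildren(plan: Plan, dedupDistinct: Boolean): Seq[Plan] = plan match {
    // A union contributes the flattened children of each of its children.
    case Union(children) =>
      children.flatMap(collectUnionChildren(_, dedupDistinct))
    // Only when requested, look through a Distinct that directly wraps a union.
    case Distinct(Union(children)) if dedupDistinct =>
      children.flatMap(collectUnionChildren(_, dedupDistinct))
    // Anything else is a leaf from the point of view of this pattern.
    case other =>
      Seq(other)
  }
}
```

    The recursion depth here is bounded by how deeply the unions are nested, so a very long
    chain of nested unions could still overflow the call stack; that is the case where an
    explicit stack driven by a while loop would be the better fit.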
