spark-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From kai-zeng <...@git.apache.org>
Subject [GitHub] spark pull request: [SPARK-4485][SQL] (1) Add broadcast hash outer...
Date Tue, 07 Jul 2015 01:37:33 GMT
Github user kai-zeng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7162#discussion_r34001466
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SparkPlanTest.scala ---
    @@ -92,27 +153,25 @@ object SparkPlanTest {
        * @param expectedAnswer the expected result in a [[Seq]] of [[Row]]s.
        */
       def checkAnswer(
    -      input: DataFrame,
    -      planFunction: SparkPlan => SparkPlan,
    +      input: Seq[DataFrame],
    +      planFunction: Seq[SparkPlan] => SparkPlan,
           expectedAnswer: Seq[Row]): Option[String] = {
     
    -    val outputPlan = planFunction(input.queryExecution.sparkPlan)
    +    val outputPlan = planFunction(input.map(_.queryExecution.sparkPlan))
     
         // A very simple resolver to make writing tests easier. In contrast to the real resolver
         // this is always case sensitive and does not try to handle scoping or complex type
resolution.
    -    val resolvedPlan = outputPlan transform {
    -      case plan: SparkPlan =>
    -        val inputMap = plan.children.flatMap(_.output).zipWithIndex.map {
    -          case (a, i) =>
    -            (a.name, BoundReference(i, a.dataType, a.nullable))
    --- End diff --
    
    @JoshRosen Hi Josh. I still prefer my way of resolving attributes, for two reasons:
    (1) References are bound in each operator, that's certainly something we should test.
So in my opinion, we shouldn't bind the references manually in the test suite.
    (2) Manually binding the references isn't good for operators with two or more inputs.
For these operators, there actually could be different ways to binding references depending
on which implementation is used. The old implementation SparkPlanTest cannot handle operators
with two or more inputs. We can certainly fix the old implementation by fix binding for binary
operators, but that's gonna be tedious later, say if we are gona change some implementation


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Mime
View raw message