spark-reviews mailing list archives

From maropu <...@git.apache.org>
Subject [GitHub] spark pull request #20174: [SPARK-22951][SQL] aggregate should not produce e...
Date Sun, 07 Jan 2018 03:47:43 GMT
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20174#discussion_r160040256
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -666,4 +666,16 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
           assert(exchangePlans.length == 1)
         }
       }
    +
    +  test("SPARK-22951: aggregation on empty data frame should only return initial values") {
    +    // non code gen
    +    withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false") {
    +      assert(spark.emptyDataFrame.dropDuplicates.count == 0)
    +    }
    +
    +    // code gen
    +    withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true") {
    +      assert(spark.emptyDataFrame.dropDuplicates.count == 0)
    +    }
    +  }
    --- End diff --
    
    ```
    Seq("true", "false").foreach { codegen =>
      withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> codegen) {
        assert(spark.emptyDataFrame.dropDuplicates.count == 0)
      }
    }
    ```
    BTW, I think checking both the codegen and non-codegen paths is a common pattern, so it
    might be better to add a helper function to a test utility class, like:
    ```
    checkExecution {
      assert(spark.emptyDataFrame.dropDuplicates.count == 0)
    }
    ```
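
A minimal sketch of what such a helper could look like, not an existing Spark utility. Only the name `checkExecution` comes from the suggestion above; the `CodegenTestHelper` object and the explicit `spark` parameter are hypothetical. To stay self-contained it uses the public `spark.conf` API instead of the suite's `withSQLConf`, and it assumes the literal key `spark.sql.codegen.wholeStage` (the value of `SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key`).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object; not part of Spark's test utilities.
object CodegenTestHelper {
  // Assumed to match SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key.
  private val WholeStageKey = "spark.sql.codegen.wholeStage"

  /** Runs `body` once with whole-stage codegen enabled and once with it disabled. */
  def checkExecution(spark: SparkSession)(body: => Unit): Unit = {
    // Remember the caller's setting so it can be restored afterwards.
    val original = spark.conf.getOption(WholeStageKey)
    try {
      Seq("true", "false").foreach { codegen =>
        spark.conf.set(WholeStageKey, codegen)
        body // by-name parameter: re-evaluated on every iteration
      }
    } finally {
      original match {
        case Some(value) => spark.conf.set(WholeStageKey, value)
        case None        => spark.conf.unset(WholeStageKey)
      }
    }
  }
}
```

With a helper along these lines, the test body would reduce to something like `CodegenTestHelper.checkExecution(spark) { assert(spark.emptyDataFrame.dropDuplicates.count == 0) }`. Inside a suite that already mixes in `withSQLConf`, the same loop could delegate to it and drop the manual save and restore.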


---


