spark-reviews mailing list archives

From smurching <...@git.apache.org>
Subject [GitHub] spark pull request #19381: [SPARK-10884][ML] Support prediction on single in...
Date Wed, 25 Oct 2017 21:49:17 GMT
Github user smurching commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19381#discussion_r146986798
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala ---
    @@ -267,6 +268,24 @@ class DecisionTreeClassifierSuite
           Vector, DecisionTreeClassificationModel](newTree, newData)
       }
     
    +  test("prediction on single instance") {
    +    val rdd = continuousDataPointsForMulticlassRDD
    +    val dt = new DecisionTreeClassifier()
    +      .setImpurity("Gini")
    +      .setMaxDepth(4)
    +      .setMaxBins(100)
    +    val categoricalFeatures = Map(0 -> 3)
    +    val numClasses = 3
    +
    +    val newData: DataFrame = TreeTests.setMetadata(rdd, categoricalFeatures, numClasses)
    +    val newTree = dt.fit(newData)
    +
    +    newTree.transform(newData).select(dt.getFeaturesCol, dt.getPredictionCol).collect().foreach {
    +      case Row(features: Vector, prediction: Double) =>
    +        assert(prediction ~== newTree.predict(features) relTol 1E-5)
    --- End diff --
    
    Can we test exact equality (e.g. `prediction === newTree.predict(features)`) here and in other unit tests?
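
The comment does not state a rationale, but one plausible reading is that a classifier's predictions are class indices stored as `Double`, so two code paths that agree should produce identical values, and a tolerance check can mask a small discrepancy. The sketch below illustrates that distinction in plain Scala with no Spark dependency. `ExactVsApprox` and `approxEqual` are hypothetical names introduced here for illustration; `approxEqual` is only a rough stand-in for a relative-tolerance comparison and is not Spark's `~==` / `relTol` implementation.

```scala
// Stand-alone illustration; all names here are hypothetical, not Spark APIs.
object ExactVsApprox {
  // Rough stand-in for a relative-tolerance comparison (an assumption,
  // not the implementation behind `~== ... relTol` in Spark's test utilities).
  def approxEqual(a: Double, b: Double, relTol: Double): Boolean = {
    val scale = math.max(math.abs(a), math.abs(b))
    math.abs(a - b) <= relTol * scale
  }

  def main(args: Array[String]): Unit = {
    // Class labels are whole numbers stored as Double, so two code paths
    // that agree on the class should agree exactly.
    val fromTransform = 2.0
    val fromPredict = 2.0
    assert(fromTransform == fromPredict)

    // For continuous values, a tolerance check passes even when the two
    // numbers are not exactly equal, so it is the weaker assertion.
    val a = 0.3
    val b = 0.1 + 0.2
    assert(a != b)
    assert(approxEqual(a, b, 1e-5))
  }
}
```

Under this reading, exact equality is the stricter check wherever the two values are expected to come from the same computation, while a tolerance remains appropriate for outputs that are computed by numerically different routes.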

