spark-commits mailing list archives

From m...@apache.org
Subject spark git commit: [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline
Date Mon, 16 Feb 2015 04:51:43 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 db3c539f2 -> 9cf7d7088


[Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

If a stage is the last estimator in a Pipeline, there is no need to transform the data with its
fitted model, since no later stage is fitted on that output.

Author: Peter Rudenko <petro.rudenko@gmail.com>

Closes #4590 from petro-rudenko/patch-1 and squashes the following commits:

d13ec33 [Peter Rudenko] [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

(cherry picked from commit c78a12c4cc4d4312c4ee1069d3b218882d32d678)
Signed-off-by: Xiangrui Meng <meng@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9cf7d708
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9cf7d708
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9cf7d708

Branch: refs/heads/branch-1.3
Commit: 9cf7d7088d245b9b41ec78295cd2d6e3e395793d
Parents: db3c539
Author: Peter Rudenko <petro.rudenko@gmail.com>
Authored: Sun Feb 15 20:51:32 2015 -0800
Committer: Xiangrui Meng <meng@databricks.com>
Committed: Sun Feb 15 20:51:38 2015 -0800

----------------------------------------------------------------------
 mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/9cf7d708/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
index bb291e6..5607ed2 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
@@ -114,7 +114,9 @@ class Pipeline extends Estimator[PipelineModel] {
             throw new IllegalArgumentException(
               s"Do not support stage $stage of type ${stage.getClass}")
         }
-        curDataset = transformer.transform(curDataset, paramMap)
+        if (index < indexOfLastEstimator) {
+          curDataset = transformer.transform(curDataset, paramMap)
+        }
         transformers += transformer
       } else {
         transformers += stage.asInstanceOf[Transformer]
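
To see the effect of the guard in isolation, here is a minimal, self-contained sketch of the fit loop. It is not the Spark code: `Stage`, `Transformer`, `Estimator`, `Scale`, `Center`, and the `transformCalls` counter are hypothetical stand-ins, the dataset is a plain `Vector[Double]`, and the loop around the patched lines is reconstructed from the diff context as an assumption.

```scala
import scala.collection.mutable.ListBuffer

// Toy stand-ins for the Spark ML types (hypothetical, not the Spark API).
sealed trait Stage
trait Transformer extends Stage { def transform(data: Vector[Double]): Vector[Double] }
trait Estimator extends Stage { def fit(data: Vector[Double]): Transformer }

object PipelineSketch {
  // Counts transform calls so the saved work is observable.
  var transformCalls = 0

  // A plain transformer: multiplies every value by a constant.
  final case class Scale(factor: Double) extends Transformer {
    def transform(data: Vector[Double]): Vector[Double] = {
      transformCalls += 1
      data.map(_ * factor)
    }
  }

  // An estimator: learns the mean and returns a transformer that subtracts it.
  object Center extends Estimator {
    def fit(data: Vector[Double]): Transformer = {
      val mean = data.sum / data.size
      new Transformer {
        def transform(d: Vector[Double]): Vector[Double] = {
          transformCalls += 1
          d.map(_ - mean)
        }
      }
    }
  }

  def fit(stages: Seq[Stage], dataset: Vector[Double]): Seq[Transformer] = {
    val indexOfLastEstimator = stages.lastIndexWhere(_.isInstanceOf[Estimator])
    var curDataset = dataset
    val transformers = ListBuffer.empty[Transformer]
    stages.zipWithIndex.foreach { case (stage, index) =>
      if (index <= indexOfLastEstimator) {
        val transformer = stage match {
          case e: Estimator => e.fit(curDataset)
          case t: Transformer => t
        }
        // The patched guard: only transform when a later estimator still
        // needs the output as its training input.
        if (index < indexOfLastEstimator) {
          curDataset = transformer.transform(curDataset)
        }
        transformers += transformer
      } else {
        // Stages after the last estimator are transformers and need no data at fit time.
        transformers += stage.asInstanceOf[Transformer]
      }
    }
    transformers.toList
  }

  def main(args: Array[String]): Unit = {
    val model = fit(Seq(Scale(2.0), Center, Scale(0.5)), Vector(1.0, 2.0, 3.0))
    println(s"stages in fitted model: ${model.size}, transform calls during fit: $transformCalls")
  }
}
```

In this sketch `Center` is the last estimator, so only the first `Scale` stage transforms the data during `fit`. With an unconditional transform, the fitted `Center` model would also have been applied, and its output then discarded.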



