spark-commits mailing list archives

From felixche...@apache.org
Subject spark git commit: [SPARK-21693][R][ML] Reduce max iterations in Linear SVM test in R to speed up AppVeyor build
Date Sun, 12 Nov 2017 22:37:23 GMT
Repository: spark
Updated Branches:
  refs/heads/master 9bf696dbe -> 3d90b2cb3


[SPARK-21693][R][ML] Reduce max iterations in Linear SVM test in R to speed up AppVeyor build

## What changes were proposed in this pull request?

This PR proposes reducing the max iterations in the Linear SVM test in SparkR. This particular test
takes roughly 5 minutes on my Mac and over 20 minutes on Windows.

The root cause appears to be that the test triggers roughly 2500 jobs with the default of 100 max
iterations. On Linux, `daemon.R` is forked, but on Windows another process is launched instead,
which is extremely slow.

So, based on my observation, many processes are launched (not forked) on Windows, which accounts
for the difference in elapsed time.

After reducing the max iterations to 10, the total number of jobs in this single test drops to roughly 550.

After reducing the max iterations to 5, the total number of jobs in this single test drops to roughly 360.

## How was this patch tested?

Manually tested the elapsed times.
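
For reference, a comparison like this could be reproduced with a small timing helper. The following is only a sketch, not the procedure actually used for this patch: `time_fits` and `fit_svm` are hypothetical names introduced here for illustration, and `fit_svm` assumes SparkR is attached, a session has been started with `sparkR.session()`, and `df` is a SparkDataFrame with `label` and `feature` columns as in the test.

```r
# Hypothetical helper (not part of SparkR): run fit(n) once for each value
# in max_iters and record the wall-clock seconds reported by system.time().
time_fits <- function(fit, max_iters) {
  elapsed <- vapply(max_iters, function(n) {
    system.time(fit(n))[["elapsed"]]
  }, numeric(1))
  data.frame(maxIter = max_iters, elapsed_sec = elapsed)
}

# Hypothetical wrapper around the call exercised by the test.
# Assumes library(SparkR) and sparkR.session() have already been run.
# Nothing is sent to Spark until the returned function is called.
fit_svm <- function(df) {
  function(n) spark.svmLinear(df, label ~ feature, regParam = 0.1, maxIter = n)
}

# Usage, with a running Spark session:
#   time_fits(fit_svm(df), c(5, 10, 100))
```

Since `system.time()` measures a single run, repeating the measurement a few times would give a steadier picture, especially on Windows where process launch cost dominates.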

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19722 from HyukjinKwon/SPARK-21693-test.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3d90b2cb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3d90b2cb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3d90b2cb

Branch: refs/heads/master
Commit: 3d90b2cb384affe8ceac9398615e9e21b8c8e0b0
Parents: 9bf696d
Author: hyukjinkwon <gurwls223@gmail.com>
Authored: Sun Nov 12 14:37:20 2017 -0800
Committer: Felix Cheung <felixcheung@apache.org>
Committed: Sun Nov 12 14:37:20 2017 -0800

----------------------------------------------------------------------
 R/pkg/tests/fulltests/test_mllib_classification.R | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/3d90b2cb/R/pkg/tests/fulltests/test_mllib_classification.R
----------------------------------------------------------------------
diff --git a/R/pkg/tests/fulltests/test_mllib_classification.R b/R/pkg/tests/fulltests/test_mllib_classification.R
index a4d0397..ad47717 100644
--- a/R/pkg/tests/fulltests/test_mllib_classification.R
+++ b/R/pkg/tests/fulltests/test_mllib_classification.R
@@ -66,7 +66,7 @@ test_that("spark.svmLinear", {
   feature <- c(1.1419053, 0.9194079, -0.9498666, -1.1069903, 0.2809776)
   data <- as.data.frame(cbind(label, feature))
   df <- createDataFrame(data)
-  model <- spark.svmLinear(df, label ~ feature, regParam = 0.1)
+  model <- spark.svmLinear(df, label ~ feature, regParam = 0.1, maxIter = 5)
   prediction <- collect(select(predict(model, df), "prediction"))
   expect_equal(sort(prediction$prediction), c("0.0", "0.0", "0.0", "1.0", "1.0"))
 
@@ -77,10 +77,11 @@ test_that("spark.svmLinear", {
   trainidxs <- base::sample(nrow(data), nrow(data) * 0.7)
   traindf <- as.DataFrame(data[trainidxs, ])
   testdf <- as.DataFrame(rbind(data[-trainidxs, ], c(0, "the other")))
-  model <- spark.svmLinear(traindf, clicked ~ ., regParam = 0.1)
+  model <- spark.svmLinear(traindf, clicked ~ ., regParam = 0.1, maxIter = 5)
   predictions <- predict(model, testdf)
   expect_error(collect(predictions))
-  model <- spark.svmLinear(traindf, clicked ~ ., regParam = 0.1, handleInvalid = "skip")
+  model <- spark.svmLinear(traindf, clicked ~ ., regParam = 0.1,
+                           handleInvalid = "skip", maxIter = 5)
   predictions <- predict(model, testdf)
   expect_equal(class(collect(predictions)$clicked[1]), "list")
 



