spark-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sro...@apache.org
Subject spark git commit: [SPARK-5273][MLLIB][DOCS] Improve documentation examples for LinearRegression
Date Tue, 12 Jan 2016 13:26:43 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.6 3221a7d91 -> 4c67d55c0


[SPARK-5273][MLLIB][DOCS] Improve documentation examples for LinearRegression

Use a much smaller step size in LinearRegressionWithSGD MLlib examples to achieve a reasonable
RMSE.

Our training folks hit this exact same issue when concocting an example and had the same solution.

Author: Sean Owen <sowen@cloudera.com>

Closes #10675 from srowen/SPARK-5273.

(cherry picked from commit 9c7f34af37ef328149c1d66b4689d80a1589e1cc)
Signed-off-by: Sean Owen <sowen@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4c67d55c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4c67d55c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4c67d55c

Branch: refs/heads/branch-1.6
Commit: 4c67d55c0ccf086e91d1755b62c8526f2ff51f21
Parents: 3221a7d
Author: Sean Owen <sowen@cloudera.com>
Authored: Tue Jan 12 12:13:32 2016 +0000
Committer: Sean Owen <sowen@cloudera.com>
Committed: Tue Jan 12 13:26:37 2016 +0000

----------------------------------------------------------------------
 docs/mllib-linear-methods.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/4c67d55c/docs/mllib-linear-methods.md
----------------------------------------------------------------------
diff --git a/docs/mllib-linear-methods.md b/docs/mllib-linear-methods.md
index 20b3561..aac8f75 100644
--- a/docs/mllib-linear-methods.md
+++ b/docs/mllib-linear-methods.md
@@ -590,7 +590,8 @@ val parsedData = data.map { line =>
 
 // Building the model
 val numIterations = 100
-val model = LinearRegressionWithSGD.train(parsedData, numIterations)
+val stepSize = 0.00000001
+val model = LinearRegressionWithSGD.train(parsedData, numIterations, stepSize)
 
 // Evaluate model on training examples and compute training error
 val valuesAndPreds = parsedData.map { point =>
@@ -655,8 +656,9 @@ public class LinearRegression {
 
     // Building the model
     int numIterations = 100;
+    double stepSize = 0.00000001;
     final LinearRegressionModel model =
-      LinearRegressionWithSGD.train(JavaRDD.toRDD(parsedData), numIterations);
+      LinearRegressionWithSGD.train(JavaRDD.toRDD(parsedData), numIterations, stepSize);
 
     // Evaluate model on training examples and compute training error
     JavaRDD<Tuple2<Double, Double>> valuesAndPreds = parsedData.map(
@@ -706,7 +708,7 @@ data = sc.textFile("data/mllib/ridge-data/lpsa.data")
 parsedData = data.map(parsePoint)
 
 # Build the model
-model = LinearRegressionWithSGD.train(parsedData)
+model = LinearRegressionWithSGD.train(parsedData, iterations=100, step=0.00000001)
 
 # Evaluate the model on training data
 valuesAndPreds = parsedData.map(lambda p: (p.label, model.predict(p.features)))


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org


Mime
View raw message