spark-commits mailing list archives

From yli...@apache.org
Subject spark git commit: [SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use MLVector instead of MLlib Vector
Date Tue, 02 Aug 2016 14:31:42 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 9d9956e8f -> c5516ab60


[SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use MLVector instead of MLlib Vector

## What changes were proposed in this pull request?

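mllib.LDAExample combines an ML pipeline with the MLlib LDA algorithm. The pipeline transforms
the original data into MLVector format (org.apache.spark.ml.linalg.Vector), while MLlib LDA uses
MLlib Vector format (org.apache.spark.mllib.linalg.Vector). This patch matches the pipeline output
as MLVector and converts each feature vector with Vectors.fromML before it is passed to LDA.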
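
The conversion described above can be sketched in isolation. This is a minimal illustration rather than the example's actual code: the object name `VectorConversionSketch`, the helper `toMLlib`, and the sample values are hypothetical, and it assumes Spark 2.0+ with spark-mllib and spark-sql on the classpath.

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector, Vectors => MLVectors}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.sql.Row

// Hypothetical sketch object; not part of LDAExample.
object VectorConversionSketch {

  // Match the single column of a Row as an ML vector and convert it
  // to the MLlib vector type, mirroring the pattern used in the patch.
  def toMLlib(row: Row): Vector = row match {
    case Row(features: MLVector) => Vectors.fromML(features)
  }

  def main(args: Array[String]): Unit = {
    // Sample sparse vector standing in for CountVectorizer output (assumed values).
    val mlFeatures: MLVector = MLVectors.sparse(5, Array(0, 3), Array(2.0, 1.0))

    val converted: Vector = toMLlib(Row(mlFeatures))

    // The converted vector keeps the same size and entries.
    assert(converted.size == 5)
    assert(converted(0) == 2.0 && converted(3) == 1.0)
    println(converted)
  }
}
```

Matching on the ML type and converting explicitly keeps the rest of the example on the MLlib `Vector` type, so the downstream LDA code does not need to change.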

## How was this patch tested?

Tested manually.

Author: Xusen Yin <yinxusen@gmail.com>

Closes #14212 from yinxusen/SPARK-16558.

(cherry picked from commit dd8514fa2059a695143073f852b1abee50e522bd)
Signed-off-by: Yanbo Liang <ybliang8@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c5516ab6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c5516ab6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c5516ab6

Branch: refs/heads/branch-2.0
Commit: c5516ab60da860320693bbc245818cb6d8a282c8
Parents: 9d9956e
Author: Xusen Yin <yinxusen@gmail.com>
Authored: Tue Aug 2 07:28:46 2016 -0700
Committer: Yanbo Liang <ybliang8@gmail.com>
Committed: Tue Aug 2 07:31:32 2016 -0700

----------------------------------------------------------------------
 .../main/scala/org/apache/spark/examples/mllib/LDAExample.scala | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/c5516ab6/examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala
----------------------------------------------------------------------
diff --git a/examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala b/examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala
index 3fbf8e0..ef67841 100644
--- a/examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/mllib/LDAExample.scala
@@ -24,8 +24,9 @@ import scopt.OptionParser
 import org.apache.spark.{SparkConf, SparkContext}
 import org.apache.spark.ml.Pipeline
 import org.apache.spark.ml.feature.{CountVectorizer, CountVectorizerModel, RegexTokenizer, StopWordsRemover}
+import org.apache.spark.ml.linalg.{Vector => MLVector}
 import org.apache.spark.mllib.clustering.{DistributedLDAModel, EMLDAOptimizer, LDA, OnlineLDAOptimizer}
-import org.apache.spark.mllib.linalg.Vector
+import org.apache.spark.mllib.linalg.{Vector, Vectors}
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{Row, SparkSession}
 
@@ -225,7 +226,7 @@ object LDAExample {
     val documents = model.transform(df)
       .select("features")
       .rdd
-      .map { case Row(features: Vector) => features }
+      .map { case Row(features: MLVector) => Vectors.fromML(features) }
       .zipWithIndex()
       .map(_.swap)
 



