spark-user mailing list archives

From pun <>
Subject How to run MLlib's word2vec in CBOW mode?
Date Thu, 28 Sep 2017 13:55:45 GMT
My understanding is that word2vec can be run in two modes:
  continuous bag-of-words (CBOW), where the order of words does not matter
  continuous skip-gram, where the order of words matters
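
To make the distinction concrete, here is a minimal sketch in plain Scala of how the two modes form their (input, target) training examples. It does not use Spark or MLlib at all; the object name Word2VecModes, the helper names cbowExamples and skipGramExamples, and the window parameter are hypothetical and exist only for illustration.

```scala
// Illustrative only: shows how training examples are formed in each mode.
// All names here are hypothetical; nothing below is part of Spark's API.
object Word2VecModes {

  // CBOW: the surrounding context words (as a bag) are used to predict the center word.
  def cbowExamples(tokens: Seq[String], window: Int): Seq[(Seq[String], String)] =
    tokens.indices.map { i =>
      val context = (math.max(0, i - window) to math.min(tokens.length - 1, i + window))
        .filter(_ != i)   // drop the center position itself
        .map(tokens)      // look up the word at each remaining position
      (context, tokens(i))
    }

  // Skip-gram: the center word is used to predict each context word separately.
  def skipGramExamples(tokens: Seq[String], window: Int): Seq[(String, String)] =
    for {
      i <- tokens.indices
      j <- math.max(0, i - window) to math.min(tokens.length - 1, i + window)
      if j != i
    } yield (tokens(i), tokens(j))

  def main(args: Array[String]): Unit = {
    val tokens = Seq("the", "quick", "brown", "fox")
    println("CBOW (context -> center):")
    cbowExamples(tokens, 1).foreach(println)
    println("Skip-gram (center -> context word):")
    skipGramExamples(tokens, 1).foreach(println)
  }
}
```

CBOW yields one example per position, with the whole context window as input, while skip-gram yields one example per (center, context) pair.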
I would like to run the *CBOW* implementation from Spark's MLlib, but it is
not clear to me from the documentation and their example how to do it. 
This is the example listed on their page. From:
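
    import org.apache.spark.mllib.feature.{Word2Vec, Word2VecModel}

    // sc is the SparkContext provided by spark-shell.
    // Each line of the file becomes one sequence of tokens.
    val input = sc.textFile("data/mllib/sample_lda_data.txt").map(line => line.split(" ").toSeq)

    // Train a model with the default settings.
    val word2vec = new Word2Vec()
    val model = word2vec.fit(input)

    // Print the 5 nearest words to "1" with their cosine similarity.
    val synonyms = model.findSynonyms("1", 5)
    for ((synonym, cosineSimilarity) <- synonyms) {
      println(s"$synonym $cosineSimilarity")
    }
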
*My questions:*
Which of the two modes does this example use?
Do you know how I can run the model in the CBOW mode?
Thanks in advance!
