spark-commits mailing list archives

From sro...@apache.org
Subject spark git commit: [SPARK-16440][MLLIB] Undeleted broadcast variables in Word2Vec causing OoM for long runs
Date Wed, 13 Jul 2016 10:39:35 GMT
Repository: spark
Updated Branches:
  refs/heads/master 3d6f679cf -> 51ade51a9


[SPARK-16440][MLLIB] Undeleted broadcast variables in Word2Vec causing OoM for long runs

## What changes were proposed in this pull request?

Unpersist broadcasted vars in Word2Vec.fit for more timely / reliable resource cleanup
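
As an illustration of the cleanup pattern described above, here is a minimal sketch, not the code from this patch. The object `BroadcastCleanupSketch`, the method `sumWithLookup`, and the sample data are hypothetical names introduced for the example; only `SparkContext.broadcast`, `Broadcast.value`, and `Broadcast.unpersist()` are Spark APIs. The sketch assumes Spark is on the classpath and uses a local master. Unlike the patch, which calls `unpersist()` at the end of `fit`, it wraps the call in `try`/`finally`, which is a design choice of the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of Spark or of this commit.
object BroadcastCleanupSketch {

  // Broadcasts a lookup table, uses it in one job, then releases the
  // executor-side copies instead of waiting for the context cleaner.
  def sumWithLookup(sc: SparkContext, lookup: Map[String, Int], words: Seq[String]): Long = {
    val bcLookup = sc.broadcast(lookup)
    try {
      sc.parallelize(words)
        .map(w => bcLookup.value.getOrElse(w, 0).toLong)
        .fold(0L)(_ + _)
    } finally {
      // Drops cached copies on the executors; the driver keeps the value,
      // so the variable would be re-broadcast if it were used again.
      bcLookup.unpersist()
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("broadcast-cleanup-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      println(sumWithLookup(sc, Map("a" -> 1, "b" -> 2), Seq("a", "b", "a", "c")))
    } finally {
      sc.stop()
    }
  }
}
```

In a long-running application that trains many models, releasing each broadcast explicitly in this way keeps executor memory from accumulating stale copies between garbage-collection-driven cleanups.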

## How was this patch tested?

Jenkins tests

Author: Sean Owen <sowen@cloudera.com>

Closes #14153 from srowen/SPARK-16440.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/51ade51a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/51ade51a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/51ade51a

Branch: refs/heads/master
Commit: 51ade51a9fd64fc2fe651c505a286e6f29f59d40
Parents: 3d6f679
Author: Sean Owen <sowen@cloudera.com>
Authored: Wed Jul 13 11:39:32 2016 +0100
Committer: Sean Owen <sowen@cloudera.com>
Committed: Wed Jul 13 11:39:32 2016 +0100

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala  | 3 +++
 1 file changed, 3 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/51ade51a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
index f2211df..6b9c8ee 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
@@ -434,6 +434,9 @@ class Word2Vec extends Serializable with Logging {
       bcSyn1Global.unpersist(false)
     }
     newSentences.unpersist()
+    expTable.unpersist()
+    bcVocab.unpersist()
+    bcVocabHash.unpersist()
 
     val wordArray = vocab.map(_.word)
     new Word2VecModel(wordArray.zipWithIndex.toMap, syn0Global)



