spark-issues mailing list archives

From "Anthony Truchet (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
Date Thu, 21 Jul 2016 08:13:20 GMT

    [ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387342#comment-15387342 ]

Anthony Truchet commented on SPARK-16440:
-----------------------------------------

Regarding the try/finally: we run many trainings within the same Spark context, some with
vocabularies so large that they fail (yes, we do try to filter out the ones that are too big,
but "too big" is difficult to define).

So we are in a context where we do care about cleaning up resources when an error occurs,
in order to enable thousands of successive trainings, some of which are expected to fail.

As for code readability, we can try to refactor the function to reduce the nesting, or find
a "nice" Scala solution. I'll propose a patch and welcome any feedback on it.
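
To make the try/finally discussion concrete, here is a minimal sketch of one way to keep the
cleanup out of the main body, using a loan-pattern helper. It is not the proposed patch.
FakeBroadcast, release() and releasingAfter are hypothetical names introduced only so the
sketch runs without Spark; with real broadcast variables the release step would presumably
be a call such as unpersist() or destroy() on each Broadcast.

```scala
// Hypothetical stand-in for a Spark broadcast variable (NOT the Spark API),
// used only so this sketch is runnable without a SparkContext.
final class FakeBroadcast[T](val value: T) {
  private var released = false
  // Assumption: stands in for something like Broadcast.unpersist() / destroy().
  def release(): Unit = { released = true }
  def isReleased: Boolean = released
}

object BroadcastCleanup {
  // Loan pattern: evaluate `body`, then release every resource,
  // whether `body` returned normally or threw an exception.
  def releasingAfter[R](resources: Seq[FakeBroadcast[_]])(body: => R): R =
    try {
      body
    } finally {
      resources.foreach(_.release())
    }
}
```

With such a helper, the training code stays at one nesting level: the three broadcasts are
created up front, passed as the resource list, and the training logic goes in the body.
A failure inside the body still propagates to the caller, but only after the cleanup has run.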

> Undeleted broadcast variables in Word2Vec causing OoM for long runs 
> --------------------------------------------------------------------
>
>                 Key: SPARK-16440
>                 URL: https://issues.apache.org/jira/browse/SPARK-16440
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
>            Reporter: Anthony Truchet
>            Assignee: Anthony Truchet
>             Fix For: 1.6.3, 2.0.1
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> Three broadcast variables created at the beginning of {{Word2Vec.fit()}} are never deleted
> nor unpersisted. This seems to cause excessive memory consumption on the driver for a job
> running hundreds of successive trainings.
> They are:
> {code}
>     val expTable = sc.broadcast(createExpTable())
>     val bcVocab = sc.broadcast(vocab)
>     val bcVocabHash = sc.broadcast(vocabHash)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

