spark-issues mailing list archives

From "yuhao yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit
Date Tue, 25 Jul 2017 22:13:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100860#comment-16100860 ]

yuhao yang commented on SPARK-21535:
------------------------------------

https://github.com/apache/spark/pulls 

> Reduce memory requirement for CrossValidator and TrainValidationSplit 
> ----------------------------------------------------------------------
>
>                 Key: SPARK-21535
>                 URL: https://issues.apache.org/jira/browse/SPARK-21535
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: yuhao yang
>
> CrossValidator and TrainValidationSplit both use
> {code}models = est.fit(trainingDataset, epm) {code} to fit the models, where epm is Array[ParamMap].
> Even though the training process is sequential, the current implementation consumes extra
> driver memory to hold all of the trained models. This is unnecessary and often leads to
> out-of-memory exceptions in both CrossValidator and TrainValidationSplit. My proposal is to
> optimize the training implementation so that each model can be collected by GC once it has
> been used, avoiding the unnecessary OOM exceptions.
> E.g. when the grid search space has 12 parameter combinations, the old implementation needs
> to hold all 12 trained models in driver memory at the same time, while the new implementation
> only needs to hold 1 trained model at a time, and each previous model can be reclaimed by GC.
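
The memory difference described in the issue can be illustrated with a small simulation. This is only a sketch in plain Python, not Spark code and not the change proposed for SPARK-21535: `ToyModel`, `fit`, `evaluate_all_at_once`, `evaluate_one_at_a_time`, and `peak_live_models` are hypothetical names invented for illustration, and the instance counting assumes CPython, where an object is freed as soon as its last reference is dropped.

```python
import weakref


class ToyModel:
    """Hypothetical stand-in for a trained model; tracks how many instances are alive."""
    live = 0   # instances not yet freed
    peak = 0   # highest value `live` has reached

    def __init__(self, param):
        self.param = param
        ToyModel.live += 1
        ToyModel.peak = max(ToyModel.peak, ToyModel.live)
        # Called when this instance is freed (immediately in CPython).
        weakref.finalize(self, ToyModel._released)

    @staticmethod
    def _released():
        ToyModel.live -= 1

    def score(self):
        # Arbitrary toy metric so both strategies return comparable results.
        return -abs(self.param - 0.3)


def fit(param):
    """Hypothetical training step: one model per parameter setting."""
    return ToyModel(param)


def evaluate_all_at_once(grid):
    # Mirrors `models = est.fit(trainingDataset, epm)`: every model is kept
    # in a list until all of them have been evaluated.
    models = [fit(p) for p in grid]
    return [m.score() for m in models]


def evaluate_one_at_a_time(grid):
    # Fit, evaluate, then drop the reference before fitting the next model.
    metrics = []
    for p in grid:
        model = fit(p)
        metrics.append(model.score())
        del model  # the model becomes collectable before the next fit
    return metrics


def peak_live_models(strategy, grid):
    """Run a strategy and report (peak number of live models, metrics)."""
    ToyModel.live = 0
    ToyModel.peak = 0
    metrics = strategy(grid)
    return ToyModel.peak, metrics


if __name__ == "__main__":
    grid = [i / 10 for i in range(12)]  # 12 parameter settings, as in the example above
    print(peak_live_models(evaluate_all_at_once, grid)[0])
    print(peak_live_models(evaluate_one_at_a_time, grid)[0])
```

Both strategies produce the same metrics; only the number of models alive at the same time differs. Note the explicit `del model`: without it, the previous model would remain referenced while the next one is being fitted.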



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
