spark-dev mailing list archives

From Nick Pentreath <nick.pentre...@gmail.com>
Subject Re: CrossValidation distribution - is it in the roadmap?
Date Wed, 29 Nov 2017 18:58:46 GMT
Hi Tomasz

Parallel evaluation for CrossValidation and TrainValidationSplit was added
for Spark 2.3 in https://issues.apache.org/jira/browse/SPARK-19357
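
A minimal sketch of how that might look, assuming Spark 2.3 or later and the
setParallelism setter on CrossValidator. The LogisticRegression estimator, the
grid values, the evaluator, and the trainingDf name are hypothetical and stand
in for whatever the real pipeline uses:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Hypothetical estimator and pipeline, for illustration only.
val lr = new LogisticRegression()
val pipeline = new Pipeline().setStages(Array(lr))

// Hypothetical grid: 3 x 3 = 9 parameter combinations.
val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.01, 0.1, 1.0))
  .addGrid(lr.elasticNetParam, Array(0.0, 0.5, 1.0))
  .build()

val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(3)
  .setParallelism(4) // fit/evaluate up to 4 param combinations concurrently

// trainingDf is a hypothetical DataFrame with "features" and "label" columns:
// val cvModel = cv.fit(trainingDf)
```

The parallelism value defaults to 1, which keeps the serial behaviour. Each
individual model fit is still an ordinary distributed Spark job; the setting
only controls how many of those fits are submitted at the same time, so a
sensible value likely depends on cluster size and how well a single fit
already saturates the executors.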


On Wed, 29 Nov 2017 at 16:31 Tomasz Dudek <megatrontomaszdudek@gmail.com>
wrote:

> Hey,
>
> is there a way to make the following code:
>
> val paramGrid = new ParamGridBuilder(). // omitted for brevity - let's say we
> // have hundreds of param combinations here
>
> val cv = new CrossValidator()
>   .setNumFolds(3)
>   .setEstimator(pipeline)
>   .setEstimatorParamMaps(paramGrid)
>
> automatically distribute itself over all the executors? What I mean is
> to simultaneously compute a few (or hundreds of) ML models, instead of
> using all the computation power on just one model at a time.
>
> If not, is such behavior on Spark's roadmap?
>
> ...if not, do you think a person without prior Spark development
> experience (me) could do it? I have been using SparkML daily at work for
> a few months. How much time would it take, approximately?
>
> Yours,
> Tomasz
>
>
>
