spark-user mailing list archives

From Nick Pentreath <nick.pentre...@gmail.com>
Subject Re: [ML] Allow CrossValidation ParamGrid on SVMWithSGD
Date Fri, 19 Jan 2018 12:04:36 GMT
SVMWithSGD lives in the older RDD-based "mllib" package and is not directly
compatible with the DataFrame API, so CrossValidator and ParamGridBuilder
cannot drive it as-is. I suppose one could write an ML-API wrapper around it.
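
Short of writing such a wrapper, a hand-rolled search over the RDD API is
possible. The following is only a sketch, not an ML-API wrapper: it uses a
single train/validation split rather than k-fold cross-validation, and the
helper name gridSearch, the candidate values, and the choice of area under
ROC as the metric are all assumptions made for illustration.

```scala
import org.apache.spark.mllib.classification.SVMWithSGD
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical helper: returns the best (numIterations, regParam, AUC) triple.
def gridSearch(train: RDD[LabeledPoint],
               valid: RDD[LabeledPoint]): (Int, Double, Double) = {
  // Cartesian product of the candidate values (illustrative numbers only).
  val candidates = for {
    numIterations <- Seq(100, 200)
    regParam      <- Seq(0.01, 0.1)
  } yield (numIterations, regParam)

  val scored = candidates.map { case (numIterations, regParam) =>
    val algorithm = new SVMWithSGD()
    algorithm.optimizer
      .setNumIterations(numIterations)
      .setRegParam(regParam)

    val model = algorithm.run(train)
    model.clearThreshold() // predict() now returns raw scores, as AUC needs

    val scoreAndLabels = valid.map(p => (model.predict(p.features), p.label))
    val auc = new BinaryClassificationMetrics(scoreAndLabels).areaUnderROC()
    (numIterations, regParam, auc)
  }

  scored.maxBy(_._3) // keep the setting with the highest validation AUC
}
```

Each candidate triggers a full training run, so it helps to cache both RDDs
before calling the helper.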

However, Spark 2.2.x has LinearSVC in the DataFrame-based "ml" package:
http://spark.apache.org/docs/latest/ml-classification-regression.html#linear-support-vector-machine

I would say you should use that instead.
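
As a minimal sketch of what that could look like (for spark-shell, Spark 2.2
or later): LinearSVC exposes regParam and maxIter as Params, so they can go
straight into ParamGridBuilder. The grid values, the number of folds, and the
"training" DataFrame are assumptions for illustration. Note that LinearSVC
does not use mini-batch SGD, so there is no miniBatchFraction to tune.

```scala
import org.apache.spark.ml.classification.LinearSVC
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

val svc = new LinearSVC()

// 3 regParam values x 2 maxIter values = 6 candidate settings
// (illustrative values, not recommendations).
val paramGrid = new ParamGridBuilder()
  .addGrid(svc.regParam, Array(0.001, 0.01, 0.1))
  .addGrid(svc.maxIter, Array(50, 200))
  .build()

val cv = new CrossValidator()
  .setEstimator(svc)
  .setEvaluator(new BinaryClassificationEvaluator()) // areaUnderROC by default
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(3)

// `training` is a hypothetical DataFrame with "label" and "features" columns:
// val cvModel = cv.fit(training)
```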

On Fri, 19 Jan 2018 at 13:59 Tomasz Dudek <megatrontomaszdudek@gmail.com>
wrote:

> Hello,
>
> Is there any way to use CrossValidator's ParamGrid with SVMWithSGD?
>
> Usually, e.g. with RandomForest, you can specify a lot of parameters in
> order to automate the param grid search (when used with CrossValidator):
>
> val algorithm = new RandomForestClassifier()
> val paramGrid = { new ParamGridBuilder()
>   .addGrid(algorithm.impurity, Array("gini", "entropy"))
>   .addGrid(algorithm.maxDepth, Array(3, 5, 10))
>   .addGrid(algorithm.numTrees, Array(2, 3, 5, 15, 50))
>   .addGrid(algorithm.minInfoGain, Array(0.01, 0.001))
>   .addGrid(algorithm.minInstancesPerNode, Array(10, 50, 500))
>   .build()
> }
>
> With SVMWithSGD, however, the parameters live inside GradientDescent. You
> can set them explicitly, either by using SVMWithSGD's constructor or
> by calling setters on the optimizer:
>
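> import org.apache.spark.mllib.classification.SVMWithSGD
>
> val algorithm = new SVMWithSGD()
> algorithm.optimizer
>   .setMiniBatchFraction(0.5) // a fraction of the data in (0, 1], not a batch size such as 256
>   .setNumIterations(200)
>   .setRegParam(0.01)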
>
> Both ways, however, prevent me from using ParamGridBuilder.
>
> There is no such thing as algorithm.optimizer.numIterations or
> algorithm.optimizer.regParam, only setters (and ParamGrid requires Params,
> not setters).
>
> I could of course create each SVM model manually, build one huge Pipeline
> with each model saving its result to a different column, and then manually
> decide which performed best. That requires a lot of coding, and so far
> CrossValidator's ParamGrid has done that job for me.
>
> Am I missing something? Is it WIP or is there any hack to do that?
>
> Yours,
> Tomasz
>
