spark-dev mailing list archives

From dataginjaninja <rickett.stepha...@gmail.com>
Subject Standard preprocessing/scaling
Date Wed, 28 May 2014 12:03:45 GMT
I searched for this but didn't find anything general, so I apologize if this
has already been addressed.

Many algorithms (SGD, SVM...) either will not converge or will take a very
long time to run if the data is not scaled. scikit-learn has preprocessing.scale
<http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html>,
which subtracts the mean and divides by the standard deviation. Of course, there
are a few options with it as well.
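
To make the request concrete, here is a minimal sketch of the kind of scaling
meant above, in plain Python. The function name `standardize` is hypothetical
(it is not a Spark or scikit-learn API), the `with_mean` / `with_std` flags are
only modeled on the scikit-learn options, and both the use of the population
standard deviation and the handling of constant columns are assumptions.

```python
import math


def standardize(rows, with_mean=True, with_std=True):
    """Column-wise scaling: (x - mean) / std for every feature.

    Hypothetical helper for illustration only.
    """
    n = len(rows)
    if n == 0:
        return []
    dims = len(rows[0])

    # Per-column mean.
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    # Per-column population standard deviation (divide by n); an assumption.
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
        for j in range(dims)
    ]

    scaled = []
    for r in rows:
        out = []
        for j in range(dims):
            v = r[j] - means[j] if with_mean else r[j]
            # Assumption: constant columns (std == 0) are left undivided.
            if with_std and stds[j] > 0:
                v /= stds[j]
            out.append(v)
        scaled.append(out)
    return scaled


# Two features on very different scales.
data = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = standardize(data)
```

After scaling, both columns have mean 0 and unit variance, so neither feature
dominates a gradient step purely because of its units.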

Is there something in the works for this?



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Standard-preprocessing-scaling-tp6826.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
