spark-dev mailing list archives

From dataginjaninja <>
Subject Standard preprocessing/scaling
Date Wed, 28 May 2014 12:03:45 GMT
I searched for this but didn't find anything general, so I apologize if it
has already been addressed.

Many algorithms (SGD, SVM, ...) either fail to converge or take a very long
time to run if the data is not scaled. scikit-learn has a preprocessing
module that subtracts the mean and divides by the standard deviation. Of
course, it offers a few options as well.
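
To make the request concrete, here is a rough sketch of the kind of column-wise scaling I mean. It is plain Python, not Spark or scikit-learn code. The function name `standardize` is made up for illustration, the `with_mean` and `with_std` flags are only meant to mirror the sort of options scikit-learn exposes, and the use of the population standard deviation (dividing by n) is an assumption.

```python
import math


def standardize(rows, with_mean=True, with_std=True):
    """Column-wise standardization of a list of equal-length numeric rows.

    Hypothetical helper for illustration only: subtracts each column's mean
    and divides by its population standard deviation (an assumption).
    """
    n = len(rows)
    if n == 0:
        return []
    dims = len(rows[0])

    # Per-column mean.
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    # Per-column population standard deviation (divide by n, not n - 1).
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
        for j in range(dims)
    ]

    scaled = []
    for r in rows:
        out = []
        for j in range(dims):
            v = r[j] - means[j] if with_mean else r[j]
            # Skip the division for constant columns to avoid dividing by zero.
            if with_std and stds[j] > 0:
                v /= stds[j]
            out.append(v)
        scaled.append(out)
    return scaled


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize(data)
```

After this, each column of `scaled` has mean 0 and unit variance even though the original columns differ in scale by a factor of ten. A distributed version would need to compute the per-column means and variances in a pass over the data before scaling, which is why built-in support would be useful.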

Is there something in the works for this?
