spark-dev mailing list archives

From jinhong lu <lujinho...@gmail.com>
Subject how to retain part of the features in LogisticRegressionModel (spark2.0)
Date Sun, 19 Mar 2017 10:12:07 GMT

I train my LogisticRegressionModel as shown below. I want the model to retain only some of the
features (e.g. 500 of them), not all 5555. What should I do?
I use .setElasticNetParam(1.0), but all the features are still present in lrModel.coefficients.

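	  import org.apache.spark.ml.classification.LogisticRegression

	  // Load libsvm-format data with the feature dimension fixed at 5555
	  val data = spark.read.format("libsvm").option("numFeatures", "5555").load("/tmp/data/training_data3")

	  // 50/50 train/test split with a fixed seed
	  val Array(trainingData, testData) = data.randomSplit(Array(0.5, 0.5), seed = 1234L)

	  // elasticNetParam = 1.0 requests a pure L1 penalty; regParam is left at its default
	  val lr = new LogisticRegression().setElasticNetParam(1.0)
	  val lrModel = lr.fit(trainingData)
	  println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

	  // Score the held-out half
	  val predictions = lrModel.transform(testData)
	  predictions.show()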
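
One way to check how many features a fitted model actually retains is to count the non-zero entries of its coefficient vector. The following is only a minimal sketch, assuming a spark-shell session with Spark ML on the classpath; `countRetained` is a hypothetical helper (not part of Spark), and the vector is made up for illustration, standing in for lrModel.coefficients.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}

// Hypothetical helper (not a Spark API): number of features with a non-zero weight.
def countRetained(coefficients: Vector): Int = coefficients.numNonzeros

// Made-up coefficient vector, standing in for lrModel.coefficients.
val example = Vectors.dense(0.0, 1.5, 0.0, -0.3)
println(s"Retained features: ${countRetained(example)} of ${example.size}")
```

Applied to the model above, the call would be countRetained(lrModel.coefficients).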


Thanks, 
lujinhong

