From: jinhong lu
Subject: how to retain part of the features in LogisticRegressionModel (spark2.0)
Date: Sun, 19 Mar 2017 18:12:07 +0800
To: spark users, dev@spark.apache.org
Cc: 陆锦洪

I train my LogisticRegressionModel like this. I want my model to retain only some of the features (e.g. 500 of them), not all 5555 features. What should I do?

I use .setElasticNetParam(1.0), but all the features are still in lrModel.coefficients.

    import org.apache.spark.ml.classification.LogisticRegression

    val data = spark.read.format("libsvm").option("numFeatures", "5555").load("/tmp/data/training_data3")
    val Array(trainingData, testData) = data.randomSplit(Array(0.5, 0.5), seed = 1234L)

    val lr = new LogisticRegression()
    val lrModel = lr.fit(trainingData)
    println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

    val predictions = lrModel.transform(testData)
    predictions.show()

Thanks,
lujinhong

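
One point behind the question above: lrModel.coefficients always holds one entry per input feature, so L1 regularization can only set entries to zero, not shorten the vector. With the default regParam of 0.0 no penalty is applied at all, whatever elasticNetParam is set to. Below is a minimal sketch, in plain Scala without Spark, of one possible way to choose which features to keep: rank the coefficients by absolute value and take the top k. The object TopKFeatures and its methods are hypothetical helpers, not part of Spark, and the sample values are made up; ranking by coefficient magnitude is an assumption about what "retain" should mean here, not something the original message specifies.

```scala
// Hypothetical helper, not part of Spark: selects the k features whose
// coefficients have the largest absolute value.
object TopKFeatures {

  /** Indices of the k largest-magnitude coefficients, returned in ascending order. */
  def topKIndices(coefficients: Array[Double], k: Int): Array[Int] = {
    require(k >= 0 && k <= coefficients.length,
      "k must be between 0 and the number of features")
    coefficients.indices
      // Largest magnitude first; sortWith is stable, so ties keep index order.
      .sortWith((a, b) => math.abs(coefficients(a)) > math.abs(coefficients(b)))
      .take(k)
      .sorted
      .toArray
  }

  /** Number of coefficients that are exactly zero, e.g. after an L1 penalty. */
  def countZeros(coefficients: Array[Double]): Int =
    coefficients.count(_ == 0.0)

  def main(args: Array[String]): Unit = {
    // Toy stand-in for lrModel.coefficients.toArray (values are invented).
    val coefs = Array(0.0, -2.5, 0.3, 0.0, 1.7, -0.1)
    println(topKIndices(coefs, 3).mkString(",")) // prints 1,2,4
    println(countZeros(coefs))                   // prints 2
  }
}
```

In a Spark session the input array could come from lrModel.coefficients.toArray, and the resulting indices could be passed to org.apache.spark.ml.feature.VectorSlicer via setIndices to build a reduced feature column before refitting. That wiring is not shown here and would need to be checked against the Spark version in use.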