spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-5890) Add FeatureDiscretizer
Date Thu, 13 Aug 2015 05:12:46 GMT

     [ https://issues.apache.org/jira/browse/SPARK-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-5890:
---------------------------------
    Parent Issue: SPARK-9930  (was: SPARK-8521)

> Add FeatureDiscretizer
> ----------------------
>
>                 Key: SPARK-5890
>                 URL: https://issues.apache.org/jira/browse/SPARK-5890
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xiangrui Meng
>            Assignee: Xusen Yin
>
> A `FeatureDiscretizer` takes a column with continuous features and outputs a column with binned categorical features.
> {code}
> val fd = new FeatureDiscretizer()
>   .setInputCol("age")
>   .setNumBins(32)
>   .setOutputCol("ageBins")
> {code}
> This should be an automatic feature discretizer that uses a simple algorithm, such as approximate quantiles, to discretize features. It should set the ML attribute correctly in the output column.
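
For illustration only, here is a minimal sketch of the quantile-based binning idea described in the issue. It is plain Scala over an in-memory sample and is not the proposed Spark implementation: the object `QuantileBinningSketch` and the helpers `findSplits` and `binOf` are hypothetical names, and the sketch computes exact quantiles of a sorted sample, where the issue suggests approximate quantiles. It also does not set any ML attribute on an output column.

```scala
// Hypothetical helper, not Spark's API: quantile-based binning over an in-memory sample.
object QuantileBinningSketch {

  /** Returns up to numBins - 1 strictly increasing split points,
    * taken at evenly spaced quantiles of the (exactly sorted) sample. */
  def findSplits(sample: Seq[Double], numBins: Int): Array[Double] = {
    require(numBins >= 2, "numBins must be at least 2")
    require(sample.nonEmpty, "sample must not be empty")
    val sorted = sample.sorted.toArray
    val candidates = (1 until numBins).map { i =>
      // Position of the i-th of numBins quantiles, clamped to the last element.
      val idx = math.min(sorted.length - 1, (i.toLong * sorted.length / numBins).toInt)
      sorted(idx)
    }
    // Duplicate quantiles (heavily repeated values) collapse into a single split.
    candidates.distinct.toArray
  }

  /** Maps a value to a bin index: the number of splits that are <= value. */
  def binOf(value: Double, splits: Array[Double]): Int =
    splits.count(_ <= value)

  def main(args: Array[String]): Unit = {
    val ages = Seq(3.0, 17.0, 21.0, 25.0, 34.0, 42.0, 58.0, 71.0)
    val splits = findSplits(ages, numBins = 4)
    println(splits.mkString(", "))                     // 21.0, 34.0, 58.0
    println(ages.map(binOf(_, splits)).mkString(", ")) // 0, 0, 1, 1, 2, 2, 3, 3
  }
}
```

With `numBins = 4` the eight sample ages fall two per bin. If the input contains many repeated values, the sketch returns fewer splits than requested, so the effective number of bins can be smaller than `numBins`.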



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

