spark-issues mailing list archives

From "Joseph K. Bradley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-5895) Add VectorSlicer
Date Fri, 24 Apr 2015 21:38:40 GMT

    [ https://issues.apache.org/jira/browse/SPARK-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511820#comment-14511820 ]

Joseph K. Bradley commented on SPARK-5895:
------------------------------------------

Based on the updated [https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark]
guide, it sounds like the practice is: Go ahead and post here that you're beginning work,
start working on it and submit the PR, and then the Assignee field will be filled after the
PR is merged.  So please go ahead---thanks!

> Add VectorSlicer
> ----------------
>
>                 Key: SPARK-5895
>                 URL: https://issues.apache.org/jira/browse/SPARK-5895
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xiangrui Meng
>
> `VectorSlicer` takes a vector column and outputs a vector column with a subset of features.
> {code}
> val vs = new VectorSlicer()
>   .setInputCol("user")
>   .setSelectedFeatures("age", "salary")
>   .setOutputCol("usefulUserFeatures")
> {code}
> We should allow specifying selected features by indices and by names. The output should
> preserve the names of the selected features.
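
The slicing behavior requested above (select by index or by name, and keep the
selected features' names in the output) can be illustrated with a small,
dependency-free sketch. Everything here is hypothetical and for illustration
only: `NamedVector`, `SliceSketch`, `sliceByIndices`, and `sliceByNames` are
not part of Spark, and the sample values are made up. It only shows the
intended semantics, not how the transformer would be implemented.

```scala
// Hypothetical sketch of the requested semantics; NOT the Spark ML API.
// A vector whose features each carry a name.
final case class NamedVector(names: Vector[String], values: Vector[Double]) {
  require(names.length == values.length, "each feature needs exactly one name")
}

object SliceSketch {
  /** Select features by position; each selected feature keeps its name. */
  def sliceByIndices(v: NamedVector, indices: Seq[Int]): NamedVector = {
    require(indices.forall(i => i >= 0 && i < v.values.length), "index out of range")
    NamedVector(indices.map(v.names).toVector, indices.map(v.values).toVector)
  }

  /** Select features by name by first resolving each name to its position. */
  def sliceByNames(v: NamedVector, selected: Seq[String]): NamedVector = {
    val position = v.names.zipWithIndex.toMap
    val missing = selected.filterNot(position.contains)
    require(missing.isEmpty, s"unknown feature names: ${missing.mkString(", ")}")
    sliceByIndices(v, selected.map(position))
  }
}

// Usage with made-up data: keep only "age" and "salary" from a "user" vector.
val user = NamedVector(Vector("age", "salary", "zip"), Vector(31.0, 52000.0, 94107.0))
val usefulUserFeatures = SliceSketch.sliceByNames(user, Seq("age", "salary"))
```

Resolving names to indices first means both selection modes share one code
path, so the name-preservation rule only has to be implemented once.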


