spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-7809) MultivariateOnlineSummarizer should allow users to configure what to compute
Date Sun, 24 May 2015 10:04:17 GMT

    [ https://issues.apache.org/jira/browse/SPARK-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557695#comment-14557695 ]

Apache Spark commented on SPARK-7809:
-------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/6388

> MultivariateOnlineSummarizer should allow users to configure what to compute
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-7809
>                 URL: https://issues.apache.org/jira/browse/SPARK-7809
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.4.0
>            Reporter: Xiangrui Meng
>
> Currently, MultivariateOnlineSummarizer computes every summary statistic it can provide, which
> is fine and convenient for a small number of features. If the feature dimension is large, this
> becomes expensive. So we should add setters to allow users to configure what to compute.
> {code}
> val summarizer = new MultivariateOnlineSummarizer()
>   .withMean(false)
>   .withMax(false)
> {code}
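
For illustration only, here is a minimal sketch of how such setters could gate
the per-feature work. SketchSummarizer is a hypothetical, simplified stand-in
written in plain Scala. It is not Spark's MultivariateOnlineSummarizer and not
the code in the pull request. It tracks only mean and max, takes Array[Double]
samples instead of MLlib vectors, and computes the mean from running sums.
The setter names withMean and withMax are taken from the proposal above;
everything else is an assumption.

```scala
// Hypothetical sketch: a summarizer whose statistics can be switched off
// before any sample is added. Not Spark's implementation.
final class SketchSummarizer {
  private var computeMean = true
  private var computeMax = true
  private var dim = -1                      // feature dimension, fixed by the first sample
  private var count = 0L
  private var sums: Array[Double] = null    // allocated only if the mean is enabled
  private var maxes: Array[Double] = null   // allocated only if the max is enabled

  private def requireEmpty(): Unit =
    require(count == 0L, "configure the summarizer before adding samples")

  // Builder-style setters, mirroring the names proposed in the issue.
  def withMean(flag: Boolean): this.type = { requireEmpty(); computeMean = flag; this }
  def withMax(flag: Boolean): this.type = { requireEmpty(); computeMax = flag; this }

  def add(sample: Array[Double]): this.type = {
    if (count == 0L) {
      dim = sample.length
      if (computeMean) sums = new Array[Double](dim)
      if (computeMax) maxes = Array.fill(dim)(Double.NegativeInfinity)
    }
    require(sample.length == dim, s"expected dimension $dim, got ${sample.length}")
    var i = 0
    while (i < dim) {
      // Disabled statistics cost neither memory nor per-feature updates.
      if (computeMean) sums(i) += sample(i)
      if (computeMax && sample(i) > maxes(i)) maxes(i) = sample(i)
      i += 1
    }
    count += 1
    this
  }

  def mean: Array[Double] = {
    require(computeMean, "mean was disabled via withMean(false)")
    require(count > 0L, "no samples added")
    sums.map(_ / count)
  }

  def max: Array[Double] = {
    require(computeMax, "max was disabled via withMax(false)")
    require(count > 0L, "no samples added")
    maxes.clone()
  }
}
```

In this sketch, new SketchSummarizer().withMean(false) skips both the
allocation and the per-feature update for the mean, and asking for a disabled
statistic fails with an IllegalArgumentException rather than returning a stale
or empty result. Whether the real change should throw, or return something
else, is a design choice this sketch does not settle.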



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

