spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-23568) Silhouette should get number of features from metadata if available
Date Fri, 02 Mar 2018 17:32:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-23568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383849#comment-16383849 ]

Apache Spark commented on SPARK-23568:
--------------------------------------

User 'mgaido91' has created a pull request for this issue:
https://github.com/apache/spark/pull/20719

> Silhouette should get number of features from metadata if available
> -------------------------------------------------------------------
>
>                 Key: SPARK-23568
>                 URL: https://issues.apache.org/jira/browse/SPARK-23568
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.4.0
>            Reporter: Marco Gaido
>            Priority: Minor
>
> In the Silhouette computation we need to know the number of features. Currently this is done
by taking the first row and checking the size of its features vector. Although this works fine,
if the number of attributes is present in the metadata of the column, we can avoid the additional
job that is triggered by calling `first`.
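
The idea in the description can be sketched in plain Python, without Spark, as a metadata lookup with a fallback. This is only an illustration under stated assumptions, not the code from the pull request: the function names are hypothetical, the column metadata is modeled as a nested dict, and the `ml_attr` / `num_attrs` keys are assumed to be where the attribute count is stored. The fallback callable stands in for the job that `first` would trigger.

```python
from typing import Callable, Mapping, Optional


def num_features_from_metadata(column_metadata: Mapping) -> Optional[int]:
    """Return the attribute count recorded in the metadata, or None if absent.

    Assumption: the count lives under metadata["ml_attr"]["num_attrs"].
    """
    ml_attr = column_metadata.get("ml_attr", {})
    num_attrs = ml_attr.get("num_attrs")
    if isinstance(num_attrs, int) and num_attrs > 0:
        return num_attrs
    return None


def get_num_features(
    column_metadata: Mapping,
    size_of_first_row: Callable[[], int],
) -> int:
    """Prefer the metadata; only run the fallback when it has no count.

    `size_of_first_row` is a hypothetical stand-in for taking the first row
    and measuring its features vector, which costs an extra job.
    """
    from_metadata = num_features_from_metadata(column_metadata)
    if from_metadata is not None:
        return from_metadata
    return size_of_first_row()
```

Passing the fallback as a callable keeps it lazy: it is invoked only when the metadata carries no usable count, which mirrors the saving the issue describes.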



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

