spark-issues mailing list archives

From "Lei Wang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-17836) Use cross validation to determine the number of clusters for EM or KMeans algorithms
Date Sat, 08 Oct 2016 08:40:20 GMT
Lei Wang created SPARK-17836:
--------------------------------

             Summary: Use cross validation to determine the number of clusters for EM or KMeans algorithms
                 Key: SPARK-17836
                 URL: https://issues.apache.org/jira/browse/SPARK-17836
             Project: Spark
          Issue Type: Bug
          Components: ML
            Reporter: Lei Wang


Sometimes it is not easy for users to determine the number of clusters.
It would be very useful if Spark ML could support this.
There are several methods for doing so, according to Wikipedia: https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set
Weka uses cross-validation.
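
To illustrate the kind of cross-validation meant here, below is a minimal sketch in plain Python. It is not Spark ML code and not Weka's implementation: the helper names (fit_gmm_1d, cv_log_likelihood), the 1-D Gaussian mixture, the toy data, and the "pick the k with the highest held-out log-likelihood" rule are all assumptions made for illustration. For each candidate k, it fits a mixture with EM on the training folds and scores the held-out fold by its average log-likelihood.

```python
import math
import random


def component_logs(x, model):
    """Log of weight * Normal(x | mean, variance) for every mixture component."""
    weights, means, variances = model
    return [
        math.log(w) - 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for w, m, v in zip(weights, means, variances)
    ]


def fit_gmm_1d(data, k, iters=50, seed=0):
    """Fit a 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    rng = random.Random(seed)
    means = rng.sample(data, k)  # initialise means from distinct data points
    centre = sum(data) / len(data)
    spread = sum((x - centre) ** 2 for x in data) / len(data) + 1e-6
    model = ([1.0 / k] * k, means, [spread] * k)
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            logs = component_logs(x, model)
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means and variances.
        weights, means, variances = [], [], []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            weights.append(nj / len(data))
            means.append(mean)
            variances.append(var + 1e-6)  # variance floor avoids collapse
        model = (weights, means, variances)
    return model


def mean_log_likelihood(data, model):
    """Average log-likelihood of the data under the mixture (log-sum-exp)."""
    total = 0.0
    for x in data:
        logs = component_logs(x, model)
        top = max(logs)
        total += top + math.log(sum(math.exp(v - top) for v in logs))
    return total / len(data)


def cv_log_likelihood(data, k, folds=5, seed=0):
    """Held-out log-likelihood for k components, averaged over the folds."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    total = 0.0
    for f in range(folds):
        test = shuffled[f::folds]
        train = [x for i, x in enumerate(shuffled) if i % folds != f]
        total += mean_log_likelihood(test, fit_gmm_1d(train, k, seed=seed + f))
    return total / folds


# Toy data (an assumption): three well-separated 1-D clusters.
rng = random.Random(42)
data = [rng.gauss(c, 1.0) for c in (0.0, 10.0, 20.0) for _ in range(60)]

scores = {k: cv_log_likelihood(data, k) for k in range(1, 6)}
best_k = max(scores, key=scores.get)
for k in sorted(scores):
    print(k, round(scores[k], 3))
print("selected k:", best_k)
```

Held-out log-likelihood fits EM naturally. For KMeans, an analogous score would be the held-out sum of squared distances to the nearest center, but that cost tends to keep decreasing as k grows, so a different selection rule (for example, an elbow criterion) would probably be needed.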



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

