spark-issues mailing list archives

From "RJ Nowling (JIRA)" <>
Subject [jira] [Created] (SPARK-2429) Hierarchical Implementation of KMeans
Date Thu, 10 Jul 2014 14:18:04 GMT
RJ Nowling created SPARK-2429:

             Summary: Hierarchical Implementation of KMeans
                 Key: SPARK-2429
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
            Reporter: RJ Nowling
            Priority: Minor

Hierarchical clustering algorithms are widely used and would make a nice addition to MLlib.
Hierarchical clustering is useful for determining relationships between clusters and can also
offer faster assignment of points to clusters. Discussion on the dev list suggested the
following possible approaches:

* Top down, recursive application of KMeans
* Reuse DecisionTree implementation with different objective function
* Hierarchical SVD
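
As a rough illustration of the first approach only (top down, recursive application of KMeans), here is a minimal single-machine sketch in plain Python. It is not MLlib code and not a proposed design: the names kmeans, build_tree and assign, the dict-based tree nodes, and the parameters k, min_size, max_depth, iters and seed are all assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm (hypothetical helper); returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[idx].append(p)
        # An empty cluster keeps its previous center.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


def build_tree(points, k=2, min_size=2, max_depth=5, depth=0):
    """Recursively split points with KMeans; each node is a plain dict."""
    node = {"center": mean(points), "size": len(points), "children": []}
    if depth >= max_depth or len(points) < max(min_size, k):
        return node
    _, clusters = kmeans(points, k)
    parts = [c for c in clusters if c]
    if len(parts) < 2:
        # KMeans could not separate the points, so this node stays a leaf.
        return node
    node["children"] = [
        build_tree(c, k, min_size, max_depth, depth + 1) for c in parts
    ]
    return node


def assign(tree, point):
    """Walk from the root to a leaf, following the nearest child center."""
    node = tree
    while node["children"]:
        node = min(node["children"], key=lambda ch: sq_dist(point, ch["center"]))
    return node["center"]
```

With a tree like this, assigning a point compares it against k child centers per level instead of against every leaf cluster, which is one way the faster assignment mentioned above could arise. A distributed version would need to run each split over an RDD, which this sketch does not attempt.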

It was also suggested that support for distance metrics other than Euclidean, such as negative
dot product or cosine, is necessary.
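
To make the distance-metric point concrete, the sketch below shows one way the metric could be passed in as a function when picking the nearest center. The function names and the nearest helper are hypothetical, and "cosine" is assumed here to mean 1 minus cosine similarity; the issue does not specify a definition or an API.

```python
import math


def dot(a, b):
    """Dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_dot(a, b):
    # Larger dot product means "closer", so negate it to get a dissimilarity.
    # Note this is not a true metric (it can be negative).
    return -dot(a, b)


def cosine_distance(a, b):
    # Assumed definition: 1 - cosine similarity. Undefined for zero vectors.
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot(a, b) / norm


def nearest(point, centers, distance=euclidean):
    """Index of the center with the smallest distance to point."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))


centers = [(1.0, 0.0), (10.0, 10.0)]
point = (3.0, 3.0)
by_euclidean = nearest(point, centers)                 # index 0
by_cosine = nearest(point, centers, cosine_distance)   # index 1
by_neg_dot = nearest(point, centers, negative_dot)     # index 1
```

The same point lands in different clusters depending on the metric, which is why the choice would need to be configurable. Whether the arithmetic mean remains a suitable cluster center under a non-Euclidean metric is a separate question this sketch leaves open.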

This message was sent by Atlassian JIRA
