commons-issues mailing list archives

From "Nate Paymer (JIRA)" <>
Subject [jira] Created: (MATH-548) KMeansPlusPlusClusterer should run multiple trials
Date Sat, 12 Mar 2011 05:13:04 GMT
KMeansPlusPlusClusterer should run multiple trials

                 Key: MATH-548
             Project: Commons Math
          Issue Type: Improvement
            Reporter: Nate Paymer
            Priority: Minor

The interface and documentation for KMeansPlusPlusClusterer imply that a single call to cluster()
is sufficient to get the optimal set of clusters. But this isn't true: k-means only converges to a
local optimum that depends on the randomly chosen initial centers, so practically every client
should be calling cluster() multiple times and selecting the best resulting set of clusters.
It seems to me that rather than forcing every client to implement this functionality, it
should be placed directly in the KMeansPlusPlusClusterer class.

I propose adding a new method to KMeansPlusPlusClusterer:
  List<Cluster<T>> cluster(Collection<T> points, int k, int numTrials, int
which calls the existing cluster() method numTrials times, returning the best result.
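
To make the proposal concrete, here is a rough sketch of the "run numTrials times, keep the best" loop. It is not the Commons Math implementation. The class name BestOfTrials, the run() helper, and the idea of passing a scoring function are all hypothetical, and the sketch assumes "best" means "lowest score" (for example, total within-cluster squared distance), which the description above does not specify. It also uses Java 8 lambdas purely for brevity.

```java
import java.util.function.Supplier;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of Commons Math.
final class BestOfTrials {
    private BestOfTrials() {}

    /**
     * Runs the given trial numTrials times and returns the result with the
     * lowest score. Assumption: a lower score means a better clustering.
     */
    static <R> R run(int numTrials, Supplier<R> trial, ToDoubleFunction<R> score) {
        if (numTrials < 1) {
            throw new IllegalArgumentException("numTrials must be >= 1");
        }
        R best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int i = 0; i < numTrials; i++) {
            R candidate = trial.get();               // one independent run
            double s = score.applyAsDouble(candidate);
            if (best == null || s < bestScore) {     // always keep the first result
                best = candidate;
                bestScore = s;
            }
        }
        return best;
    }
}
```

In the clusterer itself, the trial would be a call to the existing cluster() method and the score would be computed from the returned clusters. Each trial has to draw fresh random initial centers, otherwise repeating it gains nothing.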

This message is automatically generated by JIRA.
