commons-issues mailing list archives

From Thorsten Schäfer (JIRA) <j...@apache.org>
Subject [jira] [Commented] (MATH-1031) Refactoring: Move variance calculation of a centroid cluster to its class
Date Mon, 02 Sep 2013 01:06:51 GMT

    [ https://issues.apache.org/jira/browse/MATH-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755853#comment-13755853 ]

Thorsten Schäfer commented on MATH-1031:
----------------------------------------

Yes, if there is a need for additional flexibility, your solution seems better. The ClusterEvaluation
could also be used in a divisive hierarchical clustering algorithm to choose which cluster
to split next.
                
> Refactoring: Move variance calculation of a centroid cluster to its class
> -------------------------------------------------------------------------
>
>                 Key: MATH-1031
>                 URL: https://issues.apache.org/jira/browse/MATH-1031
>             Project: Commons Math
>          Issue Type: Improvement
>    Affects Versions: 3.2
>            Reporter: Thorsten Schäfer
>            Priority: Minor
>         Attachments: centroid.patch
>
>
> Users might be interested in assessing the quality of each cluster in the calculated
> clustering. This can be done by calculating its variance.
> The variance calculation is already performed in other places (e.g. for the MultiKMeans),
> but is not available to end users.
> I'd propose adding the functionality to the CentroidCluster. The one issue to consider
> is that the cluster does not know which distance measure was used to calculate it. In the
> implementation, I chose to parametrize the method with a distance measure, which also lets
> users compare the quality under various distance measures. Alternatively, it would be
> possible to add the distance measure as a field, which is set by the clustering algorithm.
> In the patch I went for the first approach and also changed the two other places where the
> variance calculation is performed to use the new feature.
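
To illustrate the first approach described above (passing the distance measure as a method parameter), here is a minimal, self-contained sketch. It is not the code from centroid.patch: the class name ClusterVarianceSketch, the nested DistanceMeasure interface and the variance method are hypothetical stand-ins, and the sketch assumes that "variance" means the bias-corrected sample variance of the point-to-centroid distances.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are taken from the actual patch.
final class ClusterVarianceSketch {

    /** Stand-in for a distance measure between two points. */
    interface DistanceMeasure {
        double compute(double[] a, double[] b);
    }

    static final DistanceMeasure EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    static final DistanceMeasure MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /**
     * Assumed definition: sample variance (divisor n - 1) of the distances
     * between each point and the cluster center, under the given measure.
     * Returns 0 for clusters with fewer than two points.
     */
    static double variance(double[] center, List<double[]> points, DistanceMeasure measure) {
        int n = points.size();
        if (n < 2) {
            return 0.0;
        }
        double[] distances = new double[n];
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            distances[i] = measure.compute(points.get(i), center);
            mean += distances[i];
        }
        mean /= n;
        double sumSq = 0.0;
        for (double d : distances) {
            sumSq += (d - mean) * (d - mean);
        }
        return sumSq / (n - 1);
    }

    public static void main(String[] args) {
        double[] center = {0.0, 0.0};
        List<double[]> points = Arrays.asList(new double[] {1.0, 1.0}, new double[] {0.0, 0.0});
        // The same cluster can be assessed under different distance measures.
        System.out.println("Euclidean variance: " + variance(center, points, EUCLIDEAN));
        System.out.println("Manhattan variance: " + variance(center, points, MANHATTAN));
    }
}
```

Because the measure is an argument rather than a field, the cluster itself stays independent of the algorithm that produced it. The cost is that callers must supply the measure used during clustering themselves if they want the variance to match it.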

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
