commons-issues mailing list archives

From "Thomas Neidhart (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MATH-547) KMeansPlusPlusClusterer should not call equals()
Date Fri, 01 Apr 2011 14:59:05 GMT

     [ https://issues.apache.org/jira/browse/MATH-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart updated MATH-547:
---------------------------------

    Attachment: MATH-547.patch

Added a simple patch that no longer compares cluster centroids as the exit condition.
Instead, the assignment of each data point to a cluster is tracked in an int array, and
iteration stops once no assignment changes.
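
For readers without the attachment, here is a minimal sketch of the idea. It is not the
contents of MATH-547.patch and does not use the Commons Math API: the class
AssignmentTrackingKMeans and its methods are hypothetical, points are plain double[]
arrays, and the initial centers are passed in rather than chosen by k-means++ seeding.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the code in MATH-547.patch. */
final class AssignmentTrackingKMeans {

    /** Returns the index of the center closest to p (squared Euclidean distance). */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double d = 0.0;
            for (int i = 0; i < p.length; i++) {
                double diff = p[i] - centers[c][i];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** Reassigns every point and returns how many points changed cluster. */
    static int assign(double[][] points, double[][] centers, int[] assignments) {
        int changed = 0;
        for (int i = 0; i < points.length; i++) {
            int c = nearest(points[i], centers);
            if (c != assignments[i]) {
                assignments[i] = c;
                changed++;
            }
        }
        return changed;
    }

    /** Recomputes each center as the mean of the points assigned to it. */
    static void updateCenters(double[][] points, double[][] centers, int[] assignments) {
        int dim = points[0].length;
        double[][] sums = new double[centers.length][dim];
        int[] counts = new int[centers.length];
        for (int i = 0; i < points.length; i++) {
            int c = assignments[i];
            counts[c]++;
            for (int j = 0; j < dim; j++) {
                sums[c][j] += points[i][j];
            }
        }
        for (int c = 0; c < centers.length; c++) {
            if (counts[c] == 0) {
                continue; // assumption: an empty cluster keeps its previous center
            }
            for (int j = 0; j < dim; j++) {
                centers[c][j] = sums[c][j] / counts[c];
            }
        }
    }

    /** Iterates until no point changes cluster; centroids are never compared. */
    static int[] cluster(double[][] points, double[][] initialCenters, int maxIterations) {
        double[][] centers = new double[initialCenters.length][];
        for (int c = 0; c < centers.length; c++) {
            centers[c] = initialCenters[c].clone();
        }
        int[] assignments = new int[points.length];
        Arrays.fill(assignments, -1); // -1 means "not yet assigned"
        for (int iter = 0; iter < maxIterations; iter++) {
            if (assign(points, centers, assignments) == 0) {
                break; // exit condition: integer comparison only
            }
            updateCenters(points, centers, assignments);
        }
        return assignments;
    }
}
```

Because the exit test compares ints, it does not depend on floating-point rounding in the
centroid computation or on how the point class implements equals.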

> KMeansPlusPlusClusterer should not call equals()
> ------------------------------------------------
>
>                 Key: MATH-547
>                 URL: https://issues.apache.org/jira/browse/MATH-547
>             Project: Commons Math
>          Issue Type: Improvement
>    Affects Versions: 3.0
>            Reporter: Nate Paymer
>            Priority: Minor
>         Attachments: MATH-547.patch
>
>
> In determining whether the clusters have changed between iterations, the KMeansPlusPlusClusterer
> currently calls equals to determine whether the cluster centers have changed.  It would be
> better to avoid relying on equals by instead checking whether any points have moved between
> clusters.
> equals can be problematic because floating-point addition is not associative, so
> getCentroid may return slightly different values for the same set of points when they
> are summed in a different order.  Additionally, the client may choose not to override
> equals at all, since it's not clear that it's required.
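
To illustrate the rounding concern raised in the description, the sketch below averages
the same three values in two different orders. The class CentroidSumOrder is hypothetical
and stands in for a one-dimensional getCentroid; it is not Commons Math code.

```java
/** Hypothetical one-dimensional stand-in for a centroid computation. */
final class CentroidSumOrder {

    /** Mean of the values, accumulated left to right in array order. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        double forward = mean(new double[] {0.1, 0.2, 0.3});
        double reversed = mean(new double[] {0.3, 0.2, 0.1});
        // Same set of inputs, different summation order.
        System.out.println(forward == reversed);             // exact comparison
        System.out.println(Math.abs(forward - reversed));    // size of the discrepancy
    }
}
```

With IEEE 754 doubles the two means differ in the last bits, so an exact equals check on
centroids can report a change even though no point moved between clusters.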

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
