commons-issues mailing list archives

From "Nate Paymer (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MATH-546) Truncation issue in KMeansPlusPlusClusterer
Date Sun, 13 Mar 2011 13:14:59 GMT

     [ https://issues.apache.org/jira/browse/MATH-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate Paymer updated MATH-546:
-----------------------------

    Attachment: MATH-546.txt

I've attached a patch that fixes this bug.

This is my first contribution to this project, so apologies if I've screwed something up :)

> Truncation issue in KMeansPlusPlusClusterer
> -------------------------------------------
>
>                 Key: MATH-546
>                 URL: https://issues.apache.org/jira/browse/MATH-546
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.0
>            Reporter: Nate Paymer
>            Priority: Minor
>              Labels: cluster
>         Attachments: MATH-546.txt
>
>
> The for loop inside KMeansPlusPlusClusterer.chooseInitialClusters defines a variable
>   int sum = 0;
> This variable should have type double, rather than int.  Using an int causes the method
> to truncate the distances between points to (square roots of) integers.  It's especially bad
> when the distances between points are typically less than 1.
> As an aside, in version 2.2, this bug manifested itself by making the clusterer return
> empty clusters.  I wonder if the EmptyClusterStrategy would still be necessary if this bug
> were fixed.
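
To illustrate the truncation described above, here is a minimal, self-contained sketch. It is not the actual Commons Math source: the class and method names (TruncationDemo, cumulativeWithInt, cumulativeWithDouble) and the sample distances are hypothetical, chosen only to show the effect of the accumulator's type. It assumes the loop accumulates squared distances with a compound assignment, which is what lets an int accumulator compile at all, since Java's `+=` inserts an implicit narrowing cast.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of Commons Math.
final class TruncationDemo {

    /** Cumulative sums of squared distances, accumulated in an int (the reported pattern). */
    static double[] cumulativeWithInt(double[] distances) {
        double[] dx2 = new double[distances.length];
        int sum = 0;
        for (int i = 0; i < distances.length; i++) {
            // Compound assignment hides a cast: sum = (int) (sum + d * d),
            // so any fractional part is discarded on every iteration.
            sum += distances[i] * distances[i];
            dx2[i] = sum;
        }
        return dx2;
    }

    /** Same loop with the accumulator declared as double (the proposed fix). */
    static double[] cumulativeWithDouble(double[] distances) {
        double[] dx2 = new double[distances.length];
        double sum = 0;
        for (int i = 0; i < distances.length; i++) {
            sum += distances[i] * distances[i];
            dx2[i] = sum;
        }
        return dx2;
    }

    public static void main(String[] args) {
        // Sample distances, all less than 1 (illustrative values only).
        double[] distances = {0.5, 0.25, 0.75};
        System.out.println(Arrays.toString(cumulativeWithInt(distances)));    // [0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(cumulativeWithDouble(distances))); // [0.25, 0.3125, 0.875]
    }
}
```

With the int accumulator every cumulative sum stays at 0 when the distances are below 1, so a selection step that weights points by these sums can no longer tell them apart. That would be consistent with the empty clusters seen in 2.2, though this sketch does not reproduce the clusterer itself.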

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
