commons-issues mailing list archives

From "Gilles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-1371) Provide accelerated kmeans++ implementation
Date Mon, 30 May 2016 23:57:12 GMT

    [ https://issues.apache.org/jira/browse/MATH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307022#comment-15307022 ]

Gilles commented on MATH-1371:
------------------------------

I imagine that we may have to modify the project's "pom.xml" file in order to set up
this feature.
You are welcome to ask on the mailing list, and to open a JIRA report to keep track of the request.

> Provide accelerated kmeans++ implementation
> -------------------------------------------
>
>                 Key: MATH-1371
>                 URL: https://issues.apache.org/jira/browse/MATH-1371
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Artem Barger
>            Assignee: Artem Barger
>         Attachments: ElkanKmeansPlusPlusClusterer.java
>
>
> An updated version of the kmeans++ algorithm is available, published in: Elkan,
> Charles. "Using the triangle inequality to accelerate k-means." ICML. Vol. 3. 2003.
> The main idea is to speed up the k-means iterations by avoiding the computation of
> distances between centers and points when it is not needed. For example, if after an
> update a cluster center has not moved far relative to a point, that point's assignment
> does not change. The accelerated algorithm avoids unnecessary distance calculations by
> applying the triangle inequality in two different ways, and by keeping track of lower
> and upper bounds for the distances between points and centers.
> The algorithm is described in the paper.
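
To make the pruning idea in the description concrete, here is a minimal sketch of one of
the two uses of the triangle inequality: if d(c, c') >= 2 d(x, c), then d(x, c') >= d(x, c),
so the distance from point x to center c' need not be computed. This is not the attached
ElkanKmeansPlusPlusClusterer.java; the class name TriangleInequalityDemo, its methods and
the sample data are hypothetical, and the bookkeeping of lower and upper bounds across
iterations is left out.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the attached ElkanKmeansPlusPlusClusterer. */
final class TriangleInequalityDemo {

    /** Number of point-to-center distances computed by the last call to assign(). */
    static int distanceEvaluations;

    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Assigns each point to its nearest center. A candidate center c is skipped
     * whenever d(best, c) >= 2 * d(x, best), because the triangle inequality then
     * guarantees d(x, c) >= d(x, best).
     */
    static int[] assign(double[][] points, double[][] centers) {
        int k = centers.length;

        // Center-to-center distances, computed once for all points.
        double[][] cc = new double[k][k];
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                cc[i][j] = cc[j][i] = distance(centers[i], centers[j]);
            }
        }

        distanceEvaluations = 0;
        int[] assignment = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            int best = 0;
            double bestDist = distance(points[p], centers[0]);
            distanceEvaluations++;
            for (int c = 1; c < k; c++) {
                if (cc[best][c] >= 2 * bestDist) {
                    continue; // c cannot be strictly closer than the current best
                }
                double d = distance(points[p], centers[c]);
                distanceEvaluations++;
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            assignment[p] = best;
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 0}, {0, 10}};
        double[][] points = {{1, 0}, {9, 1}, {0, 8}, {0.5, 0.5}};
        int[] assignment = assign(points, centers);
        System.out.println(Arrays.toString(assignment) + " using " + distanceEvaluations
                + " distance evaluations instead of " + points.length * centers.length);
    }
}
```

The result is the same assignment a brute-force search would give; only the number of
distance evaluations changes. The saving grows when centers are well separated relative
to the spread of the points around them.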



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
