From: "Luc Maisonobe (JIRA)"
To: issues@commons.apache.org
Date: Fri, 22 Oct 2010 12:49:17 -0400 (EDT)
Subject: [jira] Commented: (MATH-429) KMeansPlusPlusClusterer breaks by division by zero

    [ https://issues.apache.org/jira/browse/MATH-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923906#action_12923906 ]

Luc Maisonobe commented on MATH-429:
------------------------------------

You have encountered a classical problem with k-means: at some stage (here at the first iteration), one of the clusters becomes empty. This case is currently not handled by commons-math (which is a bug, so we have to fix it).

When a cluster is empty, a new centroid must be defined from the other clusters. There are different strategies:

 1. take the point farthest from any cluster
 2. select a random point from the cluster with the largest distance variance
 3. select a random point from the cluster with the largest number of points

My preferred choice would be 2. What do other people think?

> KMeansPlusPlusClusterer breaks by division by zero
> --------------------------------------------------
>
>                 Key: MATH-429
>                 URL: https://issues.apache.org/jira/browse/MATH-429
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 2.1
>         Environment: Java, Windows
>            Reporter: Erik van Ingen
>            Priority: Blocker
>         Attachments: KMeansPlusPlusClustererTest.java
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> For a certain space, KMeansPlusPlusClusterer breaks. This is a blocker because this space occurs in our domain.
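To make the second strategy from the comment concrete, here is a rough sketch of how a replacement centroid could be picked. It is not the commons-math implementation: the class EmptyClusterRepair, its method names, and the use of plain double[] points are all hypothetical, chosen only for illustration. It also assumes that the chosen point is removed from the donor cluster, which the comment does not specify.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of commons-math. */
final class EmptyClusterRepair {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Component-wise mean of a non-empty list of points. */
    static double[] centroid(List<double[]> points) {
        double[] c = new double[points.get(0).length];
        for (double[] p : points) {
            for (int i = 0; i < c.length; i++) {
                c[i] += p[i];
            }
        }
        for (int i = 0; i < c.length; i++) {
            c[i] /= points.size();
        }
        return c;
    }

    /** Variance of the distances from each point to the cluster centroid. */
    static double distanceVariance(List<double[]> points) {
        double[] c = centroid(points);
        double mean = 0.0;
        for (double[] p : points) {
            mean += distance(p, c);
        }
        mean /= points.size();
        double var = 0.0;
        for (double[] p : points) {
            double d = distance(p, c) - mean;
            var += d * d;
        }
        return var / points.size();
    }

    /**
     * Strategy 2: take a random point from the non-empty cluster with the
     * largest distance variance. Assumption: the point is removed from the
     * donor cluster, so the cluster lists must be mutable.
     */
    static double[] replacementCenter(List<List<double[]>> clusters, Random rng) {
        List<double[]> donor = null;
        double best = Double.NEGATIVE_INFINITY;
        for (List<double[]> cluster : clusters) {
            if (cluster.isEmpty()) {
                continue; // empty clusters have no centroid and no variance
            }
            double v = distanceVariance(cluster);
            if (v > best) {
                best = v;
                donor = cluster;
            }
        }
        if (donor == null) {
            throw new IllegalStateException("all clusters are empty");
        }
        return donor.remove(rng.nextInt(donor.size()));
    }
}
```

In a clustering loop, a helper like this would be called whenever a cluster ends up with no points after the assignment step, instead of computing the mean of zero points. Skipping empty clusters inside replacementCenter keeps the helper itself from dividing by zero.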