Date: Sun, 10 Apr 2011 22:17:05 +0000 (UTC)
From: "Hudson (JIRA)"
To: dev@mahout.apache.org
Reply-To: dev@mahout.apache.org
Message-ID: <1823529545.48126.1302473825720.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1059629581.15765.1299975899375.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAHOUT-626) T1 and T2 Values in Canopy (& MeanShift)

    [ https://issues.apache.org/jira/browse/MAHOUT-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018164#comment-13018164 ]

Hudson commented on MAHOUT-626:
-------------------------------

Integrated in Mahout-Quality #737 (See [https://hudson.apache.org/hudson/job/Mahout-Quality/737/])
    MAHOUT-626: Added optional T3/T4 arguments to Canopy. Added new unit test. All tests run

> T1 and T2 Values in Canopy (& MeanShift)
> ----------------------------------------
>
>                 Key: MAHOUT-626
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-626
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering
>    Affects Versions: 0.5
>            Reporter: Jeff Eastman
>            Assignee: Jeff Eastman
>         Attachments: CanopyT3T4.patch
>
>
> Users are reporting that the T1 and T2 threshold values which work in sequential mode don't work as well in the mapreduce mode because both the mapper and reducer are using the same values. The effect of coalescing a number of points into a single centroid done by the mapper changes the distances enough that independent threshold values are needed in the reducer.
> Here is a patch which implements optional T3 and T4 threshold values which are only used by the canopy reducer. Convenience methods have been added for API compatibility and defaults included so that these values will default to T1 and T2. A new unit test confirms the thresholds are being set correctly.
> If this works out as a positive improvement, I will make the same changes to MeanShift and commit them

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
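
For readers of the archive, the idea in the issue description (T1/T2 used in the map phase, optional T3/T4 used only in the reduce phase and defaulting to T1/T2) can be sketched roughly as follows. This is an illustrative simplification, not the code from CanopyT3T4.patch: the class `CanopySketch`, the `Thresholds` holder, and the methods `canopies` and `mapReduce` are all hypothetical names, the distance measure is assumed to be Euclidean, and the "mapper" and "reducer" are plain method calls rather than Hadoop tasks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of per-phase canopy thresholds; not Mahout's API. */
class CanopySketch {

    /** T1/T2 apply to the map phase, T3/T4 to the reduce phase. */
    static final class Thresholds {
        final double t1, t2, t3, t4;

        Thresholds(double t1, double t2, double t3, double t4) {
            this.t1 = t1;
            this.t2 = t2;
            this.t3 = t3;
            this.t4 = t4;
        }

        /** Convenience form: T3 and T4 default to T1 and T2. */
        Thresholds(double t1, double t2) {
            this(t1, t2, t1, t2);
        }
    }

    /** Euclidean distance (an assumption; any distance measure would do). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Simplified canopy pass. Points closer than `loose` to the seed
     * contribute to the canopy centroid; points closer than `tight`
     * are removed from further consideration.
     */
    static List<double[]> canopies(List<double[]> points, double loose, double tight) {
        List<double[]> remaining = new ArrayList<>(points);
        List<double[]> centroids = new ArrayList<>();
        while (!remaining.isEmpty()) {
            double[] seed = remaining.remove(0);
            double[] sum = seed.clone();
            int count = 1;
            Iterator<double[]> it = remaining.iterator();
            while (it.hasNext()) {
                double[] p = it.next();
                double d = distance(seed, p);
                if (d < loose) {
                    for (int i = 0; i < sum.length; i++) {
                        sum[i] += p[i];
                    }
                    count++;
                }
                if (d < tight) {
                    it.remove();
                }
            }
            for (int i = 0; i < sum.length; i++) {
                sum[i] /= count;
            }
            centroids.add(sum);
        }
        return centroids;
    }

    /** Each split is clustered with T1/T2; the merged centroids with T3/T4. */
    static List<double[]> mapReduce(List<List<double[]>> splits, Thresholds t) {
        List<double[]> mapped = new ArrayList<>();
        for (List<double[]> split : splits) {
            mapped.addAll(canopies(split, t.t1, t.t2)); // "mapper"
        }
        return canopies(mapped, t.t3, t.t4);            // "reducer"
    }
}
```

Because the reducer sees centroids rather than raw points, the distances it compares differ from those seen in the map phase, which is why the sketch lets the two phases take different thresholds while keeping the two-argument form for existing callers.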