From: Ted Dunning
Date: Thu, 12 Aug 2010 15:34:09 -0700
Subject: Re: MeanShift Clustering Patch
To: dev@mahout.apache.org

This is a great thing in general, and we were just discussing how the
clustering and classification APIs need to be made more coherent.

One thing that I particularly want at the end of that exercise is for
clusters and classification models to be unified. It should not matter
(much) where a model came from; you should be able to classify new
examples using it. You should also be able to save and restore the model
in a pretty uniform way. This also implies that we need a consistent way
to represent the examples to be classified.

What you are talking about so far is making the construction of
clustering models more consistent, which is really, really good and
important, but it needs to sit in the larger context of making clustering
and classification coherent as well.

What thoughts do you have on larger-scale design issues? What would you
like to see? Can you share some user stories about how you would like to
use clustering?

On Thu, Aug 12, 2010 at 3:08 PM, Chris Wailes wrote:

> Lastly, I made an API change so that the MeanShiftClusterer behaved in a
> more OO fashion. Now, instead of having a static method
> MeanShiftClusterer.clusterPoints() that then creates a MeanShiftClusterer
> object, there is an instance method called cluster(). It uses the same
> code, but makes re-use a lot easier if you want to cluster several groups
> of points using the same parameters.
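
To make the quoted API change concrete, here is a minimal sketch of the
two calling styles. Everything in it is an assumption made for
illustration: SketchClusterer, its mergeDistance parameter, and the greedy
one-dimensional grouping are hypothetical stand-ins, not Mahout's
MeanShiftClusterer or its real signatures, and the grouping is not mean
shift. Only the shape of the API (a static clusterPoints() versus an
instance cluster()) comes from the message above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a clusterer; not Mahout's MeanShiftClusterer. */
final class SketchClusterer {
    // Assumed parameter: points within this distance of a cluster's
    // first member join that cluster.
    private final double mergeDistance;

    SketchClusterer(double mergeDistance) {
        this.mergeDistance = mergeDistance;
    }

    /**
     * Instance style: the parameters live in the object, so the same
     * clusterer can be applied to several groups of points.
     * The grouping is a greedy 1-D placeholder, not real mean shift.
     */
    List<List<Double>> cluster(List<Double> points) {
        List<List<Double>> clusters = new ArrayList<>();
        for (double p : points) {
            List<Double> home = null;
            for (List<Double> c : clusters) {
                if (Math.abs(c.get(0) - p) <= mergeDistance) {
                    home = c;
                    break;
                }
            }
            if (home == null) {
                home = new ArrayList<>();
                clusters.add(home);
            }
            home.add(p);
        }
        return clusters;
    }

    /**
     * Static style: every call rebuilds a throwaway instance, so the
     * parameters must be passed again each time.
     */
    static List<List<Double>> clusterPoints(List<Double> points, double mergeDistance) {
        return new SketchClusterer(mergeDistance).cluster(points);
    }
}
```

With the instance style, one configured object serves several data sets,
for example `SketchClusterer c = new SketchClusterer(1.0);` followed by
`c.cluster(groupA)` and `c.cluster(groupB)`. With the static style the
caller has to thread the same parameters through every call.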