From: Ted Dunning
Date: Fri, 12 Feb 2010 14:49:09 -0800
Subject: Re: Mahout Usage and Beyond
To: mahout-user@lucene.apache.org

On Fri, Feb 12, 2010 at 4:27 AM, Robin Anil wrote:

> 1. Locally Weighted Linear Regression

Not sure how important this one is.

> 2. Naive Bayes(We have this and CBayes as a bonus)
> 3. Gaussian Discriminative Analysis (GDA)

DP clustering does this, effectively, I think.

> 4. Logistic Regression (LR) (In development)

SGD. In dev as you say.

> 5. k-means(we have this and kmeans++ is in development)
> 6. Neural Network (NN)

SGD could implement this if we like. Not sure that we need M/R to get speed here.

> 7. Principal Components Analysis (PCA)

= SVD and Jake's contribution.

> 8. Independent Component Analysis (ICA)
> 9. Expectation Maximization (EM) (We have it in pig script and in couple
> of algorithms not generic yet)

DP clustering is a version of this for some applications.

> 10. Support Vector Machine (SVM)(In development - The pegasus version)

So I think that we are actually at about 7 or 8 / 10 with several interesting additions.

More than the original 10, we need realistic and simple examples.

-- 
Ted Dunning, CTO
DeepDyve
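
The reply to item 4 pairs logistic regression with SGD (stochastic gradient descent). As a rough illustration of that pairing, here is a minimal single-machine sketch in plain Python. It is not Mahout's implementation, and the function names, learning rate, epoch count, and toy data are all assumptions made up for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_sgd(rows, labels, epochs=200, rate=0.1, seed=0):
    """Fit weights (bias first) with one gradient step per training example.

    Hypothetical helper for illustration only; epochs and rate are
    arbitrary choices, not values taken from the thread.
    """
    rng = random.Random(seed)
    weights = [0.0] * (len(rows[0]) + 1)
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit examples in a fresh random order each epoch
        for i in order:
            x = [1.0] + list(rows[i])  # prepend constant 1.0 for the bias term
            p = sigmoid(sum(w * v for w, v in zip(weights, x)))
            err = labels[i] - p  # gradient of the log-likelihood w.r.t. the score
            for j, v in enumerate(x):
                weights[j] += rate * err * v
    return weights


def predict(weights, row):
    # Probability that the row belongs to class 1.
    x = [1.0] + list(row)
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


# Toy, linearly separable data (invented for this sketch).
rows = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
labels = [0, 0, 0, 1, 1, 1]
weights = train_logistic_sgd(rows, labels)
```

Calling `predict(weights, (1.0, 1.0))` should give a probability above 0.5 and `predict(weights, (0.0, 0.0))` one below it. Each update touches a single example, which is why this style of training can run fast on one machine without a batch pass over the data.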