From: Ted Dunning
Date: Thu, 14 Jan 2010 22:26:17 -0800
Subject: Re: [Slightly OT] SVD etc on Map/Reduce?
To: janert@ieee.org
Cc: mahout-user@lucene.apache.org

On Thu, Jan 14, 2010 at 10:09 PM, Philipp K. Janert wrote:

> > If you mean matrix factorization, take a look at this:
> >
> > http://arxiv.org/abs/0909.4061v1
>
> That seems to support my earlier hunch that
> efficient implementations of such factorizations
> on M/R would likely be approximate only or
> partial (ie yielding the largest of the eigenvalues,
> not necessarily the entire spectrum).

For very large sparse problems, approximate decompositions are generally
preferred. Due to limited accuracy in the input, only the first several
eigenvectors can be extracted at all. Moreover, many important problems
have very large apparent dimensionality, but limited actual rank.

Neither of these characteristics has anything to do with map-reduce in
the slightest. They are properties of the problems themselves.

> > It is the common sense of those on this mailing list that these kinds of
> > algorithms could be done using map-reduce.
>
> I am not sure what you are trying to tell me here.

I am trying to say that we don't yet have working implementations. There
should be a k-means implementation that uses these techniques before long.
You would be very welcome to try your hand at some of the other
algorithms, and I am sure that you would have quite a lot of support
from the mailing list.

Your comments puzzle me, though. Do you have an application in mind? Was
there something you were particularly looking for?

--
Ted Dunning, CTO
DeepDyve
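
To make the point about partial decompositions concrete, here is a minimal sketch of extracting only the largest singular value of a sparse matrix with passes that have a map-reduce shape. It is plain power iteration on A^T A, not the randomized projection scheme from the arXiv paper linked above, and it is not Mahout code. The names `map_row`, `reduce_sum`, and `top_singular_value`, and the representation of each row as a dict from column index to value, are assumptions made for illustration only.

```python
import math
from functools import reduce


def map_row(row, x):
    """Map step: sparse row a_i contributes (a_i . x) * a_i to A^T A x."""
    dot = sum(v * x[j] for j, v in row.items())
    return {j: v * dot for j, v in row.items()}


def reduce_sum(acc, partial):
    """Reduce step: add one sparse partial vector into the accumulator."""
    for j, v in partial.items():
        acc[j] = acc.get(j, 0.0) + v
    return acc


def top_singular_value(rows, n_cols, iterations=100):
    """Estimate the largest singular value and right singular vector of A.

    rows is a list of sparse rows (dict: column index -> value).
    Each iteration is one map over the rows followed by one sum reduce.
    """
    # Uniform unit-length start vector. Assumption: it is not orthogonal
    # to the leading singular vector; a random start would be safer.
    x = [1.0 / math.sqrt(n_cols)] * n_cols
    sigma = 0.0
    for _ in range(iterations):
        y = reduce(reduce_sum, (map_row(r, x) for r in rows), {})
        norm = math.sqrt(sum(v * v for v in y.values()))
        if norm == 0.0:
            return 0.0, x
        # For unit x, ||A^T A x|| approaches sigma_1 squared.
        sigma = math.sqrt(norm)
        x = [y.get(j, 0.0) / norm for j in range(n_cols)]
    return sigma, x
```

Calling `top_singular_value(rows, n_cols)` returns the estimate together with the corresponding unit vector. Each row is processed independently in the map step and the partial results are only summed, which is what makes this kind of iteration a natural fit for map-reduce. Recovering more than the leading few components this way gets expensive quickly, which is one reason approximate, low-rank methods are attractive for large sparse inputs.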