From: Ted Dunning
Date: Tue, 6 Sep 2011 19:35:24 +0000
Subject: Re: how to run PCA from Mahout
To: user@mahout.apache.org

Another option is to invent another kind of matrix that knows about an offset. Then a special method for times may give the right performance.

A third option is to do a little algebra on the PCA algorithm to propagate the mean offset into the stochastic projection algorithm.

On Tue, Sep 6, 2011 at 7:24 PM, Ted Dunning wrote:

> Sure.
>
> Do the subtraction after the B = Q'A step in the random projection!
>
> On Tue, Sep 6, 2011 at 7:16 PM, Dmitriy Lyubimov wrote:
>
>> On Tue, Sep 6, 2011 at 12:11 PM, Ted Dunning wrote:
>> > Note that normally subtracting anything fills in sparse matrices.
>>
>> is there a way to cope with this without changing SVD contracts?
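For illustration only, here is a minimal sketch of the "matrix that knows about an offset" idea. It is plain Python rather than Mahout code, and the names OffsetSparseMatrix, times, transpose_times, column_means and dense_centered are all hypothetical, not part of any Mahout API. It assumes the offset is the vector of column means, so the centered matrix is C = A - 1 m', and it relies on the identities C x = A x - (m . x) 1 and C' y = A' y - (sum of y) m, which let the products touch only the nonzeros of A.

```python
# Hypothetical sketch (not Mahout's API): a sparse matrix that carries a
# column-mean offset implicitly, so A - 1*m' is never materialized.


class OffsetSparseMatrix:
    """Represents C = A - 1 * mean', storing only sparse A and a dense mean."""

    def __init__(self, rows, n_cols, mean):
        self.rows = rows      # list of {column_index: value} dicts, one per row
        self.n_cols = n_cols
        self.mean = mean      # one entry per column

    def times(self, x):
        """C x = A x - (mean . x) * 1, touching only the nonzeros of A."""
        shift = sum(m * xj for m, xj in zip(self.mean, x))
        return [sum(v * x[j] for j, v in row.items()) - shift
                for row in self.rows]

    def transpose_times(self, y):
        """C' y = A' y - (sum of y) * mean."""
        out = [0.0] * self.n_cols
        for row, yi in zip(self.rows, y):
            for j, v in row.items():
                out[j] += v * yi
        total = sum(y)
        return [o - total * m for o, m in zip(out, self.mean)]


def column_means(rows, n_cols):
    """Column means of the sparse matrix, counting implicit zeros."""
    sums = [0.0] * n_cols
    for row in rows:
        for j, v in row.items():
            sums[j] += v
    return [s / len(rows) for s in sums]


def dense_centered(rows, n_cols, mean):
    """Reference only: the filled-in centered matrix the sketch avoids."""
    return [[row.get(j, 0.0) - mean[j] for j in range(n_cols)]
            for row in rows]


if __name__ == "__main__":
    rows = [{0: 2.0}, {2: 4.0}, {0: 4.0, 1: 3.0}]
    mean = column_means(rows, 3)
    c = OffsetSparseMatrix(rows, 3, mean)
    print(c.times([1.0, 2.0, 3.0]))
```

The same algebra applied to the projection step gives Q'(A - 1 m') = Q'A - (Q'1) m', which is one way to read the suggestion of subtracting after B = Q'A: the correction is a rank-one update on the small matrix B, so the sparse A is never filled in.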