From: Ted Dunning
Date: Fri, 10 Jul 2009 15:56:28 -0700
Subject: Re: Memory and Speed Questions for Item-Based-Recommender
To: mahout-user@lucene.apache.org

Yes.

One gotcha is that you generally have to limit document size a bit to get good performance. This is not a big deal because document normalization makes it hard for these documents to be retrieved in any case. Also, these are typically not good second order recommendations.

First order recommendations are the top-40 kinds of things and make poor recommendations for a bunch of reasons. Second order recommendations are those that are based on your history. They make much better recommendations.

On Fri, Jul 10, 2009 at 3:50 PM, Jason Rutherglen <jason.rutherglen@gmail.com> wrote:

> Interesting. So we're creating the item-item matrix using one of the Mahout
> algorithms (like Taste?), then dumping it into Lucene.
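
A minimal Python sketch of the idea discussed in this message. It is not Mahout or Lucene code. It assumes each item becomes one "document" listing the items it most often co-occurs with, that "limiting document size" means capping that list, and that a plain overlap count can stand in for Lucene's scoring and document normalization. The names build_item_documents, first_order, second_order and max_doc_size are hypothetical and do not come from the thread.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_item_documents(histories, max_doc_size):
    """Build one capped 'document' per item from user histories.

    histories: dict mapping user id -> list of item ids (assumed input shape).
    max_doc_size: hypothetical cap on how many related items a document keeps.
    """
    cooc = defaultdict(Counter)
    for items in histories.values():
        # Count each unordered pair of distinct items once per user.
        for a, b in combinations(sorted(set(items)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    docs = {}
    for item, counts in cooc.items():
        # Keep the strongest co-occurrences; break ties by item id.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        docs[item] = [other for other, _ in ranked[:max_doc_size]]
    return docs


def first_order(histories, n):
    """Popularity ranking: the same 'top-N' list for every user."""
    popularity = Counter()
    for items in histories.values():
        popularity.update(set(items))
    ranked = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


def second_order(docs, history):
    """Rank unseen items by how many history items appear in their document.

    This overlap count is a stand-in for a search-engine query that uses the
    user's history as the query terms.
    """
    seen = set(history)
    scores = {}
    for item, related in docs.items():
        if item in seen:
            continue
        overlap = len(seen.intersection(related))
        if overlap > 0:
            scores[item] = overlap
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked]


if __name__ == "__main__":
    histories = {
        "u1": ["a", "b", "c"],
        "u2": ["a", "b"],
        "u3": ["b", "c", "d"],
        "u4": ["a", "d"],
    }
    docs = build_item_documents(histories, max_doc_size=2)
    print("popular:", first_order(histories, n=2))
    print("for a user who has seen 'c':", second_order(docs, ["c"]))
```

The popularity list is the same for every user, while the history-based list changes with the query, which is the distinction drawn above between first order and second order recommendations.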