From: Ted Dunning
Date: Tue, 1 Feb 2011 08:26:50 -0800
Subject: Re: Recommeding on Dynamic Content
To: user@mahout.apache.org
Reply-To: user@mahout.apache.org
Mailing-List: contact user-help@mahout.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=90e6ba5bc9d97edeb4049b3b02ea

--90e6ba5bc9d97edeb4049b3b02ea
Content-Type: text/plain; charset=UTF-8

Here is a pointer to the Menon and Elkan paper:

http://arxiv.org/abs/1006.2156

Also, see chapter 17 of Mahout in Action for a description of how you can
use the SGD classifiers already in Mahout for this kind of work. You lose
the very cool recommendations framework that Mahout has, but you gain the
ability to recommend in high churn situations.

On Tue, Feb 1, 2011 at 1:52 AM, Sean Owen wrote:

> One approach is to use user-user similarities. Those build up over time
> based on historical data, but can be used to produce recommendations for
> brand-new items going forward.
>
> It still has a cold-start problem; until anyone connects to one of those
> new items, it can't be recommended.
>
> Another approach is to use the item's characteristics to determine some
> notion of similarity, in the absence of clicks. That's what you're doing
> and it's a great approach.
>
> You can also consider hybrid approaches. You could try to mix
> recommendations based on two different approaches -- clicks-based and
> content-based. The problem is knowing how to mix things since the scores
> are not at all comparable.
>
> That Elkan / Menon paper has an elegant theoretical formulation of a
> recommender that uses both ratings and side info at the same time.
>
>
> On Mon, Jan 31, 2011 at 11:26 PM, Gökhan Çapan wrote:
>
> > Hi,
> >
> > I've made a search, sorry in case this is a double post.
> > Also, this question may not be directly related to Mahout.
> >
> > Within a domain which is entirely user generated and has a very big item
> > churn (lots of new items coming, while some others leaving the system),
> > what do you recommend to produce accurate recommendations using Mahout
> > (Not just Taste)?
> >
> > I mean, as a concrete example, in the eBay domain, not Amazon's.
> >
> > Currently I am creating item clusters using LSH with MinHash (I am not
> > sure if it is in Mahout, I can contribute if it is not), and produce
> > recommendations using these item clusters (profiles). When a new item
> > arrives, I find its nearest profile, and recommend the item where its
> > belonging profile is recommended to. Do you find this approach good
> > enough?
> >
> > If you have a theoretical idea, could you please point me to some related
> > papers?
> >
> > (As an MSc student, I can implement this as a Google Summer of Code
> > project, with your mentoring.)
> >
> > Thanks in advance
> >
> > --
> > Gokhan
> >
>
--90e6ba5bc9d97edeb4049b3b02ea--
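
For readers following the thread, here is a minimal sketch of the MinHash
"nearest profile" idea described in the quoted message. It is written in
Python for brevity and does not use any Mahout API. Everything in it is an
assumption made for illustration: the function names, the representation of
an item as a set of tokens, the choice of 64 hash functions built from
BLAKE2b with different salts, and the idea that a profile's signature is the
element-wise minimum of its members' signatures (which equals the MinHash
signature of the union of their token sets). The original poster's actual
clustering and LSH banding scheme is not shown in the thread.

```python
import hashlib

NUM_HASHES = 64  # signature length; an arbitrary choice for this sketch


def _hash(token, seed):
    """Deterministic 64-bit hash of a token; the seed selects the hash function."""
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(tokens, num_hashes=NUM_HASHES):
    """MinHash signature of a non-empty token set: one minimum per hash function."""
    tokens = set(tokens)
    if not tokens:
        raise ValueError("cannot compute a signature for an empty token set")
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def profile_signature(member_signatures):
    """Signature of a profile (hypothetical definition): element-wise minimum
    of member signatures, i.e. the signature of the union of their tokens."""
    return [min(column) for column in zip(*member_signatures)]


def nearest_profile(item_tokens, profiles):
    """Name of the profile whose signature best matches a new item.

    `profiles` maps a profile name to its MinHash signature.
    """
    sig = minhash_signature(item_tokens)
    return max(profiles, key=lambda name: estimated_jaccard(sig, profiles[name]))


# Example: two toy profiles and one brand-new item with no click history.
profiles = {
    "cameras": minhash_signature({"camera", "lens", "digital", "zoom"}),
    "books": minhash_signature({"book", "paperback", "novel", "author"}),
}
print(nearest_profile({"digital", "camera", "tripod"}, profiles))
```

This brute-force comparison against every profile is only reasonable for a
small number of profiles. With many profiles, the usual next step is to band
the signatures into LSH buckets so that only candidates sharing a bucket are
compared.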