From: Grant Ingersoll
To: mahout-dev@lucene.apache.org
Subject: Re: Re : FYI Cloud Computing Resources
Date: Wed, 3 Sep 2008 13:54:41 -0400
Message-Id: <671F1CEB-919C-4E06-9833-23BAF35B55F7@apache.org>
References: <840344.23640.qm@web26306.mail.ukl.yahoo.com>

On Sep 3, 2008, at 4:34 AM, Sean Owen
wrote:
> Yeah it's almost over unfortunately. :) I tried this a while ago with
> a slope-one recommender, and was only just about able to match
> Netflix's current performance. I published some support code for
> people who wanted to play with it but removed it from Mahout's copy as
> legacy code.

Hmm, probably useful to keep the code around, even if it's just used
as a sample of how to do things w/ Taste. I imagine the Netflix data
will live on for quite some time.

>
> I didn't really have time to investigate more. Some of the insights
> that have fallen out from the competition are pretty great. For
> example: one person took advantage of a sort of "memory effect" for
> recommendations.... people tend to at times over-rate movies and at
> times under-rate movies. So if you kind of correct for this -- that a
> sequence of 5-star ratings may not be as meaningful as a 5-star rating
> in the middle of several 2-star ratings, you get much better
> performance.
>
> This nugget of knowledge may be specific to Netflix, not sure. But it
> was interesting.
>
> On Wed, Sep 3, 2008 at 9:28 AM, deneche abdelhakim
> wrote:
>> I came across the following competition
>>
>> http://www.netflixprize.com/index
>>
>>
>> It's about recommender systems, so I think it's Taste stuff. The
>> training dataset consists of more than 100M ratings.
>>
>>
>> ----- Original Message ----
>> From: Josh Myer
>> To: mahout-dev@lucene.apache.org
>> Sent: Wednesday, July 30, 2008, 18:19:25
>> Subject: Re: FYI Cloud Computing Resources
>>
>> On Wed, Jul 30, 2008 at 11:26:29AM -0400, Grant Ingersoll wrote:
>>> http://research.yahoo.com/node/2328
>>>
>>> It _MAY_ (stressed, emphasized, etc.) be possible for Mahouters (or
>>> are we just Mahouts?) to get some access to these resources. One
>>> big question is where can we get some fairly large data sets (large,
>>> but not super large, I think, but am not sure)
>>>
>>> If you have ideas, etc. please let us know.
>>>
>>
>> It's worth plugging (theinfo), http://theinfo.org/. It's a project to
>> collect references to datasets, and may help here. Unfortunately, it
>> seems to be laggy at the moment. I'll poke Aaron about that =)
>>
>> HtH,
>> --
>> Josh Myer
>> josh@joshisanerd.com
>>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ