From: "Dinesh B Vadhia"
Reply-To: user@mahout.apache.org
Subject: Re: Matrix-based recommendation analysis
Date: Thu, 25 Nov 2010 02:12:29 -0800

Hello! I have looked at the presentation and am trying to get my head around it:

i. Is collaborative filtering being bypassed?
ii. Are new entries (observations) added dynamically or as a batch process?


From: Ted Dunning
Sent: Tuesday, November 23, 2010 8:10 AM
To: user@mahout.apache.org
Subject: Re: Matrix-based recommendation analysis

For cross recommender comprehension, I recommend something like the example
in my slide show. In that example, users issued query terms (giving the u x q
matrix B) and they watched videos (giving the u x v matrix A). The cross
recommendation is a smoothed version of A' B (whose result is v x q).

This matrix could be used to take query terms and recommend videos. That is
(A' B) q = v. With suitable cleanup of A' B to suppress spurious entries,
this makes a workable search engine.

Bringing a concept like days of the week into the mix is a bit confusing.
That could give you a smoothed popularity of content per day of the week,
but that is normally done by much simpler means. The biggest difference is
that you can't pick three days of the week, but you can put three terms into
a query.

On Tue, Nov 23, 2010 at 12:15 AM, Lance Norskog wrote:

> I'm not trying to distinguish them. That is the "find the Netflix user"
> paper :)
>
> I just want to understand the cross-recommender concept, that's all.
> Yes, this sample is too small to impute "enthusiasm"; the numbers are
> recommendation values.
>
> (If the rest of you want to follow along:
> http://www.slideshare.net/tdunning/intelligent-search , slides 35-36)
>
> On Mon, Nov 22, 2010 at 11:50 PM, Sean Owen wrote:
> > (PS I don't think that link from Ted is publicly visible but try
> > http://www.slideshare.net/tdunning )
> >
> > Maybe I'm walking into half of another conversation but what's the
> > question or goal here?
> >
> > I don't think the matrix product contains quite what you're saying.
> > For example U1 records only 2 ratings but has some "enthusiasm" on 3
> > separate days in the matrix product. The product is mashing together
> > item-day associations from all users and applying them to each user.
> >
> > Conceptually user-item-day is the 3-dimensional matrix that it sounds
> > like, if you want to distinguish associations from different users to
> > different items on different days.
> >
> >
> > On Tue, Nov 23, 2010 at 7:24 AM, Lance Norskog wrote:
> >> The GroupLens dataset has User, Item, Rating and Timestamp.
> >> We will use the rating of 1-5 as-is, but will reduce the timestamp
> >> field to day of the week.
> >> The lack of a rating defaults to 3 (neutral). There are 5 ratings
> >> total in the sample:
> >>
> >> U1, I1, 2, ?
> >> U1, I3, 4, ?
> >> U2, I1, 4, ?
> >> U2, I2, 5, T
> >> U2, I3, 3, ?
> >>
> >> (We'll get to the question marks later.)
> >> Now, make two matrices, User vs. Item and Item vs. Day of the Week.
> >> User vs. Item contains ratings, and Item vs. Day of the Week
> >> contains the number of rating records for that item on that day of the
> >> week: ratings only cover Sunday, Monday and Tuesday.
> >>
> >> Formatting tables in kerned fonts just plain doesn't work, thus the
> >> alternate format.
> >>
> >> 2 Users vs. 3 Items:
> >> I1,I2,I3
> >> {
> >> U1 {2,3,4}
> >> U2 {4,5,3}
> >> }
> >>
> >> 3 Items vs. 7 Days of the Week
> >> S,M,T,W,T,F,S
> >> {
> >> I1 {1,0,1,0,0,0,0}
> >> I2 {0,0,1,0,0,0,0}
> >> I3 {0,1,1,0,0,0,0}
> >> }
> >>
> >> Now, multiply these two matrices. The product is 2 Users vs. 7 Days
> >> of the Week:
> >> S,M,T,W,T,F,S
> >> {
> >> U1 {2,4,9,0,0,0,0}
> >> U2 {4,3,12,0,0,0,0}
> >> }
> >>
> >> This matrix carries the total amount of enthusiasm for each user on
> >> each day. To get the average enthusiasm of each user, divide each row
> >> by the total number of ratings per day:
> >> S,M,T,W,T,F,S
> >> {
> >> U1 {2,4,3,0,0,0,0}
> >> U2 {4,3,4,0,0,0,0}
> >> }
> >>
> >> Did I get this right, Ted?
> >>
> >> BTW, where are your slides for this topic? I've seen them a couple of
> >> times in presentations (live and on Fora.tv), but can't find them.
> >>
> >> --
> >> Lance Norskog
> >> lance.norskog@gmail.com
> >>
> >
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
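
For anyone following along, here is a minimal sketch in plain Python that
checks the arithmetic quoted above and illustrates the A' B idea. It is not
from the thread or from Mahout. The first half uses Lance's two matrices
as given. The second half uses small made-up matrices A (users x videos)
and B (users x query terms), which are purely hypothetical, and it skips the
smoothing and cleanup of A' B that Ted mentions. The helper names (matmul,
matvec, transpose) are my own.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    """Product of a matrix with a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# --- Part 1: Lance's example, matrices copied from the quoted message ---

# Users x Items ratings (a missing rating defaults to 3).
user_item = [
    [2, 3, 4],  # U1
    [4, 5, 3],  # U2
]

# Items x Days of the week (S,M,T,W,T,F,S): number of rating records.
item_day = [
    [1, 0, 1, 0, 0, 0, 0],  # I1
    [0, 0, 1, 0, 0, 0, 0],  # I2
    [0, 1, 1, 0, 0, 0, 0],  # I3
]

# Users x Days: total "enthusiasm" per user per day.
user_day = matmul(user_item, item_day)
print(user_day)  # [[2, 4, 9, 0, 0, 0, 0], [4, 3, 12, 0, 0, 0, 0]]

# Number of rating records per day = column sums of item_day.
ratings_per_day = [sum(col) for col in zip(*item_day)]

# Average per day; days with no ratings are left at 0.0.
avg = [[x / n if n else 0.0 for x, n in zip(row, ratings_per_day)]
       for row in user_day]
print(avg)

# --- Part 2: cross recommendation A' B on hypothetical toy data ---

# A: users x videos (1 = user watched the video). Made-up values.
A = [
    [1, 0],
    [1, 1],
    [0, 1],
]

# B: users x query terms (1 = user issued the term). Made-up values.
B = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

# A' B is videos x query terms: co-occurrence of videos and terms
# across users. No smoothing or cleanup is applied here.
cross = matmul(transpose(A), B)
print(cross)  # [[2, 1, 0], [1, 2, 1]]

# A query containing terms 1 and 2 scores each video: (A' B) q = v.
q = [0, 1, 1]
video_scores = matvec(cross, q)
print(video_scores)  # [1, 3]
```

The first half reproduces the two product matrices in the quoted message.
The second half shows why a query fits the cross-recommender shape better
than a day of the week does: q can carry several terms at once, and each
contributes to the video scores.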