mahout-user mailing list archives

From Ted Dunning
Subject Re: Taste-GenericItemBasedRecommender
Date Thu, 10 Sep 2009 01:12:35 GMT
Very close.  You are conceptually exactly correct.

If A contains binary visit or view data, then A'A contains counts that must
be reduced to binary values or weights using some statistical procedure.  I
prefer LLR and binary results.
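
As a rough illustration of the step above, here is a minimal Python sketch that builds the cooccurrence counts A'A from a binary A and then reduces each count to a binary indicator with a log-likelihood ratio (G^2) test on the 2x2 contingency table for each item pair. The function names and the cut-off of 3.84 are assumptions made for this example; nothing in this thread specifies a threshold.

```python
from math import log

def x_log_x(x):
    # x * ln(x), with the convention 0 * ln(0) = 0
    return x * log(x) if x > 0 else 0.0

def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts: N ln N - sum(k ln k)
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)

def llr(k11, k12, k21, k22):
    # Log-likelihood ratio (G^2) for a 2x2 contingency table:
    #   k11 = users with both items, k12 = item i only,
    #   k21 = item j only,           k22 = neither item
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))  # clamp tiny negative rounding errors

def cooccurrence(A):
    # A is a list of user rows with 0/1 entries; returns A'A (item x item counts)
    n_items = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n_items)]
            for i in range(n_items)]

def llr_indicators(A, threshold=3.84):
    # threshold is a hypothetical cut-off chosen for this sketch
    n_users = len(A)
    C = cooccurrence(A)
    n_items = len(C)
    out = [[0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            k11 = C[i][j]
            k12 = C[i][i] - k11            # users with item i but not item j
            k21 = C[j][j] - k11            # users with item j but not item i
            k22 = n_users - k11 - k12 - k21
            out[i][j] = 1 if llr(k11, k12, k21, k22) > threshold else 0
    return out
```

Note that the score is large for any strong departure from independence, so a pair that cooccurs much less often than expected also scores high. Whether to filter those pairs out is a separate decision that this sketch does not make.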

If A contains counts weighted by inverse user frequency, then your dot
product is roughly usable as a similarity score.  This is especially true if
rows of A are normalized somehow to account for over-active users.
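
A small Python sketch of this second case follows. It assumes "inverse user frequency" means the item weight log(n_users / n_j), where n_j is the number of users who touched item j, and it assumes L2 normalization of each user row as one possible way to damp over-active users. Both choices, and all function names, are assumptions for illustration rather than anything fixed in this thread.

```python
from math import log, sqrt

def iuf_weights(A):
    # One weight per item: log(n_users / n_j); items nobody touched get 0
    n_users = len(A)
    n_items = len(A[0])
    weights = []
    for j in range(n_items):
        n_j = sum(1 for row in A if row[j] > 0)
        weights.append(log(n_users / n_j) if n_j else 0.0)
    return weights

def weight_and_normalize(A):
    # Apply the item weights, then scale each user row to unit L2 length
    w = iuf_weights(A)
    out = []
    for row in A:
        weighted = [a * wj for a, wj in zip(row, w)]
        norm = sqrt(sum(v * v for v in weighted))
        out.append([v / norm for v in weighted] if norm > 0 else weighted)
    return out

def item_similarity(A):
    # Dot products between item columns of the weighted, normalized matrix
    B = weight_and_normalize(A)
    n_items = len(B[0])
    return [[sum(r[i] * r[j] for r in B) for j in range(n_items)]
            for i in range(n_items)]
```

With this weighting, an item that every user has touched gets weight log(1) = 0 and so contributes nothing to any similarity score.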

On Wed, Sep 9, 2009 at 3:42 PM, Gökhan Çapan wrote:

> > A is the user x item history matrix.  Each row is a user history.
> >
> > A' is the transposed user x item matrix, which has the shape item x user.
> >
> > A' A is the user-level item cooccurrence matrix and has the shape item x
> > item.
> >
> Then (A' A)ij is a similarity weight between the ith and jth items.
> If Aij is the "rating of the ith user for the jth item", the highest value in
> the "ith row of A' A" is the most similar item to the "ith item".
> If the values in A are binary, then (A' A)ij is the number of users who have
> rated/clicked/viewed both item i and item j.

Ted Dunning, CTO
