mahout-user mailing list archives

From Chris Schilling <>
Subject Re: user-user recommendations
Date Sat, 19 Feb 2011 09:02:10 GMT
Hey Sean,

Thank you for the detailed reply.  Interesting points.  I think I have approached some of
these points in my subsequent emails. 

You bring up the case where all the users hate the same item.  What about the case where very
few (a single?) similar users love a place?  In that case, is this really a better recommendation
than the popular vote?  Where is the middle ground?  I think it's an interesting point.  I'll
see how the SVD performs.
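
As a rough illustration of the two scoring rules being compared here, a minimal Python sketch follows. It is not Mahout code: the neighborhood data, the function names, and the 1-5 rating scale are all made up for the example. It scores items once by raw preference count and once by similarity-weighted average rating.

```python
from collections import defaultdict

# Hypothetical neighborhood for one target user:
# neighbor id -> (similarity to the target user, {item: rating on a 1-5 scale})
neighborhood = {
    "u1": (0.9, {"cafe": 5.0}),   # one very similar user loves the cafe
    "u2": (0.2, {"diner": 2.0}),  # three weakly similar users dislike the diner
    "u3": (0.2, {"diner": 1.0}),
    "u4": (0.1, {"diner": 2.0}),
}

def count_scores(neighborhood):
    """Score each item by how many neighbors expressed any preference for it."""
    counts = defaultdict(int)
    for _sim, prefs in neighborhood.values():
        for item in prefs:
            counts[item] += 1
    return dict(counts)

def weighted_scores(neighborhood):
    """Score each item by the similarity-weighted average of neighbor ratings."""
    num = defaultdict(float)
    den = defaultdict(float)
    for sim, prefs in neighborhood.values():
        for item, rating in prefs.items():
            num[item] += sim * rating
            den[item] += sim
    return {item: num[item] / den[item] for item in num if den[item] > 0}

def best(scores):
    """Return the item with the highest score."""
    return max(scores, key=scores.get)

by_count = count_scores(neighborhood)
by_weight = weighted_scores(neighborhood)
print("count-based pick:   ", best(by_count))
print("weighted-avg pick:  ", best(by_weight))
```

With this toy data the count-based rule picks the diner even though every neighbor who rated it disliked it, while the weighted average picks the cafe on the strength of a single very similar user. Those are the two extremes discussed in this thread.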

On Feb 18, 2011, at 11:20 PM, Sean Owen wrote:

> User-user similarity is based on these counts? That sounds a bit like
> the Tanimoto / Jaccard coefficient. See TanimotoCoefficientSimilarity.
> Yes you can use that though log-likelihood is probably a more
> sophisticated choice.
> Recommending an item that occurs most in the neighborhood? Sure you
> can make it work that way. It probably works "OK" in practice though
> you can see possible problems with it. What if everyone in the
> neighborhood hates an item? This would recommend it highly. It's also
> throwing away the degree of similarity to the user who likes an item.
> The conventional wisdom in recommenders is that you want to fight the
> tendency to always recommend well-known items. People probably already
> know about the well-known items even if they've not rated them yet. It
> also makes the recommendations less personalized in a sense -- the
> recommendation result approaches the one you'd get by just
> recommending the globally most-preferred items.
> If your goal is to fight sparseness, start looking at SVD-based
> methods. This is really the point of SVDs, to "summarize" a very
> high-dimensional user-item matrix in a much lower-dimensional "user
> group" - "item group" matrix. Maybe you don't have enough information
> to recommend Bauhaus to Joan, a teenage goth, but, the SVD lets you
> sort of draw conclusions like "gothy teens like Peter Murphy's
> albums". That is, the summary is much less sparse and so works better
> for recommendation for users/items with little connection to the rest
> of the matrix otherwise.
> On Sat, Feb 19, 2011 at 2:43 AM, Chris Schilling <> wrote:
>> Hello again,
>> Very simple question here: I am also testing the user-user CF in Mahout. So, once
>> I define my user neighborhood, is it possible to select the recommendations from that based
>> on the number of preferences per item rather than a weighted average? Basically, I'd like
>> to recommend the items with the most preferences. It would be simple to implement, so I was
>> curious if this was already possible. I understand that in this case, the counts become dependent
>> on the size of the neighborhood. This is something I'd want to use for testing.
>> Thanks
>> Chris
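
For reference, the Tanimoto / Jaccard coefficient mentioned in the reply above can be sketched in a few lines of Python. This is only an illustration of the formula (size of the intersection over size of the union of two users' item sets), not the code behind Mahout's TanimotoCoefficientSimilarity. The user item sets are hypothetical, and returning 0.0 when both sets are empty is an assumption made here for convenience.

```python
def tanimoto(items_a, items_b):
    """Tanimoto / Jaccard coefficient between two collections of item ids."""
    a, b = set(items_a), set(items_b)
    union = len(a | b)
    if union == 0:
        # Assumption for this sketch: two empty users have similarity 0.0.
        return 0.0
    return len(a & b) / union

# Hypothetical users and the items each has expressed a preference for.
user_a = {"cafe", "diner", "bar"}
user_b = {"diner", "bar", "museum"}

# Two shared items out of four distinct items overall.
print(tanimoto(user_a, user_b))
```

The coefficient ignores rating values entirely and only looks at which items overlap, so it suits data where preferences are simply present or absent.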
