mahout-user mailing list archives

From: Sean Owen
Subject: Re: Perfect neighbours
Date: Mon, 01 Jun 2009 21:31:27 GMT
A 1.0 similarity doesn't usually mean that the users have rated
exactly the same items -- it means that, of the items they co-rated,
there is a perfect linear relationship between their ratings. So in
general, no, I wouldn't discount such pairs of users.
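
To make that concrete, here is a minimal Python sketch. It is not Taste
code: the function name, the users and the ratings are all made up for
illustration, and it assumes a plain Pearson correlation computed only
over the co-rated items. The two users agree on no rating and have
rated different sets of items, yet the similarity comes out as 1.0 and
the neighbour still has an item the target user lacks.

```python
from math import sqrt


def pearson_on_co_rated(a, b):
    """Pearson correlation over the items both users rated.

    Returns None when fewer than two items are shared or when either
    user's ratings on the shared items have zero variance.
    """
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return None
    xs = [a[item] for item in common]
    ys = [b[item] for item in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings: co-rated items are A, B and C only.
alice = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 5.0}
bob = {"A": 3.0, "B": 4.0, "C": 5.0, "E": 4.0}

similarity = pearson_on_co_rated(alice, bob)
# Items bob has rated that alice has not: candidates to recommend.
new_items_from_bob = sorted(set(bob) - set(alice))
print(similarity, new_items_from_bob)
```

Bob's ratings on the shared items are just Alice's shifted up by two,
which is why the relationship is perfectly linear even though no single
rating matches.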

It is pretty rare of course. It is less rare, I think, in your case,
since you are simply using a 'boolean' seen/not seen value for each
item, plus something like the Tanimoto similarity metric. In that
case, a 1.0 similarity actually does mean they have seen the same items.
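
A small Python sketch of the boolean case, again only as an
illustration and not the Taste implementation: the function name and
the item sets are hypothetical, and returning 0.0 for two empty sets is
my own assumption. It uses the usual Tanimoto definition, the size of
the intersection divided by the size of the union.

```python
def tanimoto(seen_a, seen_b):
    """Tanimoto coefficient of two sets of seen items."""
    union = seen_a | seen_b
    if not union:
        # Assumption: treat two empty histories as not similar.
        return 0.0
    return len(seen_a & seen_b) / len(union)


# Hypothetical seen-item sets.
target = {"A", "B", "C"}
twin = {"A", "B", "C"}   # identical history
near = {"A", "B", "D"}   # overlaps, but has seen one different item

twin_similarity = tanimoto(target, twin)
near_similarity = tanimoto(target, near)

# Items each neighbour could contribute as recommendations.
from_twin = twin - target
from_near = near - target
print(twin_similarity, from_twin)
print(near_similarity, from_near)
```

With sets, the ratio can only reach 1.0 when the intersection equals
the union, so a perfect neighbour has nothing new to offer, while the
0.5 neighbour still contributes item D.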

If it is therefore an issue for you, we could work out some reasonable
way to inject this observation into the framework.

I think Otis is doing a form of 'dimension reduction' already
here, by removing items from consideration that don't add much
information. That's kind of the same thing in this simplified scenario
where we don't have rating vectors really, just seen/not seen values.
Indeed it speeds things up a lot. I am under the impression Otis needs
real-time recommendations in his case.

On Mon, Jun 1, 2009 at 10:06 PM, Otis Gospodnetic wrote:
> Hello,
> I was stepping through Taste and noticed that users with 1.0 similarity to the
> target user end up in that user's neighbourhood. 1.0 similarity between users
> means users are exactly the same, so is there a point in collecting them? Since
> they are exactly the same as the target user, we can't really get any new items
> to recommend from them. Is this correct?
> It's probably not a frequent case to have users with identical item preferences,
> but imagine a case where you are computing recommendations from top 10 most
> similar users and those 10 most similar users happen to be all perfectly similar
> users, thus yielding no recommendations.
> Thoughts?
> Otis
> --
> Sematext -- -- Lucene - Solr - Nutch
