mahout-user mailing list archives

From Sean Owen <>
Subject Re: Popularity of recommender items
Date Thu, 06 Feb 2014 13:15:04 GMT
I have always defined popularity as just the number of ratings/prefs,
yes. You could rank on some kind of 'net promoter score' -- good
ratings minus bad ratings -- though that becomes more like 'most

How do you get popularity from similarity -- similarity to what?
Ranking by sum of similarities seems more like a measure of how much
the item is the 'centroid' of all items. Not necessarily most popular
but 'least eccentric'.
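
To make the three candidate measures concrete, here is a minimal Python
sketch (not Mahout code). The toy ratings, the toy similarity values, the
function names, and the good/bad thresholds are all assumptions made up
for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (user, item, rating) triples on a 1-5 scale.
RATINGS = [
    ("u1", "A", 5), ("u2", "A", 4), ("u3", "A", 1),
    ("u1", "B", 5), ("u2", "B", 5),
    ("u3", "C", 2),
]

# Hypothetical item-item similarities (e.g. LLR scores after thresholding).
SIMILARITY = {
    "A": {"B": 0.5, "C": 0.25},
    "B": {"A": 0.5},
    "C": {"A": 0.25},
}


def popularity_by_count(ratings):
    """Popularity as the number of ratings/prefs per item."""
    counts = defaultdict(int)
    for _user, item, _value in ratings:
        counts[item] += 1
    return dict(counts)


def net_score(ratings, good=4, bad=2):
    """Good ratings minus bad ratings per item.

    The thresholds (>= good counts as good, <= bad counts as bad) are
    assumptions; ratings in between contribute nothing.
    """
    scores = defaultdict(int)
    for _user, item, value in ratings:
        if value >= good:
            scores[item] += 1
        elif value <= bad:
            scores[item] -= 1
        else:
            scores[item] += 0  # make sure the item still appears
    return dict(scores)


def similarity_sum(similarity):
    """Sum of each item's similarities to all other items."""
    return {item: sum(row.values()) for item, row in similarity.items()}


def rank(scores):
    """Items ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


print(rank(popularity_by_count(RATINGS)))
print(rank(net_score(RATINGS)))
print(rank(similarity_sum(SIMILARITY)))
```

On this toy data the count ranking and the net score ranking disagree
about the top item, which is the distinction drawn above: count measures
how often an item is rated, net score measures how well.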

On Thu, Feb 6, 2014 at 7:41 AM, Tevfik Aytekin <> wrote:
> Well, I think what you are suggesting is to define popularity as being
> similar to other items. So in this way most popular items will be
> those which are most similar to all other items, like the centroids in
> K-means.
> I would first check the correlation between this definition and the
> standard one (that is, the definition of popularity as having the
> highest number of ratings). But my intuition is that they are
> different things. For example, an item might lie at the center of the
> similarity space but it might not be a popular item. However, there
> might still be some correlation; it would be interesting to check it.
> Hope it helps.
> On Wed, Feb 5, 2014 at 3:27 AM, Pat Ferrel <> wrote:
>> Trying to come up with a relative measure of popularity for items in
>> a recommender. Something that could be used to rank items.
>>
>> The user-item preference matrix would be the obvious thought. Just
>> add up the number of preferences per item. Maybe transpose the
>> preference matrix (the temp DRM created by the recommender), then for
>> each row vector (now that a row = item) grab the number of non-zero
>> preferences. This corresponds to the number of preferences, and would
>> give one measure of popularity. In the case where the items are not
>> boolean you'd sum the weights.
>>
>> However it might be a better idea to look at the item-item similarity
>> matrix. It doesn't need to be transposed and contains the "important"
>> similarities, as calculated by LLR for example. Here similarity means
>> similarity in which users preferred an item. So summing the non-zero
>> weights would give perhaps an even better relative "popularity"
>> measure. For the same reason clustering the similarity matrix would
>> yield "important" clusters.
>>
>> Anyone have intuition about this?
>>
>> I started to think about this because transposing the user-item
>> matrix seems to yield a format that cannot be sent directly into
>> clustering.
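
The correlation check suggested in the thread (rating count versus sum of
similarities) could be sketched roughly as below. This is plain Python,
not Mahout code. Spearman rank correlation is my choice of statistic, not
something the thread specifies, and the per-item scores are made-up values
for illustration.

```python
def average_ranks(values):
    """Rank values ascending (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ValueError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-item scores, listed in the same item order.
rating_counts = [3, 2, 1, 4]
similarity_sums = [0.75, 0.5, 0.25, 1.0]

print(spearman(rating_counts, similarity_sums))
```

A value near 1 would mean the two definitions rank items almost
identically; a value near 0 would support the intuition that they measure
different things.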
