mahout-user mailing list archives

From Markus Weimer
Subject Re: Two learning competitions that might be of interest for Mahout
Date Tue, 15 Feb 2011 19:04:50 GMT

Given that I published (one of?) the first recommender algorithms that
aimed at directly predicting the top-k for a given question, I am
somewhat sympathetic to your point. It is true that in most recommender
systems, one in the end only produces a ranked list of items, so why not
go for it?

However, the underlying data collected from Yahoo! Music over the past
couple of years consists of the ratings given by the users, so there is
little to change about the underlying data.

Regarding the evaluation measure: While I do believe that ranking
measures such as NDCG are more indicative (can't contradict myself here
;-) ), there is no generally accepted ranking measure to use. I have
also been convinced, both by my own experiments and by the community,
that just about any ranking measure correlates well with RMSE.
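
To make the two kinds of measure concrete, here is a minimal Python
sketch that is not part of the original message. The function names are
made up for illustration, and the NDCG variant shown (gain 2^r - 1 with a
log2 position discount) is only one common choice, assumed here because,
as noted above, there is no generally accepted ranking measure.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and true ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def dcg(relevances, k):
    """Discounted cumulative gain of the first k relevances (assumed variant)."""
    return sum((2 ** r - 1) / math.log2(i + 2)
               for i, r in enumerate(relevances[:k]))

def ndcg_at_k(predicted, actual, k):
    """NDCG@k: rank items by predicted score, score the ranking by true ratings."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    ranked = [actual[i] for i in order]
    ideal_dcg = dcg(sorted(actual, reverse=True), k)
    return dcg(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical ratings for three items of one user.
actual = [5, 3, 1]
shifted = [6, 4, 2]   # every prediction is off by one, but the order is right
print(rmse(shifted, actual))          # 1.0
print(ndcg_at_k(shifted, actual, 3))  # 1.0
```

The toy data only shows that the two measures look at different things:
a constant offset costs RMSE but leaves the ranking, and hence NDCG,
untouched. It says nothing about how they correlate on real data.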

Hope this helps,


On 2/15/11 3:19 AM, Chen_1st wrote:
> Hi, Markus,
> I am curious why the competition still tries to predict rating
> values, given that top-k recommendation is more practical in real-life
> applications and many papers have shown that rating value prediction
> is not very useful for discovering the top-k items.
> Best Regards.
> Chen
> 2011/2/12, Markus Weimer:
>> Hi,
>> go for it! I'd do it myself but the rules we wrote prohibit me from
>> doing so ;-)
>> Take care,
>> Markus
>> On 2/11/11 4:36 AM, Sean Owen wrote:
>>> While I may not spend time trying to win the first one, I'd be happy
>>> to run it through what we have so far and enter the results. It would
>>> be an interesting benchmark.
>>> On Fri, Feb 11, 2011 at 12:26 PM, Isabel Drost wrote:
>>>>> KDD-Cup 2011: Recommending Music Items based on the Yahoo! Music
>>>>> Dataset
>>>>>
>>>>> We challenge participants to identify user tastes in music by
>>>>> analyzing real ratings of Yahoo! Music anonymized users. The dataset
>>>>> represents a snapshot of the community's preferences for various
>>>>> musical items.
>>>>> The goal of the prize is to develop a predictive algorithm that can
>>>>> identify patients who will be admitted to the hospital within the
>>>>> next year, using historical claims data.
>>>> Isabel
