mahout-dev mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] Commented: (MAHOUT-305) Combine both cooccurrence-based CF M/R jobs
Date Mon, 22 Feb 2010 17:22:27 GMT


Sean Owen commented on MAHOUT-305:

Say I've made the following ratings:

5 stars: Harry Potter
5 stars: Harry Potter 2
1 star: Maid in Manhattan

Say I remove Maid in Manhattan as test data. I run recommendations and it recommends to me
Harry Potter 3 (which presumably I would rate highly). The implementation would be penalized
for not returning Maid in Manhattan, when that's surely not what it should have returned.

Even if you take out only the most highly-rated movies as test data (this is what the existing
CF precision/recall evaluator does), this phenomenon can still occur: the recommender could
return a movie that's better than anything you've yet seen but that would be considered 'bad'
by this evaluation style. It's still not a fair test, but it's less unfair.

Yes, you could take the 20% most-highly-rated movies from each user as test data if you like,
not just the 5-star ones.
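
A rough sketch of that kind of split, in Python purely for illustration. The function name
split_top_rated, the rounding-up of the 20% cut, and the way ties are broken are all
assumptions made for the example; none of this is Mahout code.

```python
import math


def split_top_rated(ratings, fraction=0.2):
    """Split one user's ratings into (train, test).

    ratings:  dict mapping item -> rating for a single user.
    fraction: share of the user's items, taken from the top of the
              ranking, to hold out as test data.

    Assumptions of this sketch: the cut is rounded up so at least one
    item is held out, and ties keep their original order.
    """
    # Highest ratings first; sorted() is stable, so equal ratings
    # stay in insertion order.
    ranked = sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
    n_test = math.ceil(len(ranked) * fraction) if ranked else 0
    test = dict(ranked[:n_test])
    train = dict(ranked[n_test:])
    return train, test


my_ratings = {
    "Harry Potter": 5,
    "Harry Potter 2": 5,
    "Maid in Manhattan": 1,
}
train, test = split_top_rated(my_ratings)
```

With the three ratings above, one of the 5-star movies is held out and Maid in Manhattan
stays in the training data, so the low-rated movie is never the thing being tested for.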

Say I ask for 10 recommendations. Precision @ 10 is the proportion of those 10 that were in
the user's history (top ratings). Recall @ 10 is the proportion of all top-rated items that
appeared in those 10. I think this is a little different from what you're saying?
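
To make the two definitions concrete, here is a minimal sketch in Python. The function
name precision_recall_at_k and its arguments are hypothetical, chosen for this example;
it is not how the existing evaluator is implemented.

```python
def precision_recall_at_k(recommended, relevant, k=10):
    """Return (precision@k, recall@k) for one user.

    recommended: ranked list of recommended items, best first.
    relevant:    the user's held-out top-rated items.
    """
    top_k = list(recommended)[:k]
    relevant = set(relevant)
    hits = sum(1 for item in top_k if item in relevant)
    # Precision: share of the returned recommendations that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: share of the relevant items that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Ten recommendations, one of which is a held-out top-rated item.
recs = ["Harry Potter 3", "Harry Potter 2"] + [f"Movie {i}" for i in range(8)]
held_out = {"Harry Potter", "Harry Potter 2"}
precision, recall = precision_recall_at_k(recs, held_out, k=10)
```

Here precision is 1/10 and recall is 1/2. If the held-out set were just Maid in Manhattan,
the same list would score zero on both, even though Harry Potter 3 is arguably a good
recommendation.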

> Combine both cooccurrence-based CF M/R jobs
> -------------------------------------------
>                 Key: MAHOUT-305
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>            Reporter: Sean Owen
>            Assignee: Ankur
>            Priority: Minor
> We have two different but essentially identical MapReduce jobs to make recommendations
> based on item co-occurrence:{item,cooccurrence}. They ought
> to be merged. Not sure exactly how to approach that but noting this in JIRA, per Ankur.

