mahout-dev mailing list archives

From "Ted Dunning (JIRA)" <>
Subject [jira] Commented: (MAHOUT-305) Combine both cooccurrence-based CF M/R jobs
Date Tue, 23 Feb 2010 18:44:27 GMT


Ted Dunning commented on MAHOUT-305:

My own experience is that all that counts in recommendations is the probability of a click
(interest) on a set of recommendations.  As such, the best analog is probably precision at 10
or 20.  I don't think that recall at 10 or 20 makes any sense at all: in a depth-limited
situation like this, you have given up on recall and are only looking at precision.
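
For concreteness, here is a minimal sketch of precision at k as I understand the term. The function name and arguments are illustrative only and are not part of Mahout; it assumes the recommendations arrive as a ranked list and the relevant items as a set.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that appear in `relevant`.

    recommended: ranked list of item ids, best first (hypothetical input shape).
    relevant:    set of item ids the user is known to like.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = list(recommended)[:k]
    # Anything in the top k that is not known to be relevant counts as a miss.
    hits = sum(1 for item in top_k if item in relevant)
    # Divide by k, not len(top_k), so a short list is not rewarded.
    return hits / k
```

Recall would divide the same hit count by the size of `relevant` instead, which says little once the list is capped at 10 or 20 items.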

Ankur's suggestion about keeping the most recent 4's and 5's as test data seems right to me.
 My only beefs are that you don't need recall@10, and the open question of what to do with
the unrated items.  Presumably a new-style algorithm could surface items that the user hadn't
thought of but really likes.  In practice, I think that counting unrated items in the results
as misses isn't a big deal in the Netflix data.  In the real world, where test data is scarcer,
I would count unrated items as misses in off-line evaluation, but try to run as many
alternatives as possible against live users.
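
A rough sketch of that hold-out scheme, under assumptions of my own: ratings are (user, item, rating, timestamp) tuples, and the function name, `n_test` and `min_rating` are hypothetical choices, not anything from the Mahout code base.

```python
from collections import defaultdict


def split_recent_high_ratings(ratings, n_test=2, min_rating=4):
    """Hold out each user's most recent highly rated items as test data.

    ratings: iterable of (user, item, rating, timestamp) tuples (assumed layout).
    Returns (train, test): train is a list of rating tuples, test maps
    user -> set of held-out item ids.
    """
    by_user = defaultdict(list)
    for user, item, rating, ts in ratings:
        by_user[user].append((ts, item, rating))

    train, test = [], defaultdict(set)
    for user, events in by_user.items():
        events.sort(key=lambda e: e[0], reverse=True)  # newest first
        held = 0
        for ts, item, rating in events:
            if rating >= min_rating and held < n_test:
                test[user].add(item)  # recent 4 or 5: keep for testing
                held += 1
            else:
                train.append((user, item, rating, ts))
    return train, dict(test)
```

With this split, a recommended item only counts as a hit if it is in `test[user]`; everything else in the list, including items the user never rated, is scored as a miss.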


> Combine both cooccurrence-based CF M/R jobs
> -------------------------------------------
>                 Key: MAHOUT-305
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>            Reporter: Sean Owen
>            Assignee: Ankur
>            Priority: Minor
> We have two different but essentially identical MapReduce jobs to make recommendations
> based on item co-occurrence:{item,cooccurrence}. They ought
> to be merged. Not sure exactly how to approach that but noting this in JIRA, per Ankur.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
