mahout-user mailing list archives

From Sean Owen
Subject Re: problems with GenericRecommenderIRStatsEvaluator:
Date Thu, 05 Nov 2009 12:16:07 GMT
It doesn't simulate "training" and "test", that's what I'm saying.
This concept exists in RecommenderEvaluator, not
RecommenderIRStatsEvaluator. They're reasonably different things.

In RecommenderIRStatsEvaluator, there is instead a "relevance
threshold" parameter.

But the final parameter, which you refer to, is something else still.
It simply controls what percentage of all data to use. It's a simple
way to use a lot less data to produce a result faster.
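
A minimal sketch of what such a percentage parameter amounts to, assuming the sampling is done per user with a simple random draw. This is not Mahout's code; the class name EvaluationSampling and the method sampleUsers are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class EvaluationSampling {

    /**
     * Keeps each user with probability evaluationPercentage (a value in [0, 1]).
     * Hypothetical helper for illustration, not part of Mahout.
     */
    static List<Long> sampleUsers(List<Long> userIds, double evaluationPercentage, Random random) {
        List<Long> sampled = new ArrayList<>();
        for (Long userId : userIds) {
            // nextDouble() is in [0, 1), so 1.0 keeps every user and 0.0 keeps none.
            if (random.nextDouble() < evaluationPercentage) {
                sampled.add(userId);
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<Long> users = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L);
        // Evaluate on roughly 30% of the users; the exact subset depends on the seed.
        System.out.println(sampleUsers(users, 0.3, new Random(42L)));
    }
}
```

Nothing here splits data into "training" and "test"; it only decides how much of the data takes part in the evaluation at all.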

You are right that in your 'boolean' data, all preference values are
effectively 1.0. So passing a 1.0 means that all items are considered
relevant. That's fine, that's reasonable. While the framework
typically removes all relevant items from a user for test purposes, it
will remove only up to "at" items -- that is, if you are evaluating
precision at 5, it will remove up to 5 items. In this case they are
effectively randomly chosen since all items are equal.
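
To make the "up to 'at' items" behaviour concrete, here is a rough sketch of picking the held-out relevant items for one user. It is not the actual Mahout implementation; the class RelevantItemHoldout, the method chooseRelevantItems, and the choice to break ties by input order are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RelevantItemHoldout {

    /**
     * Returns at most `at` item IDs whose preference value is >= relevanceThreshold,
     * highest values first. Hypothetical helper for illustration, not part of Mahout.
     */
    static List<Long> chooseRelevantItems(Map<Long, Double> prefs, double relevanceThreshold, int at) {
        List<Map.Entry<Long, Double>> entries = new ArrayList<>(prefs.entrySet());
        // List.sort is stable, so items with equal values keep their input order.
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<Long> relevant = new ArrayList<>();
        for (Map.Entry<Long, Double> entry : entries) {
            if (relevant.size() >= at) {
                break; // never hold out more than `at` items
            }
            if (entry.getValue() >= relevanceThreshold) {
                relevant.add(entry.getKey());
            }
        }
        return relevant;
    }

    public static void main(String[] args) {
        // Boolean data: every preference is 1.0, so every item passes a 1.0 threshold.
        Map<Long, Double> booleanPrefs = new LinkedHashMap<>();
        for (long itemId = 1; itemId <= 8; itemId++) {
            booleanPrefs.put(itemId, 1.0);
        }
        // Precision at 5: only 5 of the 8 equally relevant items are held out.
        System.out.println(chooseRelevantItems(booleanPrefs, 1.0, 5));
    }
}
```

With boolean data every value ties, so which 5 items get held out depends only on the order in which the items happen to arrive, which is what "effectively randomly chosen" means in practice.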

How would you like it to choose the relevant and not relevant items in
this case? We can figure out how to do it then.


On Thu, Nov 5, 2009 at 12:06 PM, michal shmueli wrote:
> I still don't get why this parameter simulates the "training" and
> the "test". In addition, since my data is Boolean, doesn't it mean that
> whatever is 1 is relevant anyway? Is there another way to tell the recommender how to
> choose the training and test sets?
