mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Mahout Binary Recommender Evaluator
Date Mon, 25 Jul 2011 09:55:28 GMT
On Mon, Jul 25, 2011 at 10:05 AM, MT <mael.thomas@telecom-bretagne.eu> wrote:
>
>
> In fact, correct me if I'm wrong, but to me the evaluator will invariably
> give us the same value for precision and recall. Since the items are all
> rated with the binary value 1.0, we give the evaluator a threshold lower
> than 1, so for each user "at" items are considered relevant and removed
> from the user's preferences, and "at" recommendations are computed.
> Precision and recall are then computed from the two sets, relevant and
> retrieved items, which leads (unless, I guess, the recommender cannot
> compute "at" recommendations) to precision and recall being equal.
>

I think that's right in this case, where there are no real ratings. It's
pretty artificial to define 'relevant' based on ratings here! The equality
doesn't hold if you have real ratings.
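
For illustration, here is roughly what such a boolean top-N evaluation looks
like end to end. The file name, the log-likelihood similarity and the
neighborhood size of 10 are arbitrary choices for the sketch, nothing
prescribed; the point is that with |relevant| = |retrieved| = at, precision =
hits/at = recall:

  import java.io.File;
  import org.apache.mahout.cf.taste.common.TasteException;
  import org.apache.mahout.cf.taste.eval.IRStatistics;
  import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
  import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
  import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
  import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
  import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
  import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
  import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
  import org.apache.mahout.cf.taste.model.DataModel;
  import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
  import org.apache.mahout.cf.taste.recommender.Recommender;
  import org.apache.mahout.cf.taste.similarity.UserSimilarity;

  public class BooleanIREval {
    public static void main(String[] args) throws Exception {
      // user,item pairs; every stored preference is implicitly 1.0
      DataModel model = new FileDataModel(new File("bought.csv"));

      RecommenderBuilder builder = new RecommenderBuilder() {
        public Recommender buildRecommender(DataModel model) throws TasteException {
          UserSimilarity similarity = new LogLikelihoodSimilarity(model);
          UserNeighborhood neighborhood =
              new NearestNUserNeighborhood(10, similarity, model);
          return new GenericBooleanPrefUserBasedRecommender(model, neighborhood,
              similarity);
        }
      };

      RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
      // at = 3: the top 3 held-out items count as relevant and 3 items are
      // retrieved, so precision == recall whenever the recommender actually
      // produces all 3 recommendations
      IRStatistics stats = evaluator.evaluate(builder, null, model, null, 3,
          GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
      System.out.println("precision=" + stats.getPrecision()
          + " recall=" + stats.getRecall());
    }
  }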


>
> Results are still useful though, since a precision of 0.2 tells us that
> among the "at" recommended items, 20% were actually bought by the user.
> Although one can wonder whether those items are the best recommendations,
> the least we can say is that they somehow correspond to the user's
> preferences.
>

Right.


> I read this topic and I fully understand that IRStatsEvaluator is different
> from classic evaluators (giving the MAE for example), but I feel that it
> makes sense to have a trainingPercentage parameter that divides each user's
> preferences into two subsets of items. The first (typically 20%) would be
> considered the relevant items, to be predicted using the second subset. At
> the moment this split is defined by "at", which often results in equal
> numbers of items in the relevant and retrieved subsets. The "at" value
> would remain a parameter defining the number of items retrieved. The
> evaluator could then be run varying these two parameters to find the best
> compromise between precision and recall.
>

I think it already has this parameter? It already accepts an "at" value. Is
this what you mean? Maybe an example or patch would clarify.
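
For concreteness, the proposed addition might amount to something like this
variant of evaluate() -- purely hypothetical, sketched here for discussion,
not existing Mahout API (trainingPercentage is the new parameter):

  // Hypothetical variant of RecommenderIRStatsEvaluator.evaluate(), for
  // discussion only. trainingPercentage would split each user's preferences:
  // roughly (1 - trainingPercentage) of the items become the held-out
  // relevant set, the rest train the recommender. "at" keeps its current
  // meaning, the number of items retrieved.
  IRStatistics evaluate(RecommenderBuilder recommenderBuilder,
                        DataModelBuilder dataModelBuilder,
                        DataModel dataModel,
                        IDRescorer rescorer,
                        int at,
                        double relevanceThreshold,
                        double trainingPercentage,
                        double evaluationPercentage) throws TasteException;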


>
> Furthermore, should the dataset contain a timestamp for each purchase,
> would it not be logical to set the test set to the last items bought by
> the user? The evaluation would then mirror what happens in real use.
>

Yes, that sounds like a great improvement. The only difficulty is including
it in a clean way. Up for a patch?
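
To sketch the idea (illustrative only, none of this exists in the evaluator
today), the split could lean on DataModel.getPreferenceTime(), which
FileDataModel fills in when the input file carries a timestamp column. A
helper like this would pick the held-out set:

  import java.util.ArrayList;
  import java.util.Collections;
  import java.util.Comparator;
  import java.util.List;
  import org.apache.mahout.cf.taste.common.TasteException;
  import org.apache.mahout.cf.taste.model.DataModel;
  import org.apache.mahout.cf.taste.model.Preference;

  // Illustrative helper, not part of Mahout: hold out the user's most
  // recently time-stamped "at" items as the relevant/test set.
  static List<Long> mostRecentItemIDs(final DataModel model, final long userID,
      int at) throws TasteException {
    List<Long> itemIDs = new ArrayList<Long>();
    for (Preference pref : model.getPreferencesFromUser(userID)) {
      itemIDs.add(pref.getItemID());
    }
    Collections.sort(itemIDs, new Comparator<Long>() {
      public int compare(Long itemA, Long itemB) {
        try {
          Long timeA = model.getPreferenceTime(userID, itemA);
          Long timeB = model.getPreferenceTime(userID, itemB);
          long a = timeA == null ? Long.MIN_VALUE : timeA;
          long b = timeB == null ? Long.MIN_VALUE : timeB;
          return b < a ? -1 : (b > a ? 1 : 0); // newest first
        } catch (TasteException te) {
          throw new IllegalStateException(te);
        }
      }
    });
    return itemIDs.subList(0, Math.min(at, itemIDs.size()));
  }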



>
> Finally, I believe the documentation page has some mistakes in the last
> code excerpt:
>
> evaluator.evaluate(builder, myModel, null, 3,
> RecommenderIRStatusEvaluator.CHOOSE_THRESHOLD, 1.0);
>
> should be
>
> evaluator.evaluate(builder, null, myModel, null, 3,
> GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);

OK, will look at that.
