mahout-user mailing list archives

From Marko Ciric <ciric.ma...@gmail.com>
Subject Re: Mahout Binary Recommender Evaluator
Date Mon, 25 Jul 2011 10:16:09 GMT
Hi,

First of all, it's rather easy to change the evaluator so that it doesn't remove all
of the items (which is what happens when working with a boolean-preference data
set). The easiest implementation would be to use the relevanceThreshold argument
as a fraction of the whole user's preference data set. For example, if it is
0.4, you remove 40% of the items that are preferred, and so on.

A better way to do it is to implement an evaluator which accepts the
collection of items that are relevant.
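
To make the first option concrete, here is a minimal sketch of choosing a
percentage of a user's preferred items as the relevant set. The class and
method names are hypothetical, not existing Mahout code; only DataModel,
PreferenceArray and FastIDSet are the actual Taste types:

import java.util.Random;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.FastIDSet;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.PreferenceArray;

// Hypothetical sketch: treat the relevanceThreshold argument as a fraction of
// the user's preferences and hold out that share of items as "relevant".
public class PercentageRelevanceSplit {

  private final Random random = new Random();

  // Selects roughly (relevanceFraction * number of preferences) of the user's
  // preferred items as the relevant set to hold out for evaluation.
  public FastIDSet chooseRelevantItems(DataModel model, long userID, double relevanceFraction)
      throws TasteException {
    PreferenceArray prefs = model.getPreferencesFromUser(userID);
    int numRelevant = (int) Math.round(relevanceFraction * prefs.length());
    FastIDSet relevant = new FastIDSet(numRelevant);
    while (relevant.size() < numRelevant) {
      // pick preferred items at random until the target share is reached
      relevant.add(prefs.getItemID(random.nextInt(prefs.length())));
    }
    return relevant;
  }
}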


On 25 July 2011 11:55, Sean Owen <srowen@gmail.com> wrote:

> On Mon, Jul 25, 2011 at 10:05 AM, MT <mael.thomas@telecom-bretagne.eu
> >wrote:
> >
> >
> > In fact, correct me if I'm wrong, but to me the evaluator will invariably
> > give us the same value for precision and recall. Since the items are all
> > rated with the binary 1.0 value, we give the recommender a threshold lower
> > than 1, so for each user "at" items are considered relevant and removed
> > from the user's preferences to compute "at" recommendations. Precision and
> > recall are then computed from the two sets: relevant and retrieved items.
> > Which leads (unless, I guess, the recommender cannot compute "at" items) to
> > precision and recall being equal.
> >
>
> I think that's right in this case, where there are no ratings. It's pretty
> artificial to define 'relevant' here based on ratings!
> This isn't true if you have ratings.
>
>
> >
> > Results are still useful though, since a value of 0.2 for precision tells
> > us that among the "at" recommended items, 20% were actually bought by the
> > user. Although one can wonder whether those items are the best
> > recommendations, the least we can say is that they somehow correspond to
> > the user's preferences.
> >
>
> Right.
>
>
> > I read this topic and I fully understand that IRStatsEvaluator is different
> > from classic evaluators (giving the MAE, for example), but I feel it makes
> > sense to have a trainingPercentage parameter that divides a user's
> > preferences into two subsets of items. The first (typically 20%) would be
> > considered the relevant items, to be predicted using the second subset.
> > This split is at the moment defined by "at", resulting in often equal
> > numbers of items in the relevant and retrieved subsets. The "at" value
> > would still be a parameter used to define the number of items retrieved.
> > The evaluator could then be run varying these two parameters to find the
> > best compromise between precision and recall.
> >
>
> I think it already has this parameter? It already accepts an "at" value. Is
> this what you mean? Maybe an example or patch would clarify.
>
>
> >
> > Furthermore, should the dataset contain a timestamp for each purchase,
> > would it not be logical to set the test set to the last items bought by
> > the user? The evaluator would then mirror what happens in real use.
> >
>
> Yes that sounds like a great improvement. The only difficulty is including
> it in a clean way. Up for a patch?
>
>
>
> >
> > Finally, I believe the documentation page has some mistakes in the last
> > code excerpt:
> >
> > evaluator.evaluate(builder, myModel, null, 3,
> >     RecommenderIRStatusEvaluator.CHOOSE_THRESHOLD, 1.0);
> >
> > should be
> >
> > evaluator.evaluate(builder, null, myModel, null, 3,
> >     GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
> >
> >
> OK, will look at that.
>
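
For reference, the corrected call could be wired up end to end for a boolean
data set roughly like this (a sketch only; the data file name, Tanimoto
similarity and neighborhood size are illustrative assumptions, not from this
thread):

import java.io.File;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class IRStatsExample {

  public static void main(String[] args) throws Exception {
    // boolean (purchase-style) preferences, one userID,itemID pair per line
    DataModel myModel = new FileDataModel(new File("bool-prefs.csv"));

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel model) throws TasteException {
        UserSimilarity similarity = new TanimotoCoefficientSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, similarity, model);
        return new GenericBooleanPrefUserBasedRecommender(model, neighborhood, similarity);
      }
    };

    RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
    // (builder, dataModelBuilder, dataModel, rescorer, at, relevanceThreshold, evaluationPercentage)
    IRStatistics stats = evaluator.evaluate(builder, null, myModel, null, 3,
        GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);

    System.out.println("Precision: " + stats.getPrecision());
    System.out.println("Recall: " + stats.getRecall());
  }
}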



--
Marko Ćirić
ciric.marko@gmail.com
