mahout-user mailing list archives

From michal shmueli <michal.shmu...@gmail.com>
Subject Re: problems with GenericRecommenderIRStatsEvaluator:
Date Thu, 05 Nov 2009 13:28:24 GMT
On Thu, Nov 5, 2009 at 3:14 PM, Sean Owen <srowen@gmail.com> wrote:

> You are again partly describing what RecommenderEvaluator does, not
> RecommenderIRStatsEvaluator.
>
> The main difference between what you are describing and what actually
> happens is that there is no "70%" -- instead there is a relevance
> threshold. However, in *RecommenderEvaluator* there is a parameter that
> controls what percent of the data is used for training. But you are not
> using this code.
>
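
Just to check that I follow, is the distinction you mean roughly the one
below? A sketch only -- the signatures are from my memory of the 0.2
javadoc, and the threshold/percentage values are placeholders, so please
correct me if I got anything wrong.

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.*;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.model.DataModel;

public class EvaluatorSketch {
  static void compare(RecommenderBuilder builder, DataModel model) throws TasteException {
    // RecommenderEvaluator: this is where the train/test split lives
    // (0.7 = 70% of each user's prefs used for training, 1.0 = use all users).
    RecommenderEvaluator scoreEval = new AverageAbsoluteDifferenceRecommenderEvaluator();
    double diff = scoreEval.evaluate(builder, null, model, 0.7, 1.0);

    // RecommenderIRStatsEvaluator: no training percentage at all; instead
    // "at" (size of the recommendation list) and a relevance threshold
    // (e.g. 3.0 on a 1-5 rating scale) decide which of a user's own items
    // count as the "relevant" set.
    RecommenderIRStatsEvaluator irEval = new GenericRecommenderIRStatsEvaluator();
    IRStatistics stats = irEval.evaluate(builder, null, model, null, 10, 3.0, 1.0);

    System.out.println("avg abs diff = " + diff);
    System.out.println("precision@10 = " + stats.getPrecision()
        + "  recall@10 = " + stats.getRecall());
  }
}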
   - Which RecommenderEvaluator would you suggest for boolean data? Does it
also mean that I need to change the recommender I'm using (which uses the
Tanimoto similarity)?
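
For concreteness, the kind of setup I have in mind is roughly the sketch
below (simplified, with class names as I recall them from the 0.2 javadoc
-- not my exact code):

import java.io.File;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.*;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.GenericBooleanPrefDataModel;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
import org.apache.mahout.cf.taste.model.*;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class BooleanTanimotoEval {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("bool-prefs.csv"));  // user,item pairs

    // Tanimoto + boolean user-based recommender, rebuilt on each training split.
    RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
      public Recommender buildRecommender(DataModel dataModel) throws TasteException {
        UserSimilarity similarity = new TanimotoCoefficientSimilarity(dataModel);
        UserNeighborhood neighborhood =
            new NearestNUserNeighborhood(10, similarity, dataModel);
        return new GenericBooleanPrefUserBasedRecommender(dataModel, neighborhood, similarity);
      }
    };

    // The training data handed back by the evaluator should itself be boolean.
    DataModelBuilder modelBuilder = new DataModelBuilder() {
      public DataModel buildDataModel(FastByIDMap<PreferenceArray> trainingData) {
        return new GenericBooleanPrefDataModel(
            GenericBooleanPrefDataModel.toDataMap(trainingData, true));
      }
    };

    RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
    IRStatistics stats = evaluator.evaluate(
        recommenderBuilder, modelBuilder, model, null,
        10,    // at: precision/recall at top 10
        1.0,   // relevance threshold (every boolean pref has value 1.0)
        1.0);  // evaluate on 100% of the users
    System.out.println("precision = " + stats.getPrecision()
        + "  recall = " + stats.getRecall());
  }
}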


> What ranking are you talking about, that is ignored?
>

  - It seems that you would get the same result whether the relevant items
are the best-2 (ranked 1st and 2nd) of the k=10 recommendations or the
worst-2 (9th and 10th) of the 10 -- but again, I might be wrong.
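
What I mean is that if precision-at-k is just the fraction of the k
recommended items that are relevant, the two orderings below come out
identical -- a toy example, nothing Mahout-specific, just to make sure I'm
not misreading the metric:

import java.util.*;

public class PrecisionAtKToy {
  // precision@k = (number of recommended items that are relevant) / k;
  // the position of the hits inside the top-k never enters the formula.
  static double precisionAtK(List<Long> recommended, Set<Long> relevant) {
    int hits = 0;
    for (Long itemID : recommended) {
      if (relevant.contains(itemID)) {
        hits++;
      }
    }
    return (double) hits / recommended.size();
  }

  public static void main(String[] args) {
    Set<Long> relevant = new HashSet<Long>(Arrays.asList(101L, 102L));
    // relevant items ranked 1st and 2nd ...
    List<Long> hitsFirst = Arrays.asList(101L, 102L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L);
    // ... or ranked 9th and 10th of the same k=10 list
    List<Long> hitsLast  = Arrays.asList(3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 101L, 102L);
    System.out.println(precisionAtK(hitsFirst, relevant));  // 0.2
    System.out.println(precisionAtK(hitsLast, relevant));   // 0.2
  }
}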

Cheers,
Michal

>
> Sean
>
> On Thu, Nov 5, 2009 at 1:08 PM, michal shmueli <michal.shmueli@gmail.com>
> wrote:
> > The way I envision this is the following: assume a user rates 10 items;
> > these 10 are the correct items. Further assume that for recommendation
> > we use a subset of these 10 items, say 70% (leaving us with 30% for
> > test), to build the similarity, etc. Now, during evaluation, we ask the
> > recommender for, say, k items, and we check how many of the 3 correct
> > items (the 30% held out for test) are within the k recommended items.
> > This solution ignores the ranking of the different items; however, this
> > could also be added later.
> >
> > Does it make sense?
> >
> > thanks,
> > Michal
>
