mahout-user mailing list archives

From: Nick Jordan <n...@influen.se>
Subject: Re: Item Based Recommendation Evaluation based on Number of Preferences
Date: Wed, 04 Jan 2012 15:05:33 GMT
Thanks.  Appreciate you pointing me in the right direction.

On Wed, Jan 4, 2012 at 10:00 AM, Sean Owen <srowen@gmail.com> wrote:

> I think you want to do that with a custom ItemSimilarity, perhaps one that
> wraps another implementation and returns NaN when you don't want it to have
> a value. Remember that there's an equal issue with users with few ratings.
> If you do the same thing you'll also be unable to recommend before the user
> has a few ratings.  In general you shouldn't have to do things like this,
> so just make sure it's really solving some issue and that it's not really a
> symptom of something else.
>
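A minimal sketch of such a wrapping ItemSimilarity, written against the Taste
ItemSimilarity interface as it appears in recent Mahout releases (the exact
method set may vary by version); the class name MinPrefsItemSimilarity and the
MIN_PREFS threshold are illustrative choices, not Mahout API or anything from
this thread:

import java.util.Collection;

import org.apache.mahout.cf.taste.common.Refreshable;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class MinPrefsItemSimilarity implements ItemSimilarity {

  private static final int MIN_PREFS = 5; // illustrative threshold

  private final ItemSimilarity delegate;
  private final DataModel model;

  public MinPrefsItemSimilarity(ItemSimilarity delegate, DataModel model) {
    this.delegate = delegate;
    this.model = model;
  }

  @Override
  public double itemSimilarity(long itemID1, long itemID2) throws TasteException {
    // Decline to express a similarity when either item has too few preferences
    if (model.getNumUsersWithPreferenceFor(itemID1) < MIN_PREFS
        || model.getNumUsersWithPreferenceFor(itemID2) < MIN_PREFS) {
      return Double.NaN;
    }
    return delegate.itemSimilarity(itemID1, itemID2);
  }

  @Override
  public double[] itemSimilarities(long itemID1, long[] itemID2s) throws TasteException {
    double[] result = new double[itemID2s.length];
    for (int i = 0; i < itemID2s.length; i++) {
      result[i] = itemSimilarity(itemID1, itemID2s[i]);
    }
    return result;
  }

  @Override
  public long[] allSimilarItemIDs(long itemID) throws TasteException {
    return delegate.allSimilarItemIDs(itemID);
  }

  @Override
  public void refresh(Collection<Refreshable> alreadyRefreshed) {
    delegate.refresh(alreadyRefreshed);
  }
}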
> IDRescorer rescores the output of estimatePreference(), conceptually, so it
> does not accept such an object itself. That's fine; you would be using
> IDRescorer to return NaN and filter the item, not modify
> estimatePreference().
>
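Along the same lines, a hedged sketch of an IDRescorer that filters out items
with too few preferences; again, the class name and threshold are illustrative:

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.IDRescorer;

public class MinPrefsRescorer implements IDRescorer {

  private static final int MIN_PREFS = 5; // illustrative threshold

  private final DataModel model;

  public MinPrefsRescorer(DataModel model) {
    this.model = model;
  }

  @Override
  public double rescore(long itemID, double originalScore) {
    // Return NaN for items we refuse to recommend, otherwise leave the score alone
    return isFiltered(itemID) ? Double.NaN : originalScore;
  }

  @Override
  public boolean isFiltered(long itemID) {
    try {
      return model.getNumUsersWithPreferenceFor(itemID) < MIN_PREFS;
    } catch (TasteException te) {
      return true; // drop the item if its preference count can't be read
    }
  }
}

It would be passed to Recommender.recommend(userID, howMany, rescorer) and so
acts on the final ranking, not on estimatePreference(), which matches the point
above.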
> On Wed, Jan 4, 2012 at 2:52 PM, Nick Jordan <nick@influen.se> wrote:
>
> > Thanks for the feedback.  In my particular scenario, I'd rather that the
> > Recommender only return recommendations for items where the expected
> > margin of error is smaller, even if that meant that no recommendations
> > were made for a specific set of users, or that a specific set of items
> > could never be recommended.  Maybe what I'm describing is my own
> > home-grown Recommender, which is fine, but I just want to confirm.
> >
> > It also appears that the evaluator uses estimatePreference in the
> > Recommender to produce its output, and estimatePreference doesn't take a
> > Rescorer parameter, so even if I handled this in a Rescorer the Evaluator
> > would not pick it up as part of its output.  Is that also correct?
> >
> > Nick
> >
> > On Wed, Jan 4, 2012 at 8:53 AM, Sean Owen <srowen@gmail.com> wrote:
> >
> > > After thinking about it more, I think your theory is right.
> > >
> > > You really should use more like 90% of your data to train, and 10% to
> > > test, rather than the other way around. Here it seems fairly clear that
> > > the 10% training set is returning a result that isn't representative of
> > > the real performance. That's how I'd really "fix" this, plain and simple.
> > >
> > > Sean
> > >
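For reference, a minimal sketch of that hold-out evaluation with a 90/10
train/test split; the data file name is a placeholder, and the builder here
happens to pair LogLikelihoodSimilarity with a plain GenericItemBasedRecommender:

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class EvalExample {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("prefs.csv")); // placeholder path

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel trainingData) throws TasteException {
        return new GenericItemBasedRecommender(
            trainingData, new LogLikelihoodSimilarity(trainingData));
      }
    };

    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    // Train on 90% of each user's preferences, test on the remaining 10%,
    // over the whole data set (evaluationPercentage = 1.0)
    double score = evaluator.evaluate(builder, null, model, 0.9, 1.0);
    System.out.println("Average absolute difference: " + score);
  }
}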
> > > On Wed, Jan 4, 2012 at 11:42 AM, Nick Jordan <nick@influen.se> wrote:
> > >
> > > > Yeah, I'm a little perplexed.  By low-rank items I mean items that
> > > > have a low number of preferences, not a low average preference.
> > > > Basically, if we don't have some level of confidence in our
> > > > ItemSimilarity, based on the fact that not many people have given a
> > > > preference, good or bad, don't recommend them.  To your point, though,
> > > > LogLikelihood may already account for that, making these results even
> > > > more surprising.
> > > >
> > > >
> > >
> >
>
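Putting the earlier sketches together: a hypothetical wiring that composes the
MinPrefsItemSimilarity wrapper around LogLikelihoodSimilarity and applies the
MinPrefsRescorer at recommendation time; the class names, threshold, user ID,
and file path remain illustrative:

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class GuardedRecommenderExample {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("prefs.csv")); // placeholder path

    // Log-likelihood already discounts co-occurrence that could be due to chance;
    // the wrapper additionally refuses to score items below the preference-count threshold.
    ItemSimilarity similarity =
        new MinPrefsItemSimilarity(new LogLikelihoodSimilarity(model), model);
    Recommender recommender = new GenericItemBasedRecommender(model, similarity);

    // The rescorer filters thinly-rated items out of the final ranking as well
    List<RecommendedItem> recs = recommender.recommend(123L, 10, new MinPrefsRescorer(model));
    for (RecommendedItem rec : recs) {
      System.out.println(rec.getItemID() + " : " + rec.getValue());
    }
  }
}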
