mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Item Based Recommendation Evaluation based on Number of Preferences
Date Wed, 04 Jan 2012 15:00:37 GMT
I think you want to do that with a custom ItemSimilarity, perhaps: one that
wraps another implementation and returns NaN when you don't want it to have
a value. Remember that there's an equivalent issue with users who have few
ratings; if you do the same thing there, you'll also be unable to recommend
anything until the user has a few ratings. In general you shouldn't have to
do things like this, so make sure it's really solving a problem and not just
masking a symptom of something else.
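
Roughly, such a wrapper might look like the following. This is an untested
sketch against a reasonably recent Taste API; the class name and the minPrefs
threshold are just placeholders:

import java.util.Collection;

import org.apache.mahout.cf.taste.common.Refreshable;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

/**
 * Wraps another ItemSimilarity and returns NaN whenever either item has
 * fewer than minPrefs preferences, so such items effectively drop out of
 * item-based recommendation.
 */
public class MinPrefsItemSimilarity implements ItemSimilarity {

  private final ItemSimilarity delegate;
  private final DataModel dataModel;
  private final int minPrefs; // placeholder threshold, e.g. 5

  public MinPrefsItemSimilarity(ItemSimilarity delegate, DataModel dataModel, int minPrefs) {
    this.delegate = delegate;
    this.dataModel = dataModel;
    this.minPrefs = minPrefs;
  }

  @Override
  public double itemSimilarity(long itemID1, long itemID2) throws TasteException {
    if (dataModel.getNumUsersWithPreferenceFor(itemID1) < minPrefs
        || dataModel.getNumUsersWithPreferenceFor(itemID2) < minPrefs) {
      return Double.NaN; // "no opinion" -- NaN similarities are ignored downstream
    }
    return delegate.itemSimilarity(itemID1, itemID2);
  }

  @Override
  public double[] itemSimilarities(long itemID1, long[] itemID2s) throws TasteException {
    double[] result = new double[itemID2s.length];
    for (int i = 0; i < itemID2s.length; i++) {
      result[i] = itemSimilarity(itemID1, itemID2s[i]);
    }
    return result;
  }

  @Override
  public long[] allSimilarItemIDs(long itemID) throws TasteException {
    return delegate.allSimilarItemIDs(itemID);
  }

  @Override
  public void refresh(Collection<Refreshable> alreadyRefreshed) {
    delegate.refresh(alreadyRefreshed);
  }
}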

IDRescorer rescores the output of estimatePreference(), conceptually, so
estimatePreference() does not accept such an object itself. That's fine: you
would be using IDRescorer to return NaN and filter the item, not to modify
estimatePreference().
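
For completeness, a rescorer along those lines could be as small as this
(again just a sketch; MinPrefsRescorer and the threshold are made up):

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.IDRescorer;

/**
 * Filters out items that have fewer than minPrefs preferences at
 * recommendation time; everything else passes through unchanged.
 */
public class MinPrefsRescorer implements IDRescorer {

  private final DataModel dataModel;
  private final int minPrefs; // placeholder threshold

  public MinPrefsRescorer(DataModel dataModel, int minPrefs) {
    this.dataModel = dataModel;
    this.minPrefs = minPrefs;
  }

  @Override
  public double rescore(long itemID, double originalScore) {
    return isFiltered(itemID) ? Double.NaN : originalScore;
  }

  @Override
  public boolean isFiltered(long itemID) {
    try {
      return dataModel.getNumUsersWithPreferenceFor(itemID) < minPrefs;
    } catch (TasteException te) {
      return true; // when in doubt, drop the item
    }
  }
}

You would then pass it as the third argument of recommend(), e.g.
recommender.recommend(userID, 10, new MinPrefsRescorer(model, 5)). It does
not touch estimatePreference(), which is exactly why the evaluator won't see
its effect.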

On Wed, Jan 4, 2012 at 2:52 PM, Nick Jordan <nick@influen.se> wrote:

> Thanks for the feedback.  In my particular scenario, I'd rather that the
> Recommender only return recommendations for items where the expected margin
> of error is smaller, even if that means no recommendations are made for a
> specific set of users, or that a specific set of items can never be
> recommended.  Maybe what I'm describing is my own home-grown Recommender,
> which is fine, but I just want to confirm.
>
> It also appears that the evaluator uses estimatePreference in the Recommender
> to produce its output, and estimatePreference doesn't take a Rescorer
> parameter, so even if I handled this in a Rescorer the Evaluator would not
> pick it up as part of its output.  Is that also correct?
>
> Nick
>
> On Wed, Jan 4, 2012 at 8:53 AM, Sean Owen <srowen@gmail.com> wrote:
>
> > After thinking about it more, I think your theory is right.
> >
> > You really should use more like 90% of your data to train and 10% to test,
> > rather than the other way around. Here it seems fairly clear that the 10%
> > training set is returning a result that isn't representative of the real
> > performance. That's how I'd really "fix" this, plain and simple.
> >
> > Sean
> >
> > On Wed, Jan 4, 2012 at 11:42 AM, Nick Jordan <nick@influen.se> wrote:
> >
> > > Yeah, I'm a little perplexed.  By low-rank items I mean items that have
> > > a low number of preferences, not a low average preference.  Basically, if
> > > we don't have some level of confidence in our ItemSimilarity, based on the
> > > fact that not many people have given a preference, good or bad, don't
> > > recommend them.  To your point, though, LogLikelihood may already account
> > > for that, making these results even more surprising.
> > >
> > >
> >
>
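
P.S. The 90%/10% split quoted above is just the trainingPercentage argument
to the evaluator. A minimal sketch, assuming an item-based recommender over
LogLikelihoodSimilarity and a made-up ratings file name:

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class NinetyTenEvaluation {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv")); // your data file

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel dataModel) throws TasteException {
        return new GenericItemBasedRecommender(dataModel, new LogLikelihoodSimilarity(dataModel));
      }
    };

    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    // 0.9: train on 90% of each user's preferences, test on the remaining 10%;
    // 1.0: use all users in the evaluation
    double score = evaluator.evaluate(builder, null, model, 0.9, 1.0);
    System.out.println(score);
  }
}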
