mahout-user mailing list archives

From Paul Loy <ketera...@gmail.com>
Subject Re: Recommendations from flat data
Date Tue, 28 Apr 2009 00:45:06 GMT
Hi Sean,

Thanks for the tips. I'll give it a go tomorrow.

Paul.

On Mon, Apr 27, 2009 at 10:17 PM, Sean Owen <srowen@gmail.com> wrote:

> Yeah, the problem here is that all the ratings are '1', so a
> correlation-based similarity metric like Pearson will return "NaN"
> for the similarity between all users: with no variance in the
> ratings, the correlation is undefined.
>
> You want to take advantage of the situation by using the bits of code
> that assume you are in this situation, where all the ratings are the
> same (for example all 1) or don't matter. Support for this mode is
> still evolving, but basically you want to:
>
> - Use BooleanTanimotoCoefficientSimilarity instead of Pearson.
> - Omit the ",1" in the data file -- in fact you need to do so to get
> this to work.
> - Separately, I would generally discourage people from trying
> PreferenceInferrer unless you know you need or want it; I don't really
> like this technique. In fact, the similarity implementation above
> won't support it, so just remove that line.
>
> If any problems come up, write back; I might have missed a detail there.
>
> 2009/4/27 Paul Loy <keteracel@gmail.com>:
> > Hi,
> >
> > I want to create recommendations for my customers based on boolean data.
> > Essentially, whether they bought a product.
> >
> > So this will create a csv containing:
> >
> > acctId, itemId, 1
> >
> > There is an entry in the CSV for each sale. So all entries will have a
> > 'rating' of 1. Using the following example:
> >
> >        DataModel model = new FileDataModel(new File("data.txt"));
> >
> >        PearsonCorrelationSimilarity userSimilarity =
> >            new PearsonCorrelationSimilarity(model);
> >        userSimilarity.setPreferenceInferrer(
> >            new AveragingPreferenceInferrer(model));
> >
> >        UserNeighborhood neighborhood =
> >            new NearestNUserNeighborhood(1, userSimilarity, model);
> >
> >        Recommender recommender =
> >            new GenericUserBasedRecommender(model, neighborhood, userSimilarity);
> >        Recommender cachingRecommender = new CachingRecommender(recommender);
> >
> >        List<RecommendedItem> recommendations =
> >            cachingRecommender.recommend("1967128", 10);
> >
> >        for (RecommendedItem item : recommendations) {
> >            System.out.println(item);
> >        }
> >
> > I get 0 recommendations even when I have seeded the file with obvious
> > correlations. I'm guessing this is because all 'ratings' are 1. Is there
> > any way to infer that all other items have a rating of 0, thus giving the
> > algorithms something to correlate?
> >
> > Thanks,
> >
> > Paul
> >
> >
> >
> > --
> > ---------------------------------------------
> > Paul Loy
> > paul@keteracel.com
> > http://www.keteracel.com/paul
> >
>



-- 
---------------------------------------------
Paul Loy
paul@keteracel.com
http://www.keteracel.com/paul
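
To illustrate the point made in the thread, here is a minimal plain-Java sketch of why a Pearson correlation over all-'1' ratings comes out as NaN, and how a Tanimoto (Jaccard) coefficient over sets of purchased items still gives a usable number. This is not Mahout code: the class name BooleanSimilaritySketch, the method names, and the sample item IDs are all hypothetical, and the choice to return 0.0 for two empty sets is an assumption of this sketch rather than anything stated in the thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Mahout.
class BooleanSimilaritySketch {

    // Plain Pearson correlation between two equal-length rating vectors.
    // When either vector is constant its variance is 0, so the result
    // is 0.0 / 0.0, which is NaN in Java floating-point arithmetic.
    public static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    // Tanimoto coefficient: |A intersect B| / |A union B|.
    // Only the presence of an item matters, so no rating value is needed.
    public static double tanimoto(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // assumption of this sketch: no items means no similarity
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Two users who both "rated" the same three items with 1.
        double[] userA = {1.0, 1.0, 1.0};
        double[] userB = {1.0, 1.0, 1.0};
        System.out.println("Pearson:  " + pearson(userA, userB));

        // The same kind of data expressed as sets of purchased item IDs.
        Set<String> boughtByA = new HashSet<>(Arrays.asList("i1", "i2", "i3"));
        Set<String> boughtByB = new HashSet<>(Arrays.asList("i2", "i3", "i4"));
        System.out.println("Tanimoto: " + tanimoto(boughtByA, boughtByB));
    }
}
```

The set-based form also shows why the ",1" column carries no information in this mode: the similarity depends only on which items each account has bought.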
