mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Cooccurrence to align different categorization systems (many to many occurrence)
Date Mon, 19 Jul 2010 17:32:18 GMT
Yeah that's fine. You could do this too. You're not actually making
recommendations, just computing most similar items instead of most
similar users, so lots of stuff works here.

On Mon, Jul 19, 2010 at 2:55 PM, Chantal Ackermann
<chantal.ackermann@btelligent.de> wrote:
> Hi,
>
> mainly for the record:
>
> I've now mapped my items onto what in Mahout is called "User", and
> mapped the categories onto Mahout "Items", instead of mapping my items
> onto "Item" and the categories onto "User".
>
> I changed the plan because that way, it was easier to create the
> GenericBooleanPrefDataModel from my input. I actually think that it fits
> better that way - what's your opinion?
>
> The input to the data model looks a bit like this (I've shortened it for
> the sake of readability):
> [id=15901,title=Infamous] CAT1={3=Drama}
> [id=15888,title=Millions] CAT1={3=Drama, 4=Crime, 8=Thriller}
> [id=16421,title=The Departed] CAT1={3=Drama, 8=Thriller}
>
> NOTE that the data from the second category system is MISSING!
> (I have not yet accumulated all the data, but while waiting for it I am
> preparing the code to process the similarities.) It would come as an
> additional list per item:
> CAT2={<id>=<value> ...}
> Where id is in a distinctly different range from the ids used for CAT1.
>
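
A minimal sketch of the mapping described above, using plain Java collections in place of Mahout's FastByIDMap and FastIDSet so that it runs without Mahout on the classpath. The class PrefsSketch and its methods are hypothetical names; only the ids and titles come from the sample input. Each catalogue item plays the role of a Mahout "user" and each category id the role of a Mahout "item".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for FastByIDMap<FastIDSet>: item id -> category ids.
class PrefsSketch {
    private final Map<Long, Set<Long>> prefs = new LinkedHashMap<>();

    // Record that an item (a Mahout "user") carries these category ids (Mahout "items").
    void addItem(long itemId, long... categoryIds) {
        Set<Long> cats = prefs.computeIfAbsent(itemId, k -> new TreeSet<>());
        for (long c : categoryIds) {
            cats.add(c);
        }
    }

    // Number of items that carry both categories, i.e. their co-occurrence count.
    int cooccurrence(long catA, long catB) {
        int n = 0;
        for (Set<Long> cats : prefs.values()) {
            if (cats.contains(catA) && cats.contains(catB)) {
                n++;
            }
        }
        return n;
    }

    // The three sample rows from the message (CAT1 only, CAT2 is still missing).
    static PrefsSketch sample() {
        PrefsSketch p = new PrefsSketch();
        p.addItem(15901L, 3L);          // Infamous: Drama
        p.addItem(15888L, 3L, 4L, 8L);  // Millions: Drama, Crime, Thriller
        p.addItem(16421L, 3L, 8L);      // The Departed: Drama, Thriller
        return p;
    }

    public static void main(String[] args) {
        // Drama (3) and Thriller (8) appear together on two of the three items.
        System.out.println(sample().cooccurrence(3L, 8L));
    }
}
```

Once CAT2 arrives, its ids would simply be added to the same per-item set, which is what lets a CAT1 category and a CAT2 category co-occur.
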
> I am using the code from Grant Ingersoll's article:
>
> // prefs is a FastByIDMap<FastIDSet>:
> //   key   := my item id (a Mahout "user")
> //   value := FastIDSet of that item's CAT1 (and CAT2) ids (Mahout "items")
> DataModel dataModel = new GenericBooleanPrefDataModel(prefs);
> ItemSimilarity itemSimilarity = new LogLikelihoodSimilarity(dataModel);
> ItemBasedRecommender recommender =
>        new GenericItemBasedRecommender(dataModel, itemSimilarity);
> // Get the most similar categories for each CAT1 category.
> // cat1Ids (placeholder name): the collection of CAT1 category ids
> for (long id : cat1Ids) {
>        List<RecommendedItem> simItems =
>               recommender.mostSimilarItems(id, numRecs);
>        // filter out CAT1, keep only CAT2
> }
>
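
The "filter out CAT1, keep only CAT2" step could look roughly like the sketch below. It relies on the statement that CAT2 ids live in a distinctly different range from CAT1 ids; the concrete boundary (cat2MinId) and the class name Cat2Filter are assumptions for illustration, and the ids are handled as plain longs so the snippet runs without Mahout.

```java
import java.util.ArrayList;
import java.util.List;

class Cat2Filter {
    // Assumption: every CAT2 id is >= cat2MinId and every CAT1 id is below it.
    static List<Long> keepCat2(List<Long> similarIds, long cat2MinId) {
        List<Long> kept = new ArrayList<>();
        for (long id : similarIds) {
            if (id >= cat2MinId) {
                kept.add(id);   // keeps the input order, i.e. most similar first
            }
        }
        return kept;
    }
}
```

With Mahout, the ids would come from RecommendedItem.getItemID() on each entry of simItems. Because the filter runs after the similarity lookup, it can leave fewer than numRecs entries, so it may be worth asking for more results than are finally needed.
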
> I've run the code, but as CAT2 is currently missing, I am not filtering
> the results yet. It seems fine, from what I can tell.
>
> Thanks again for your help!
> Chantal
>
>
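
For readers wondering what a log-likelihood similarity between two categories measures: the sketch below computes the standard log-likelihood ratio (G^2) statistic on a 2x2 co-occurrence table. It is an illustration of the statistic under stated assumptions, not a copy of Mahout's LogLikelihoodSimilarity; the class and method names are hypothetical.

```java
class LlrSketch {
    // x * ln(x), with the usual convention that 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalised entropy of a list of counts: N*ln(N) - sum(k*ln(k)).
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long k : counts) {
            sum += k;
            parts += xLogX(k);
        }
        return xLogX(sum) - parts;
    }

    // k11: items with both categories, k12: only A, k21: only B, k22: neither.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point rounding.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matEntropy));
    }
}
```

A table in which the two categories occur independently scores (close to) zero, while categories that always occur together score high. On the three sample rows, Drama against Thriller gives k11=2, k12=1, k21=0, k22=0 and a score of 0, since Drama is attached to every item and so carries no information about Thriller.
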
