mahout-user mailing list archives

From Abhishek Roy <abhishekr...@gmail.com>
Subject Re: Custom Item Similarity :datamodel not sure
Date Thu, 27 Sep 2012 18:14:15 GMT
Sean Owen <srowen <at> gmail.com> writes:


> 
> The input to the recommender remains the same -- user,item,rating.
> Your similarities are used as weights in a weighted average to make
> recommendations. This is unrelated -- or rather, not necessarily
> related at all -- to whatever custom similarity metric you create.
> Your similarities do not need to be precomputed. You could, but it's
> not necessary.
> 
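
The weighted average described above can be sketched in plain Java as follows. This is only an illustration under assumptions: the Similarity interface and the estimate method are hypothetical names, not Mahout's API, and dividing by the sum of absolute weights is one common convention rather than a statement of what GenericItemBasedRecommender does internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WeightedAverageSketch {

    /** Hypothetical stand-in for a custom item-item similarity (not Mahout's interface). */
    interface Similarity {
        double between(long itemA, long itemB);
    }

    /**
     * Estimates a user's preference for targetItem as the similarity-weighted
     * average of the ratings that user has already given to other items.
     * Returns NaN when there is nothing to average over.
     */
    static double estimate(long targetItem, Map<Long, Double> userRatings, Similarity sim) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (Map.Entry<Long, Double> rated : userRatings.entrySet()) {
            double weight = sim.between(targetItem, rated.getKey());
            weightedSum += weight * rated.getValue();
            totalWeight += Math.abs(weight);
        }
        return totalWeight == 0.0 ? Double.NaN : weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // The user,item,rating input: this user rated item 1 as 5.0 and item 2 as 1.0.
        Map<Long, Double> ratings = new LinkedHashMap<>();
        ratings.put(1L, 5.0);
        ratings.put(2L, 1.0);

        // Toy similarity (assumed values): item 3 is close to item 1, far from item 2.
        Similarity sim = (a, b) -> (b == 1L) ? 0.75 : 0.25;

        // (0.75 * 5.0 + 0.25 * 1.0) / (0.75 + 0.25)
        System.out.println(estimate(3L, ratings, sim)); // prints 4.0
    }
}
```

The point of the sketch is that the ratings and the similarity function are separate inputs: the similarity only supplies the weights.
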
> On Thu, Sep 27, 2012 at 6:48 PM, Abhishek Roy <abhishekroy8 <at> gmail.com> wrote:
> > Sean Owen <srowen <at> gmail.com> writes:
> >
> >>
> >> File for FileDataModel? This does not change. But that input does not
> >> consist of item item pairs. Are you talking about something else?
> >> On Sep 27, 2012 5:10 PM, "Abhishek Roy" <abhishekroy8 <at> gmail.com> wrote:
> >>
> >> > Hi Sean,
> >> > For using a custom ItemSimilarity, what should my data model file (item id1,
> >> > item id2) include?
> >> >
> >> > Please advise.
> >> >
> >> > Thanks,
> >> > Roy
> >> >
> >> >
> >>
> >
> > Thanks for the quick response, Sean.
> > My end goal (short term) is to show "related / similar" items on my site
> > when the user (any user, including an unregistered user) is looking at a
> > particular item. I have created a custom ItemSimilarity that uses
> > domain-specific attributes to compute a similarity score between a pair of
> > items. I am using a GenericItemBasedRecommender and then calling
> > mostSimilarItems() to get my n recommendations. The problem is that I am not
> > sure which data model to feed to the GenericItemBasedRecommender; I didn't
> > see anything on this in the book or on the forum. As a brute-force approach,
> > I computed all nC2 combinations of {item id, item id} pairs and fed that as
> > the data model. It works, but it is definitely not scalable or sensible.
> > What data model does this kind of system need? I have very little
> > preference data, and since this is content-based recommendation, I am
> > puzzled about what data the data model should encapsulate. I hope I am clear.
> > Please suggest.
> >
> >
> >
> 
> 

Sean, let me clarify. I am trying to recommend items "similar" to a particular
item, 100% content-based. I don't have user,item,rating data: there is no user
angle and no "preference" angle at all.

All I have is the set of all items in the system and their attributes (genre,
title, description, etc.). I have read that the user,item,rating data can
instead be item,item data, with the rating/preference absent; hence the
confusion. So, in this case, what data do I give as input? Do I compute
item,item entries based on some criterion? What is the least data I can give
the system as input to get my n most similar items based on my custom
ItemSimilarity?
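
To make the question concrete, here is a minimal plain-Java sketch of the lookup I am after. It deliberately does not use Mahout's API: the class and method names are hypothetical, the four-item catalog is made-up sample data, and Jaccard overlap on attribute tags is only an assumed stand-in for my custom similarity. The only input is a map from item id to attributes, with no user or rating data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ContentSimilaritySketch {

    /** Jaccard overlap of two attribute sets: |A intersect B| / |A union B|. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    /** Returns the ids of the n items most similar to itemId, best first. */
    static List<Long> mostSimilar(long itemId, int n, Map<Long, Set<String>> catalog) {
        Set<String> target = catalog.get(itemId);
        if (target == null) {
            return new ArrayList<>(); // unknown item: nothing to compare against
        }
        List<Long> candidates = new ArrayList<>();
        for (Long other : catalog.keySet()) {
            if (other != itemId) {
                candidates.add(other);
            }
        }
        // Sort by similarity to the target item, highest score first.
        candidates.sort(Comparator
                .comparingDouble((Long id) -> jaccard(target, catalog.get(id)))
                .reversed());
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }

    public static void main(String[] args) {
        // Made-up catalog: item id -> attribute tags (no users, no ratings).
        Map<Long, Set<String>> catalog = new LinkedHashMap<>();
        catalog.put(1L, new HashSet<>(Arrays.asList("drama", "crime", "1970s")));
        catalog.put(2L, new HashSet<>(Arrays.asList("drama", "crime", "1990s")));
        catalog.put(3L, new HashSet<>(Arrays.asList("comedy", "romance")));
        catalog.put(4L, new HashSet<>(Arrays.asList("crime", "thriller", "1990s")));

        // Similarities to item 1: item 2 = 0.5, item 4 = 0.2, item 3 = 0.0
        System.out.println(mostSimilar(1L, 2, catalog)); // prints [2, 4]
    }
}
```

This scores every other item on each call, so it costs one pass over the catalog per lookup and never needs the nC2 pairs to be materialized up front.
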



