mahout-user mailing list archives

From "Mike Nute" <mike.n...@gmail.com>
Subject Re: Implicit feedback with varying significance
Date Thu, 17 Mar 2011 15:35:08 GMT
I could be missing some context here, but can you calculate the recommendation based first on
purchase feedback, then on view feedback, then on the grand mean? In my world
(insurance, for now) we refer to this as credibility, but it's really just the linear approximation
to the Bayesian posterior. You'd essentially figure out some Z in [0,1] related to the number
of purchase-related feedback items (typically this is n/(n+k), where n is the # of items and
k is some constant), then use Z*(purely purchase-based recommendation) + (1-Z)*(something
else). In this case the something else could itself be a repeat of this, where the next Z
(call it Z') is calculated from the # of view-based feedback items, and the new something
else could be e.g. the grand mean.

So you'd have:
Z = n/(n+k)
Z' = m/(m+j)
where n = # of purchases, m = # of views, and k, j are constants chosen in advance.
Your recommender score for the next item then becomes:
Z*(purchase-driven score) + (1-Z)*[Z'*(view-driven score) + (1-Z')*(grand mean)]
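
The nested weighting above can be sketched in a few lines of Python. This is only an illustration, not Mahout code: the function names, the parameter names, and the default constants k=10 and j=50 are assumptions made up for the example, and the three input scores are assumed to come from whatever recommenders you already run.

```python
def credibility(count, constant):
    """Credibility weight Z = n / (n + k); 0 with no data, approaching 1 as n grows."""
    if count < 0 or constant <= 0:
        raise ValueError("count must be >= 0 and constant must be > 0")
    return count / (count + constant)


def blended_score(purchase_score, view_score, grand_mean,
                  n_purchases, n_views, k=10.0, j=50.0):
    """Z*(purchase score) + (1-Z)*[Z'*(view score) + (1-Z')*(grand mean)].

    k and j are hypothetical defaults; in practice they are constants
    chosen in advance for the data at hand.
    """
    z = credibility(n_purchases, k)      # weight on the purchase-driven score
    z_prime = credibility(n_views, j)    # weight on the view-driven score
    # Inner blend: view-driven score shrunk toward the grand mean.
    fallback = z_prime * view_score + (1.0 - z_prime) * grand_mean
    # Outer blend: purchase-driven score shrunk toward the inner blend.
    return z * purchase_score + (1.0 - z) * fallback


# A brand-new item with no feedback at all falls back to the grand mean.
cold_start = blended_score(0.8, 0.4, 0.2, n_purchases=0, n_views=0)

# With n = k and m = j both weights are 0.5, so the result is about 0.55.
warmed_up = blended_score(0.8, 0.4, 0.2, n_purchases=10, n_views=50)
```

With no feedback both weights are zero and the score is exactly the grand mean; as purchases accumulate, Z approaches 1 and the view-driven score matters less and less, regardless of how many views there are.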

Does that make sense or help? There's more theory in there but that's the basics of how I'd
approach it.

Mike
-----Original Message-----
From: Sebastian Schelter <ssc@apache.org>
Date: Thu, 17 Mar 2011 11:28:54 
To: <user@mahout.apache.org>
Reply-To: user@mahout.apache.org
Subject: Implicit feedback with varying significance

Hi,

I have some questions about how to handle implicit feedback with varying 
significance in an e-commerce environment.

Say I have an online shop and I track views and purchases of products.

Tracked views are a double-edged sword. On the one hand, they are 
very useful because you get a lot of them quickly and can use them to 
tackle the cold-start problem (you should already have enough data to 
find similar items from view data the day after the product was put 
online). On the other hand, the most co-viewed items will be from the 
same category and will narrow the similar items to that 
category. This might become worse over time, as recommenders tend to 
amplify themselves.

After some time we should have purchase data for those new items, which 
is expected to have a higher significance because it involves a higher 
engagement of the users. I'd like to give these signals a much greater 
weight to broaden the recommendations, especially to find interesting 
cross-category similarities. My worry is that they are currently 
"overruled" by the sheer amount of view data points, and I found that 
simple procedures, like applying business-specific rules to filter the 
similar items to only include cross-category pairs, don't really help 
with that problem.

Does anybody have an idea (or better, something learned from experience) 
of how to solve that problem?

A simple approach I have in mind would be to separately handle similar 
items based on views and similar items based on purchases.

--sebastian