mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Mix of Content Based and Collaborative Filtering
Date Mon, 05 Nov 2012 16:05:52 GMT
I think that payloads are a bad idea here.  My rationale is that you really
want to index these signals if at all possible.

Also, payloads (as of a while ago) were not accessed very efficiently.
This can massively slow down scoring.
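
To make "index these signals" concrete, here is a toy sketch in plain Python, not Solr or Lucene code. It assumes each item is a document whose fields hold the IDs of related items as ordinary indexed terms, one field per similarity source. The class name IndicatorIndex, the field names, and the boost values are all made up for illustration. A user's recent items become the query terms, and a per-field boost weights each source.

```python
from collections import defaultdict


class IndicatorIndex:
    """Toy inverted index: field -> indicator item ID -> set of item documents."""

    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, item_id, field, indicators):
        # Index each related-item ID as a plain term of the given field.
        for indicator in indicators:
            self.postings[field][indicator].add(item_id)

    def recommend(self, history, boosts, top_n=3):
        # Every history item is a query term; each match adds the field's boost.
        scores = defaultdict(float)
        for field, boost in boosts.items():
            for seen in history:
                for item_id in self.postings[field].get(seen, ()):
                    scores[item_id] += boost
        # Highest score first, ties broken by item ID for a stable order.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [(item, score) for item, score in ranked if item not in history][:top_n]


index = IndicatorIndex()
index.add("hotel_c", "viewed_with", ["hotel_a", "hotel_b"])
index.add("hotel_c", "similar_content", ["hotel_a"])
index.add("hotel_d", "viewed_with", ["hotel_a"])
index.add("hotel_e", "similar_content", ["hotel_b"])

recs = index.recommend(
    history={"hotel_a", "hotel_b"},
    boosts={"viewed_with": 2.0, "similar_content": 1.0},
)
print(recs)
```

Because the indicators are ordinary terms, scoring only walks posting lists, and changing a boost changes how much each source counts without reindexing.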


On Mon, Nov 5, 2012 at 7:01 AM, shubham srivastava <shubham.k@gmail.com> wrote:

> http://sujitpal.blogspot.in/2011/01/payloads-with-solr.html
>
> > On Fri, Nov 2, 2012 at 12:13 PM, Johannes Schulte <johannes.schulte@gmail.com> wrote:
>
> > Hi,
> >
> > > I can also encourage going the simple way with a Solr or Lucene index. It
> > > gives you almost unlimited possibilities when you want to include new
> > > "relevance signals" and, even more important, when you have business
> > > requirements like filtering.
> >
> > > I'm using a plain Lucene index to combine stuff. The pre-calculated
> > > item-to-item similarities are stored as payload fields so that they can be
> > > used in the scoring process. This way you can easily issue a query like
> > > "contains x and is similar to items a,b,c".
> >
> > > You can even boost different parts of the query to fade between the
> > > signals. The only question is how much you can achieve "by hand". You
> > > probably want to somehow learn which weights on the signals perform best.
> > > Maybe this blog article by Netflix is a good start:
> > >
> > > http://techblog.netflix.com/2012/06/netflix-recommendations-beyond-5-stars.html
> > >
> > Cheers,
> > Johannes
> >
> >
> > > On Fri, Nov 2, 2012 at 6:21 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >
> > > > Speaking with no principles in hand at all, I find that it is possible to
> > > > encode multiple item similarity matrices together in a Solr instance and
> > > > then do very nice coordinated recommendations from multiple sources of
> > > > information.
> > >
> > > > Abusing a text retrieval engine this way has only a vague basis in theory,
> > > > but it can be particularly nice from a practical point of view.
> > >
> > > On Thu, Nov 1, 2012 at 10:41 AM, Sean Owen <srowen@gmail.com> wrote:
> > >
> > > > > There is not a very direct way to do this in Mahout, but you can piece
> > > > > together a solution that reuses a lot of what Mahout has.
> > > >
> > > > > It sounds like you should look at this as an item-item similarity-based
> > > > > recommender to start. You have two sources of similarity. The first is
> > > > > based on interactions (no ratings); for this, you can use
> > > > > LogLikelihoodSimilarity and an existing DataModel. This much is
> > > > > straightforward.
> > > >
> > > > > You can also make an ItemSimilarity based on item properties. There is no
> > > > > pre-packaged solution for this. You can make up a similarity metric, or
> > > > > export some similarities based on, say, descriptions, maybe from Solr, yes.
> > > >
> > > > > Then you can combine them. There's no great principled answer. You could
> > > > > make an ItemSimilarity that just returns the product of these two
> > > > > similarity measures (assuming they are both >= 0).
> > > >
> > > > > And then the rest is a matter of using GenericItemBasedRecommender with
> > > > > your hybrid ItemSimilarity.
> > > >
> > > > This isn't a distributed solution but is a good start.
> > > >
> > > > Sean
> > > >
> > > >
> > > > > On Thu, Nov 1, 2012 at 5:33 PM, shubham srivastava <shubham.k@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > I am looking into designing and implementing a recommendation engine
> > > > > > with the use cases below. Users give no explicit ratings for the
> > > > > > items they access.
> > > > >
> > > > > > 1. Items viewed by other users who viewed this particular item
> > > > > >
> > > > > > 2. Items booked by other users who viewed this particular item
> > > > > >
> > > > > > 3. Most-viewed item(s) viewed by other users who viewed this
> > > > > > particular item
> > > > >
> > > > > > The idea behind it is the following:
> > > > > >
> > > > > > 1. I want to interpret user behavior, where recommendations would be
> > > > > > based on other users' patterns, which falls into the bracket of CF
> > > > > > (item-based or user-based similarities).
> > > > > >
> > > > > > 2. I want to exploit item-item similarity based on N attributes. The
> > > > > > attributes could be, say, price, location, features (1...n), and so
> > > > > > on.
> > > > >
> > > > > The recommendation should be a mix of both of the above.
> > > > >
> > > > > > A) For 1, given that I don't have explicit ratings, my initial thought
> > > > > > was to infer ratings from what the user does with a product, e.g.:
> > > > >
> > > > > > If the user only views it, I give a rating of 1
> > > > > > If the user further sees the details, I give a rating of 2
> > > > > > If the user goes to the booking page, I give a rating of 3
> > > > > > If the user books it, I give a rating of 4, etc.
> > > > >
> > > > > > Once I have these ratings, I would go for a standard CF item-item
> > > > > > similarity implemented through Mahout.
> > > > >
> > > > > > B) For 2, I was looking into our search framework (Solr) to provide
> > > > > > this, i.e. Solr's MoreLikeThis feature. Carrot also seems to improve
> > > > > > it, but I don't know how scalable that would be.
> > > > >
> > > > > > The idea is to take an intersection of A and B to get started with. I
> > > > > > also need to figure out the processing and latency side of getting
> > > > > > the results.
> > > > >
> > > > > > I guess the group's users must have solved a similar problem more
> > > > > > efficiently and could advise better.
> > > > > >
> > > > > > Please let me know.
> > > > >
> > > > > Regards,
> > > > > Shubham
> > > > >
> > > >
> > >
> >
>
