lucene-java-user mailing list archives

From Saikat Kanjilal <sxk1...@hotmail.com>
Subject RE: Content based recommender using lucene/solr
Date Fri, 28 Jun 2013 18:07:00 GMT
You could build a custom recommender in Mahout to accomplish this. Also, just out of curiosity,
why the content-based approach as opposed to building a recommender based on co-occurrence?
One other thing: what is your data size? Are you looking at a scale where you need something
like Hadoop?

> From: lcguerrerocovo@gmail.com
> Date: Fri, 28 Jun 2013 13:02:00 -0500
> Subject: Re: Content based recommender using lucene/solr
> To: solr-user@lucene.apache.org
> CC: java-user@lucene.apache.org
> 
> Hey Saikat, thanks for your suggestion. I've looked into Mahout and other
> alternatives for computing k nearest neighbors. I would have to run a job
> to compute the k nearest neighbors and track them in the index for
> retrieval. I wanted to see if this was something I could do with Lucene,
> using Lucene's scoring function and Solr's MoreLikeThis component. The job
> you specifically mention is for item-based recommendation, which would
> require me to track the different items users have viewed. I'm looking for
> a content-based approach where I would use a distance measure to establish
> how near (how similar) items are, and have some kind of training phase to
> adjust weights.
> 
> 
> On Fri, Jun 28, 2013 at 12:42 PM, Saikat Kanjilal <sxk1969@hotmail.com> wrote:
> 
> > Why not just use Mahout to do this? There is an item similarity algorithm
> > in Mahout that does exactly this :)
> >
> >
> > https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/hadoop/similarity/item/ItemSimilarityJob.html
> >
> > You can use Mahout in distributed and non-distributed mode as well.
> >
> > > From: lcguerrerocovo@gmail.com
> > > Date: Fri, 28 Jun 2013 12:16:57 -0500
> > > Subject: Content based recommender using lucene/solr
> > > To: solr-user@lucene.apache.org; java-user@lucene.apache.org
> > >
> > > Hi,
> > >
> > > I'm using Lucene and Solr right now in a production environment with an
> > > index of about a million docs. I'm working on a recommender that would
> > > basically show the user the n items most similar to the item he is
> > > currently viewing.
> > >
> > > I've been thinking of using Solr/Lucene since I already have all docs
> > > available and I want a quick version that can be deployed while we work
> > > on a more robust recommender. How about overriding the default similarity
> > > so that it scores documents based on the Euclidean distance of normalized
> > > item attributes, and then using a MoreLikeThis component to pass in the
> > > attributes of the item for which I want to generate recommendations? I
> > > know it has its issues, like recomputing scores, normalization, and weight
> > > application at query time, which could make this idea infeasible or
> > > impractical. I'm at a very preliminary stage right now with this and
> > > would love some suggestions from experienced users.
> > >
> > > thank you,
> > >
> > > Luis Guerrero
> >
> >
> 
> 
> 
> -- 
> Luis Carlos Guerrero Covo
> M.S. Computer Engineering
> (57) 3183542047
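
For readers following the thread, the content-based idea described above (a distance over normalized item attributes, with per-attribute weights) can be sketched roughly as follows. This is only an illustrative sketch in plain Python, not Lucene, Solr, or Mahout code: the function names, the attribute names ("price", "rooms"), the sample values, and the weights are all made up, min-max scaling is just one possible normalization, and the training phase that would adjust the weights is not shown.

```python
import heapq
import math


def min_max_normalize(items):
    """Rescale every numeric attribute to [0, 1] across the whole catalog."""
    names = sorted({name for attrs in items.values() for name in attrs})
    lo = {n: min(attrs.get(n, 0.0) for attrs in items.values()) for n in names}
    hi = {n: max(attrs.get(n, 0.0) for attrs in items.values()) for n in names}
    normalized = {}
    for item_id, attrs in items.items():
        normalized[item_id] = {
            # Attributes with a single distinct value carry no information.
            n: (attrs.get(n, 0.0) - lo[n]) / (hi[n] - lo[n]) if hi[n] > lo[n] else 0.0
            for n in names
        }
    return normalized


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two normalized attribute dicts."""
    return math.sqrt(sum(weights.get(n, 1.0) * (a[n] - b[n]) ** 2 for n in a))


def most_similar(item_id, normalized, weights, n=10):
    """Return the ids of the n items closest to item_id (brute-force scan)."""
    target = normalized[item_id]
    candidates = (
        (weighted_distance(target, attrs, weights), other)
        for other, attrs in normalized.items()
        if other != item_id
    )
    return [other for _, other in heapq.nsmallest(n, candidates)]


# Hypothetical catalog: ids, attribute names, and values are invented.
catalog = {
    "a": {"price": 100.0, "rooms": 2.0},
    "b": {"price": 110.0, "rooms": 2.0},
    "c": {"price": 300.0, "rooms": 5.0},
}
normalized = min_max_normalize(catalog)
weights = {"price": 1.0, "rooms": 2.0}  # assumed values; the thread proposes learning these
print(most_similar("a", normalized, weights, n=2))
```

The scan compares the target against every other item, so on an index of about a million docs it would more plausibly run as an offline job whose results are stored for retrieval, as mentioned in the reply above, than at query time.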
 		 	   		  