From: Saikat Kanjilal
To: "solr-user@lucene.apache.org", "java-user@lucene.apache.org"
Subject: RE: Content based recommender using lucene/solr
Date: Fri, 28 Jun 2013 10:42:32 -0700

Why not just use mahout to do this? There is an item similarity algorithm in mahout that does exactly this :)
https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/hadoop/similarity/item/ItemSimilarityJob.html

You can use mahout in distributed and non-distributed mode as well.

> From: lcguerrerocovo@gmail.com
> Date: Fri, 28 Jun 2013 12:16:57 -0500
> Subject: Content based recommender using lucene/solr
> To: solr-user@lucene.apache.org; java-user@lucene.apache.org
>
> Hi,
>
> I'm using lucene and solr right now in a production environment with an
> index of about a million docs. I'm working on a recommender that basically
> would list the n most similar items to the user based on the current item
> he is viewing.
>
> I've been thinking of using solr/lucene since I already have all docs
> available and I want a quick version that can be deployed while we work on
> a more robust recommender. How about overriding the default similarity so
> that it scores documents based on the euclidean distance of normalized item
> attributes and then using a morelikethis component to pass in the
> attributes of the item for which I want to generate recommendations? I know
> it has its issues like recomputing scores/normalization/weight application
> at query time which could make this idea unfeasible/impractical. I'm at a
> very preliminary stage right now with this and would love some suggestions
> from experienced users.
>
> thank you,
>
> Luis Guerrero
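
For reference, the scoring idea in the quoted message (Euclidean distance over normalized item attributes, then the n closest items) can be sketched in plain Java. This is only a rough illustration under stated assumptions: it uses no Lucene, Solr, or Mahout API, the class and method names (ItemSimilaritySketch, normalize, distance, mostSimilar) are made up, the attributes are assumed to be numeric, min-max scaling is just one possible normalization, and the sample data is invented.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

// Hypothetical sketch: brute-force "n most similar items" by Euclidean distance
// over min-max normalized numeric attributes. Not Lucene, Solr, or Mahout API.
final class ItemSimilaritySketch {

    // Rescale every attribute (column) to [0, 1] so no single attribute dominates.
    static double[][] normalize(double[][] items) {
        int dims = items[0].length;
        double[] min = new double[dims];
        double[] max = new double[dims];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] item : items) {
            for (int d = 0; d < dims; d++) {
                min[d] = Math.min(min[d], item[d]);
                max[d] = Math.max(max[d], item[d]);
            }
        }
        double[][] out = new double[items.length][dims];
        for (int i = 0; i < items.length; i++) {
            for (int d = 0; d < dims; d++) {
                double range = max[d] - min[d];
                // A constant attribute carries no information; map it to 0.
                out[i][d] = range == 0 ? 0 : (items[i][d] - min[d]) / range;
            }
        }
        return out;
    }

    // Plain Euclidean distance between two attribute vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Indices of the n items closest to items[current], nearest first,
    // excluding the current item itself.
    static int[] mostSimilar(double[][] items, int current, int n) {
        double[][] norm = normalize(items);
        return IntStream.range(0, items.length)
                .filter(i -> i != current)
                .boxed()
                .sorted(Comparator.comparingDouble(i -> distance(norm[current], norm[i])))
                .limit(n)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        // Invented attributes per item: {price, rating}
        double[][] items = { {10, 4.5}, {12, 4.4}, {100, 1.0}, {11, 2.0} };
        System.out.println(Arrays.toString(mostSimilar(items, 0, 2))); // prints [1, 3]
    }
}
```

Note that this scans and re-normalizes every item on each call, which is exactly the query-time cost the quoted message worries about. With an index of about a million docs, the normalized vectors would presumably be computed once up front, or the pairwise similarities precomputed offline.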