From: Nitin Shiralkar
To: "lucene-net-user@incubator.apache.org"
Date: Thu, 6 May 2010 23:00:56 +0530
Subject: Lucene search relevancy: Optimization

Hi,

We are using Lucene in a knowledge management system for the legal domain. We index a lot of legal documents into Lucene as free text and also index some metadata (extracted from each document) into separate fields. Now we want to optimize relevancy for the following cases:

1. Searching with a free-text query:
We want to provide a Google-like simple search interface accepting a free-text query. However, to achieve better relevancy, we want to map that query against metadata fields. For example, if the user searches for a "California Merger Agreement for telecom" document, then we want to internally search for "California" against the State metadata field, "Merger agreement" against the document type field, and also the complete text against the full-text index. What would be the best way to do that?

2. Returning high-rated documents on top:
We have some high-rated documents in the system, and we store this high-value field in the index. For any type of search, we want high-value documents to appear on top if they satisfy the search criteria. One way we are considering is to sort on the high-value field to get those on top. Is there any other way, such as boosting?

I know these are pretty specific application questions, but any guidance would be appreciated.

Thanks in advance!

Nitin
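
To make the two cases concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It rewrites the free-text query into a Lucene query-parser string: the full text is a required clause, and any recognized state or document type becomes an optional boosted clause on its metadata field, along with an optional boosted clause on the high-value flag. The field names (text, state, doctype, highvalue), the vocabularies, and the boost values are all assumptions made for illustration. It is written in Python only to show the logic; the resulting string would be handed to the Lucene query parser.

```python
# Hypothetical vocabularies; a real system would load these from its own metadata.
STATES = {"california", "texas", "new york"}
DOC_TYPES = {"merger agreement", "lease agreement", "nda"}
STOPWORDS = {"for", "the", "a", "an", "of"}


def find_phrases(query, vocabulary):
    """Return vocabulary entries that occur as whole-word phrases in the query."""
    padded = " " + " ".join(query.lower().split()) + " "
    return sorted(p for p in vocabulary if " " + p + " " in padded)


def build_query(query, field_boost=3.0, high_value_boost=5.0):
    """Build a Lucene query-parser string from a free-text query.

    Field names and boost values are assumptions, not taken from any real index.
    Query-parser special characters in user input are not escaped in this sketch.
    """
    words = [w for w in query.lower().split() if w not in STOPWORDS]
    if not words:
        raise ValueError("query contains no searchable words")

    # Required clause: the document must match the full-text field.
    clauses = ["+text:(" + " ".join(words) + ")"]

    # Optional clauses: metadata matches raise the score but are not required.
    for state in find_phrases(query, STATES):
        clauses.append('state:"%s"^%s' % (state, field_boost))
    for doc_type in find_phrases(query, DOC_TYPES):
        clauses.append('doctype:"%s"^%s' % (doc_type, field_boost))

    # Optional clause: high-value documents score higher when they also match.
    clauses.append("highvalue:true^%s" % high_value_boost)
    return " ".join(clauses)


print(build_query("California Merger Agreement for telecom"))
```

For the example query this prints +text:(california merger agreement telecom) state:"california"^3.0 doctype:"merger agreement"^3.0 highvalue:true^5.0. Note that a boost only raises the score of high-value documents; it does not guarantee they come first the way sorting on the high-value field does, so the boost value would need tuning against real data.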