Date: Mon, 14 Nov 2011 10:40:30 +0100
Subject: How to determine result quality
From: Samuel García Martínez
To: java-user@lucene.apache.org

Hi list,

For the past few days I have been reading about score normalization in Lucene (which I now know can't be done) on this list, the wiki, blog posts, etc. I'm going to describe my problem, because I'm not sure that score normalization is what our project needs.

*Background*: In our project we are using Solr on top of Lucene with custom RequestHandlers and SearchComponents. For a given query, we need to detect when it returned poor results, so that we can trigger different actions.

*Assumptions*: Immutable index (once indexed, it is never updated) and the same type of query every time (dismax qparser with the same field boosting, without boost functions or boost queries).

*Problem*: We know that score normalization is not implementable. But is there any way to determine (given TF/IDF scoring and the fixed field boosts assumed above) when the match quality of the search results is poor?

*Example*: We have one index with science papers and another with information about medical care centres. When a user queries the first index and gets poor results (inferred from the score?), we want to query the second index and merge the results using some threshold (a score threshold?).

Thanks in advance

--
Regards,
Samuel García.