From: "Uwe Schindler"
Subject: RE: Is it possible to normalise BM25 scores in the query level?
Date: Tue, 27 Jun 2017 16:39:25 +0200

Hi,

Once you have executed the query, the TopDocs collector gives you the maximum score. Then you just need to normalize on your own.

But keep in mind: this is not always a good idea, because the maximum score (the score of the first document) is not necessarily a meaningful reference to compare against. E.g. if the first, top-ranking result is a very bad match and there are no better ones, there is no reason to call it a 100% hit!

BTW: Lucene up to version 6 had some internal normalization in place, but this was removed in Lucene 7. The reason is simple: the scores are calculated only so they can be compared within the same result set. They were never meant to be used across different indexes or queries.

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Rifat [mailto:rifatozcan1981@gmail.com]
> Sent: Tuesday, June 27, 2017 11:46 AM
> To: java-user@lucene.apache.org
> Subject: Is it possible to normalise BM25 scores in the query level?
>
> Hi all,
>
> I searched for this a lot but could not find a clear answer yet. Is there a
> way for Lucene (or Elasticsearch) to provide query-level normalization
> of BM25 scores? BM25 scores vary considerably across queries. For
> example, is it possible to get scores normalised by the max score for that
> query? Since Lucene processes docs one at a time and returns the score for
> each document at that moment, it does not seem easy to do the normalisation.
>
> thanks,
> rifat
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Is-it-possible-to-normalise-BM25-scores-in-the-query-level-tp4342991.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
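
A minimal sketch of the "normalize on your own" step described in the reply, not code from the thread. It assumes the scores for one query have already been copied out of the search result (in Lucene they would come from TopDocs.scoreDocs[i].score); the class and method names (ScoreNormalizer, normalizeByMax) are hypothetical and not part of Lucene, and the sample values are made up.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: rescales one query's scores
// by that query's own top score.
final class ScoreNormalizer {

    // Returns a new array with each score divided by the largest score.
    // Scores are assumed to be non-negative, as BM25 scores are.
    static float[] normalizeByMax(float[] scores) {
        float max = 0f;
        for (float s : scores) {
            max = Math.max(max, s);
        }
        float[] normalized = new float[scores.length];
        if (max <= 0f) {
            return normalized; // no hits, or nothing scored above zero: all zeros
        }
        for (int i = 0; i < scores.length; i++) {
            normalized[i] = scores[i] / max;
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Stand-in values; in Lucene these would be read from TopDocs.scoreDocs.
        float[] scores = {8.0f, 4.0f, 2.0f};
        System.out.println(Arrays.toString(normalizeByMax(scores)));
    }
}
```

Note that the top hit always comes out as 1.0 with this scheme, however weak the match is, which is exactly the caveat raised in the reply.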