From: "Uwe Schindler"
Subject: RE: Does Fuzzy Search scores the same as Exact Match
Date: Sat, 28 Jan 2012 11:31:26 +0100

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Paul Taylor [mailto:paul_t100@fastmail.fm]
> Sent: Saturday, January 28, 2012 11:01 AM
> Cc: java-user@lucene.apache.org
> Subject: Re: Does Fuzzy Search scores the same as Exact Match
>
> On 28/01/2012 09:36, Uwe Schindler wrote:
> > Hi,
> >
> >> -----Original Message-----
> >> From: Paul Taylor [mailto:paul_t100@fastmail.fm]
> >> Sent: Saturday, January 28, 2012 10:33 AM
> >> To: 'java-user@lucene.apache.org'
> >> Subject: Does Fuzzy Search scores the same as Exact Match
> >>
> >> All things being equal does a fuzzy match give the same score as an
> >> exact match.
> >> i.e if I do a search for farmin and it matches two docs one on term farmin, the
> >> other on term farming, will it score farming higher or score both the
> >> same ?
> >
> > YES, depends on the Fuzzy configuration (rewrite method,...), but the
> > default does so!
> >
> > Uwe
> >
> So how do I change it, seems like a funny default to have.

Maybe I was not clear, it should score "farming" higher than "farmin" by default, but the default rewrite mode also takes TF/IDF into account (in addition). You can change that by using a different rewrite method:

The default is: http://goo.gl/JhHOA (which combines the standard vector model with additionally boosting exact matches - we have that for backwards compatibility only, it's not what most users expect)

The better one is: http://goo.gl/0eJ47, which does not take TF/IDF into account and only boosts by Levenshtein distance.

You can also disable fuzzy boosting altogether: http://goo.gl/VWlkW provides two other scoring models (TF/IDF only with no boosting, or constant score).

Uwe
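To make "boosts only by Levenshtein distance" concrete, here is a minimal self-contained sketch in plain Java, with no Lucene dependency. The class name `FuzzyBoostSketch`, the methods `editDistance` and `boost`, and the `minSimilarity` value of 0.5 are assumptions chosen for illustration. The scaling formula is only an approximation of the classic fuzzy similarity and is not copied from the Lucene source, so treat the numbers as indicative.

```java
// Hypothetical illustration only: not Lucene code and not the Lucene API.
final class FuzzyBoostSketch {

    // Standard dynamic-programming Levenshtein distance
    // (insertions, deletions and substitutions each cost 1).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed boost model: similarity = 1 - distance / shorterLength,
    // rescaled so that minSimilarity maps to 0 and an exact match maps to 1.
    // minSimilarity must be below 1.0.
    static double boost(String queryTerm, String indexTerm, double minSimilarity) {
        int shorter = Math.min(queryTerm.length(), indexTerm.length());
        if (shorter == 0) {
            return queryTerm.equals(indexTerm) ? 1.0 : 0.0;
        }
        double similarity = 1.0 - (double) editDistance(queryTerm, indexTerm) / shorter;
        if (similarity < minSimilarity) {
            return 0.0; // too far away: treated here as "no match"
        }
        return (similarity - minSimilarity) / (1.0 - minSimilarity);
    }

    public static void main(String[] args) {
        String query = "farmin";
        for (String term : new String[] {"farmin", "farming"}) {
            System.out.println(term + " -> boost " + boost(query, term, 0.5));
        }
    }
}
```

With the query term `farmin`, the exact term gets the maximum boost and `farming` (one edit away) gets a smaller one. Under a boost-only rewrite that boost is the whole score contribution; under the default mode described above, TF/IDF is applied in addition, so the final ordering of the two documents can differ.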