From: "Allan, Brad (Bracknell)"
To: "lucene-net-user@lucene.apache.org"
Subject: Getting fuzzy match information
Date: Thu, 12 Dec 2013 16:22:59 +0000

Has anyone done, or does anyone know of, work that would help me get detailed information about my hits with regard to fuzzy matches? I'm also very happy to receive suggestions :).

I'm looking to obtain the similarity percentage of each token in each hit.

Example: the fuzzy query looks something like this:

(name:80% similar to "john" or name:80% similar to "henry" or name:80% similar to "smith")

And I get these hits:

* Jon George Smythe
* John Joe Henry
* Smith John & Carter engineering

All are valid hits. However, my users want to be able to view the similarity, and indeed to prioritise certain actions by comparing the results of two different searches (so normalised scores are not as useful as knowing the actual similarity information).

Clearly this sort of ability does not make sense when searching large amounts of data (documents), but in my case I'm searching through a set of names and some additional person information.

One option could be to post-process the hits and use/lift the FuzzyTermEnum logic to re-compute the similarity value. Or perhaps extend FuzzyQuery to register a 'listener' to receive the information?

Other ideas? Thoughts?
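
To make the post-processing option concrete, here is a minimal sketch, written in Python purely for illustration rather than in C# against Lucene.Net. It assumes the similarity formula I recall from the classic FuzzyTermEnum with a prefix length of 0, namely 1 - editDistance / min(len(queryTerm), len(token)); please check that against the Lucene.Net version in use. The names edit_distance, fuzzy_similarity and best_matches, and the simple regex tokenizer, are hypothetical helpers and not part of any Lucene API.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_similarity(query_term: str, token: str) -> float:
    """Assumed formula: 1 - distance / length of the shorter string.

    The value can be negative for very dissimilar strings.
    """
    shortest = min(len(query_term), len(token))
    if shortest == 0:
        return 1.0 if query_term == token else 0.0
    return 1.0 - edit_distance(query_term, token) / shortest


def best_matches(query_terms, hit_text):
    """For each query term, return the hit token with the highest similarity.

    Tokenization here is lowercase plus word characters only, which is an
    assumption; a real index would use whatever analyzer built the field.
    """
    tokens = re.findall(r"\w+", hit_text.lower())
    result = {}
    for term in query_terms:
        scored = [(tok, fuzzy_similarity(term, tok)) for tok in tokens]
        if scored:
            result[term] = max(scored, key=lambda pair: pair[1])
    return result


if __name__ == "__main__":
    terms = ["john", "henry", "smith"]
    hits = ["Jon George Smythe", "John Joe Henry",
            "Smith John & Carter engineering"]
    for hit in hits:
        for term, (token, sim) in best_matches(terms, hit).items():
            print(f"{hit!r}: {term} ~ {token} = {sim:.0%}")
```

Because the similarity is computed per query term and per token, the values are independent of Lucene's score normalisation, so they can be compared across two different searches. The trade-off is that every hit is re-tokenized outside the index, which is only reasonable for a small set of short fields such as names.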