From: hariram ravichandran
Date: Wed, 30 Nov 2016 17:59:13 +0530
Subject: Re: Query expansion
To: Michael McCandless
Cc: Lucene Users
Reply-To: java-user@lucene.apache.org

I am overriding getFieldQuery(String field, String fieldText, boolean
quoted). For a phrase query, getFieldQuery(String field, String queryText,
int slop) is called instead. Prefix queries are not part of my use case, so
we can ignore them.

Assume this is my only case: the input is a sequence of words (apple orange
mango), and I need results for (apple~ orange~ mango~). I set the default
operator to AND (parser.setDefaultOperator(QueryParser.Operator.AND)) to get
better relevance in the results.

This approach works as I expected. Are there any drawbacks to using it? And
is there a better way to expand a query like this?

On Wed, Nov 30, 2016 at 4:37 AM, Michael McCandless
<lucene@mikemccandless.com> wrote:

> This is likely tricky to do correctly.
>
> E.g., MultiFieldQueryParser.getFieldQuery is invoked on whole chunks
> of text. If you search for:
>
>   apple orange
>
> I suspect it won't do what you want, since the whole string "apple
> orange" is passed to getFieldQuery.
>
> How do you want to handle e.g. a phrase query (user types "apple
> orange", with the double quotes)? Or a prefix query (app*)?
>
> Maybe you could instead override newTermQuery? In the example above
> it would be invoked twice, once for apple and once for orange.
>
> Finally, all this being said, making everything fuzzy is likely a big
> performance hit and often poor results (massive recall, poor
> precision) to the user!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Mon, Nov 28, 2016 at 6:24 AM, hariram ravichandran wrote:
> > I need to perform *fuzzy search* for the whole search term. I
> > extended MultiFieldQueryParser and overrode getFieldQuery():
> >
> > protected Query getFieldQuery(String field, String fieldText,
> >         boolean quoted) throws ParseException {
> >     // constructing fuzzy query
> >     return super.getFuzzyQuery(field, fieldText, 3.0f);
> > }
> >
> > For example, if I give the search term "(apple AND orange) OR (mango)",
> > the query should be expanded to "(apple~ AND orange~) OR (mango~)".
> >
> > I need to search in multiple fields, and I also need to implement this
> > without affecting any of the Lucene features. Is there any other simple
> > way?
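
For illustration only, the expansion asked about in this thread (apple orange mango becoming apple~ orange~ mango~) can also be sketched as plain string pre-processing, done before the text is handed to the query parser. This is a swapped-in technique, not the getFieldQuery override described above and not anything Lucene provides: the class FuzzyExpander and its methods are hypothetical names, and the sketch uses only the JDK. It assumes the classic query syntax, leaves quoted phrases, boolean operators, parentheses and wildcard terms untouched, and does not handle range queries or escaped quotes inside phrases.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not a Lucene class): appends "~" to each bare term
// of a query string so the parser later treats that term as fuzzy.
final class FuzzyExpander {

    // Operator tokens of the classic query syntax that are copied as-is.
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT", "&&", "||"));

    static String expand(String query) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            if (c == '"') {
                // Copy a quoted phrase verbatim, including both quotes.
                int close = query.indexOf('"', i + 1);
                int end = (close < 0) ? query.length() : close + 1;
                out.append(query, i, end);
                i = end;
            } else if (isDelimiter(c)) {
                out.append(c);
                i++;
            } else {
                // Read one bare token up to the next delimiter or quote.
                int end = i;
                while (end < query.length()
                        && !isDelimiter(query.charAt(end))
                        && query.charAt(end) != '"') {
                    end++;
                }
                String token = query.substring(i, end);
                out.append(token);
                if (isPlainTerm(token)) {
                    out.append('~');
                }
                i = end;
            }
        }
        return out.toString();
    }

    private static boolean isDelimiter(char c) {
        return Character.isWhitespace(c) || c == '(' || c == ')';
    }

    // A plain term is not an operator, not a bare "field:" prefix, contains
    // no wildcard, fuzzy, boost, range or escape characters, and has at
    // least one letter or digit.
    private static boolean isPlainTerm(String token) {
        if (OPERATORS.contains(token) || token.endsWith(":")) {
            return false;
        }
        boolean hasWordChar = false;
        for (int k = 0; k < token.length(); k++) {
            char ch = token.charAt(k);
            if ("*?~^[]{}\\".indexOf(ch) >= 0) {
                return false;
            }
            if (Character.isLetterOrDigit(ch)) {
                hasWordChar = true;
            }
        }
        return hasWordChar;
    }

    public static void main(String[] args) {
        // prints: (apple~ AND orange~) OR (mango~)
        System.out.println(expand("(apple AND orange) OR (mango)"));
        // prints: "apple orange" app*
        System.out.println(expand("\"apple orange\" app*"));
    }
}
```

The expanded string would then be passed to the parser unchanged, so phrase and prefix handling stay with the parser itself. The caveat quoted above still applies: making every term fuzzy is likely to cost performance and precision.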