From: Brandon Mintern
To: java-user@lucene.apache.org
Subject: Re: ComplexPhraseQueryParser and stop words
Date: Thu, 1 Nov 2012 12:54:42 -0700

We are still having the issue where ComplexPhraseQueryParser fails on
quoted expressions that include stop words. Does the original developer
of this class still contribute to Lucene?

On Fri, Oct 26, 2012 at 3:37 PM, Brandon Mintern wrote:
> We recently switched from QueryParser to ComplexPhraseQueryParser
> (from lucene-queryparser-3.6.0.jar), and we've come across two
> separate problems.
>
> The first is that because it parses quoted expressions twice, it is
> necessary to double-escape any escaped characters. So if I do not want
> to allow users to include : in their search, I have to escape it as
> \:, but when it is in quotes, I have to escape it as \\:, because the
> first parse will turn \\ into \ and then the second time around will
> do the proper escape. Likewise, I need to escape \ because it shows up
> frequently in paths. When not in quotes, this is simply \\. In quotes,
> it must be \\\\.
>
> So that was a minor issue, but we were able to work around it without
> too much trouble. This next problem, though, does not seem to have an
> easy answer.
>
> Our searches for quoted phrases which include stop words no longer
> match.
> If a document contained the phrase "time to leave", only "time"
> and "leave" get indexed, but their positions are maintained so that a
> later search for "time to leave" works correctly. With the standard
> QueryParser, this worked just fine. With the ComplexPhraseQueryParser,
> it no longer works at all. Searching for time AND leave works, but
> "time to leave" simply fails.
>
> Does anyone know where I should start in solving this issue?
>
> Thanks,
> Brandon
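
The double-escaping workaround described in the quoted message could be
sketched roughly as below. This is only an illustration under stated
assumptions: the class name ComplexPhraseEscaper and its escape method
are hypothetical, it handles only the two characters mentioned above
(: and \), it treats every " in the input as a phrase delimiter, and it
does not try to repair unbalanced quotes. It does not call any Lucene
API; it only prepares the string that would then be handed to the
parser.

```java
// Hypothetical helper, not part of Lucene: escapes ':' and '\' in raw
// user input, doubling the escape inside quoted phrases because the
// message reports that quoted expressions are parsed twice.
final class ComplexPhraseEscaper {

    private ComplexPhraseEscaper() {}

    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() * 2);
        boolean inQuotes = false;
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '"') {
                // Assumption: every double quote opens or closes a phrase.
                inQuotes = !inQuotes;
                out.append(c);
            } else if (c == ':') {
                // \: outside quotes, \\: inside quotes.
                out.append(inQuotes ? "\\\\:" : "\\:");
            } else if (c == '\\') {
                // \\ outside quotes, \\\\ inside quotes.
                out.append(inQuotes ? "\\\\\\\\" : "\\\\");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unquoted input containing a path-like value.
        System.out.println(escape("C:\\tmp"));
        // The same value inside a quoted phrase.
        System.out.println(escape("\"C:\\tmp\""));
    }
}
```

The escaped string would then be passed to the parser's parse method in
place of the raw user input. Lucene's query syntax reserves more
characters than these two, so a real implementation would need to cover
those as well.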