Date: Tue, 24 Jun 2008 09:36:15 -0400
From: "Erick Erickson"
To: java-user@lucene.apache.org
Subject: Re: Wildcard and Literal Searches combined

Do you require that the words be right next to each other?

You can, of course, set your default to AND (it's OR unless you change it explicitly). That'll give you documents that have both Dublin and City. You don't need wildcards at all in this case.

If you require exact matches, you can use PhraseQuery or SpanQuery to get words right next to each other. That is, "dublin city" would be found but not "dublin is a very large city". You can set slop (the distance between terms that counts as a hit) variously, and you can also specify whether order is important. Again, you don't need wildcards at all (and shouldn't use them unless you really need to).

Beware that phrase queries do NOT go through wildcard expansion. That is, "Dubli* city" (as a PhraseQuery) doesn't do what you might think.

I'd really look over the query syntax carefully for more insights.
See http://lucene.apache.org/java/docs/queryparsersyntax.html

Also note that SpanQueries are something you construct yourself; you don't send text through the query parser and magically get SpanQueries back.

Best
Erick

On Tue, Jun 24, 2008 at 8:28 AM, mick l wrote:

> Folks,
> My users require wildcard searches. Sometimes their search phrases contain
> spaces. I am having trouble trying to implement a wildcard search on strings
> containing spaces, so if the term includes spaces I force a literal search
> by adding double quotes to the search term.
> So the search string for 'Dublin' becomes search term (Dublin*)
> whereas search string 'Dublin City' becomes ("Dublin City")
>
> If I use (Dublin City*) I get all instances of Dublin OR City in the results,
> which is not what I am looking for.
>
> Is there any way I can combine the wildcard search and the literal?
>
> Here's my existing code. It's in C# with Lucene.Net
>
> //if input has no spaces we do a wildcard search, otherwise a literal search
> if (sSearchQuery.IndexOf(" ") < 0)
> {
>     sSearchQuery = "(" + sSearchQuery + "*)";
> }
> else
> {
>     sSearchQuery = "(\"" + sSearchQuery + "\")";
> }
> IndexSearcher searcher = new IndexSearcher(sIndexLocation);
> Hits oHitColl = searcher.Search(oParser.Parse(sSearchQuery));
>
> Thanks folks
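
One way to read the AND suggestion in the reply above: rather than quoting the whole input, mark every word as required with the query syntax's "+" operator and put the wildcard only on the last word. The sketch below is plain string building in Java, not anything taken from this thread, and the same logic carries over to C# directly. The class and method names are made up for illustration, and the escaping assumes the special-character list from the query parser syntax page. Note that it only requires all words to be present somewhere in the document. It does not require them to be adjacent; for that you would have to construct span queries by hand, as noted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Lucene): turns raw user input into a
// query string where every word is required and the last word is a prefix.
final class PrefixQueryStrings {

    // Characters treated as special by the query parser syntax (assumption:
    // taken from the query syntax documentation, escaped one char at a time).
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Backslash-escape any special character in a single word.
    static String escape(String word) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // "Dublin City" becomes "+Dublin +City*"; blank input becomes "".
    static String build(String input) {
        List<String> clauses = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                clauses.add("+" + escape(word));
            }
        }
        if (clauses.isEmpty()) {
            return "";
        }
        // Only the final word gets the trailing wildcard.
        int last = clauses.size() - 1;
        clauses.set(last, clauses.get(last) + "*");
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(build("Dublin"));      // prints +Dublin*
        System.out.println(build("Dublin City")); // prints +Dublin +City*
    }
}
```

The resulting string would then be handed to the parser in place of the quoted form in the original code. Whether "all words present, last one as a prefix" is close enough to what the users expect is something to check against real queries.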