Message-ID: <48F841D8.9070906@therogueprocess.net>
Date: Fri, 17 Oct 2008 08:42:16 +0100
From: John Byrne
To: general@lucene.apache.org
Subject: Re: question about wildcard like search
In-Reply-To: <4fe4c4f50810161155q53b01c08t4970f6fe733ffc9c@mail.gmail.com>

Hi,

I think you'll normally get quicker answers for this type of question on
the main Lucene users mailing list: java-user@lucene.apache.org

Anyway, you just need to call the QueryParser method 'setAllowLeadingWildcard(true)' to allow leading wildcards. However, once you do that, your leading wildcard queries will probably expand into many terms, so you should also call 'BooleanQuery.setMaxClauseCount' with a large number. You could use Integer.MAX_VALUE, but as far as I know this can cause a problem if you use FuzzyQuerys. The number of terms in your index is the largest value you'll need, but presumably that can grow. I would just set it at 1 million or something like that. You're probably never going to have a million terms.

-John

ChadDavis wrote:
> I need to do a query where I'm looking for strings that are embedded into a
> single word in one of the fields. In other words, a field may have a phrase
> like:
>
> Bob,Tom,Kevin,Jeff
>
> or
>
> Tom,Doug,Steven,Bob
>
> I would like to be able to use the wildcard query to search for any document
> that has the name "Tom" embedded, in any fashion, in this field.
>
> I would like to have built a WildcardQuery like "*Tom*", but it doesn't
> accept * as the first character, for performance reasons that the
> documentation explains.
>
> So, how do I do such a query? I'm looking into the fuzzy logic query, right
> now.
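
As a rough illustration of why the reply above pairs 'setAllowLeadingWildcard(true)' with a larger 'BooleanQuery.setMaxClauseCount', here is a toy model in plain Java. It is not Lucene code: the class name, the method names and the in-memory term list are all hypothetical, and it assumes each field value is indexed as a single term, as in the question. It only sketches the idea that a pattern starting with * has to be checked against every term, and that each matching term becomes one clause counted against a limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Toy model only: this is NOT Lucene's implementation. All names are hypothetical.
class WildcardExpansionSketch {

    // Translate a wildcard pattern ('*' = any run of characters,
    // '?' = exactly one character) into an equivalent regular expression.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // With a leading '*', there is no fixed prefix to narrow the search,
    // so every indexed term is tested. Each match stands for one clause,
    // and exceeding maxClauses is reported as an error.
    static List<String> expand(String wildcard, List<String> indexedTerms, int maxClauses) {
        Pattern pattern = toRegex(wildcard);
        List<String> clauses = new ArrayList<>();
        for (String term : indexedTerms) {
            if (pattern.matcher(term).matches()) {
                if (clauses.size() == maxClauses) {
                    throw new IllegalStateException("too many clauses: limit is " + maxClauses);
                }
                clauses.add(term);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Sample values taken from the question, plus one non-matching value.
        List<String> terms = Arrays.asList(
                "Bob,Tom,Kevin,Jeff", "Tom,Doug,Steven,Bob", "Ann,Sue");
        System.out.println(expand("*Tom*", terms, 1024));
    }
}
```

In this model both sample values from the question match "*Tom*", so a limit of 1 would already be exceeded. That is the situation the larger clause count is meant to avoid in a real index with many matching terms.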