From: JBTech
X-Nabble-From: jb4tech@gmail.com
To: general@lucene.apache.org
Date: Fri, 25 Jul 2008 07:04:02 -0700 (PDT)
Subject: Re: issues with wildcard search and snowball english analyzer
Message-ID: <18652365.post@talk.nabble.com>
In-Reply-To: <826061.84991.qm@web35602.mail.mud.yahoo.com>
References: <18641947.post@talk.nabble.com> <826061.84991.qm@web35602.mail.mud.yahoo.com>

Hi Andrew,

Thanks for your quick reply. I tried e*t, and it did not return any
results. I am using Lucene 2.2. The full word "elephant" returned one hit,
as I am using the same analyzer for indexing and searching. I have uploaded
the Java class I used for testing this.

Thanks,
JB


Andrew Gilmartin-2 wrote:
>
> --- On Thu, 7/24/08, JBTech wrote:
>
>> Is there a way to avoid stemming in certain cases?
>
> As a general rule, make the query intelligent and not the index.
> Therefore, index your text verbatim. Small changes like changing terms to
> lowercase and removing possessives are fine. You now have an index upon
> which you can make intelligent queries.
>
> An intelligent query requires keeping track of several collections of
> term-to-term(s) mappings. For example, stemmed-term to verbatim-term(s).
> Now, convert the user's search for "elephant is a big animal" into
> something akin to
>
> ( (elephant^10) OR (A) OR (B) ) AND
> ( (big^10) OR (C) ) AND
> ( (animal^10) OR (D) )
>
> Where A and B are other terms with the same stemming as elephant, C is
> another term with the same stemming as big, and D is another term with
> the same stemming as animal. Adding the boost ensures that a verbatim
> match pushes the document's rank higher and so ensures that what the user
> asked for is closer to the top.
>
> This basic idea of making the queries more intelligent by broadening them
> and boosting term weights gives you a lot of control over the query and
> how results are ranked. The same control is not possible by making the
> index more intelligent.
>
> Don't worry about Lucene's performance with complex queries. My experience
> is that it is very fast.
>
> And to answer your specific question, a search for "e*t" will work as is.
>
> -- Andrew
>

http://www.nabble.com/file/p18652365/Testing.java Testing.java
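
For illustration, the query broadening described in the quoted reply could be sketched roughly as follows. This is plain Java with no Lucene dependency, and it only builds the query string in the same notation as the quoted example. The class name BoostedQueryBuilder, the expand method, and the variant terms in main are hypothetical; a real application would derive its stemmed-term to verbatim-term(s) mapping from its own index data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds a broadened, boosted query string.
final class BoostedQueryBuilder {

    // terms:    the user's verbatim search terms (stop words already removed)
    // variants: for each verbatim term, other terms that share its stem
    //           (assumed to be maintained by the application)
    // boost:    weight given to the verbatim term, e.g. 10
    static String expand(List<String> terms,
                         Map<String, List<String>> variants,
                         int boost) {
        List<String> clauses = new ArrayList<String>();
        for (String term : terms) {
            List<String> alternatives = new ArrayList<String>();
            // The verbatim term is boosted so exact matches rank higher.
            alternatives.add("(" + term + "^" + boost + ")");
            List<String> others = variants.get(term);
            if (others == null) {
                others = Collections.<String>emptyList();
            }
            for (String other : others) {
                if (!other.equals(term)) {
                    alternatives.add("(" + other + ")");
                }
            }
            clauses.add("( " + join(" OR ", alternatives) + " )");
        }
        // Every original term must match, verbatim or via a variant.
        return join(" AND ", clauses);
    }

    private static String join(String separator, List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up variants, for illustration only.
        Map<String, List<String>> variants = new HashMap<String, List<String>>();
        variants.put("elephant", Arrays.asList("elephants"));
        variants.put("animal", Arrays.asList("animals"));

        System.out.println(
            expand(Arrays.asList("elephant", "big", "animal"), variants, 10));
    }
}
```

The resulting string would still have to be handed to a query parser whose analyzer does not stem the terms again; that step is not shown here.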