From: Ivan Provalov
Subject: Re: Stemming and Wildcard Queries
To: java-user@lucene.apache.org
Date: Fri, 21 May 2010 06:04:19 -0700 (PDT)

Thanks, everyone!

--- On Thu, 5/20/10, Herbert Roitblat wrote:

> From: Herbert Roitblat
> Subject: Re: Stemming and Wildcard Queries
> To: java-user@lucene.apache.org
> Date: Thursday, May 20, 2010, 4:48 PM
>
> At a general level, we have found that stemming during indexing is
> not advisable. Sometimes users want the exact form and if you have
> removed the exact form during indexing, obviously, you cannot provide
> that. Rather, we have found that stemming during search is more
> useful, or maybe it should be called anti-stemming. For any given
> input for which the user wants to stem, we could derive the
> variations during the query processing. E.g., plan can be expanded to
> include plans, planning, planned, etc.
>
> In our application we provide a feature that is sometimes called a
> word wheel. When someone enters plan in this tool, we show all of the
> words in the index that start with plan. Here are some of the related
> words:
> plan
> plane
> planes
> planet
> planificaci
> planned
> plannedoutages.xls
> planner
> planners
>
> Just a thought.
> Herb
>
> ----- Original Message -----
> From: "Ivan Provalov"
> Sent: Thursday, May 20, 2010 1:16 PM
> Subject: Stemming and Wildcard Queries
>
> > Is there a good way to combine the wildcard queries and stemming?
> >
> > As is, the field which is stemmed at index time, won't work with
> > some wildcard queries.
> >
> > We were thinking to create two separate index fields - one stemmed,
> > one non-stemmed, but we are having issues with our SpanNear queries
> > (they require the same field).
> >
> > We thought to try combining the stemmed and non-stemmed terms in
> > the same field, but we are concerned about the stats being skewed
> > as a result of this (especially for the TermVector stats). Can
> > overloading the non-stemmed field with stemmed terms cause any
> > issues with the TermVector?
> >
> > Any suggestions?
> >
> > Ivan Provalov

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
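
For readers of the archive, the word wheel and the query-time expansion ("anti-stemming") described in the quoted reply can be sketched in plain Java. This is only an illustration under stated assumptions, not code from the thread and not Lucene API: `WordWheel`, `startingWith`, `variantsOf`, and the toy stemmer are hypothetical names, a sorted `TreeSet` stands in for the index's term dictionary, and the stemmer is assumed to only strip suffixes so that every variant begins with its stem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration; this is not a Lucene class.
final class WordWheel {
    // Sorted set standing in for the index's term dictionary (an assumption).
    private final NavigableSet<String> terms = new TreeSet<>();

    void add(String term) {
        terms.add(term);
    }

    // Returns every stored term that starts with the prefix, in sorted order.
    List<String> startingWith(String prefix) {
        List<String> matches = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, so start at
        // the prefix itself and stop at the first term that no longer matches.
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    // Query-time expansion: stored terms whose stem equals the stem of word.
    // Assumes the stemmer only strips suffixes, so each variant starts with the stem.
    List<String> variantsOf(String word, UnaryOperator<String> stemmer) {
        String stem = stemmer.apply(word);
        List<String> variants = new ArrayList<>();
        for (String term : startingWith(stem)) {
            if (stemmer.apply(term).equals(stem)) {
                variants.add(term);
            }
        }
        return variants;
    }

    public static void main(String[] args) {
        WordWheel wheel = new WordWheel();
        for (String t : Arrays.asList("place", "plan", "plane", "planes",
                                      "planet", "planned", "planner", "plum")) {
            wheel.add(t);
        }
        // Toy stemmer for the demo only; a real analyzer's stemmer would go here.
        UnaryOperator<String> toyStemmer = s ->
                s.endsWith("ned") ? s.substring(0, s.length() - 3)
              : s.endsWith("s") ? s.substring(0, s.length() - 1)
              : s;
        System.out.println(wheel.startingWith("plan"));
        System.out.println(wheel.variantsOf("planned", toyStemmer));
    }
}
```

Under these assumptions, the variants returned for a user's input could be OR-ed together and run against a single unstemmed field, which keeps exact forms searchable and leaves wildcard and span queries on one field.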