From: Lukas Österreicher
Reply-To: java-user@lucene.apache.org
Date: Mon, 19 Apr 2010 18:18:02 +0200
Subject: Re: Combining PrefixQuery and FuzzyQuery
In-Reply-To: <027401cadfd2$ce98ff60$6bcafe20$@de>

Update to my last response, with a sample of what I thought you might mean.
This does not work.

Original query up till now:

+(item.name:the* item.name:the)

The new query would look like this (it states: match item.name where a term
exists that is either exactly "the" or starts with "the"):

+(item.name:the*~0.79 item.name:the~0.79)

Until now these matched (I ignore case, as mentioned):

On The Run
The Final Cut
Us And Them

With the change, "Us And Them" no longer matches.
What I want is a change so that it would even match "Us and Thém".

Lukas

On 19.04.10 17:13, "Uwe Schindler" wrote:

> Don't use PrefixQuery, only FuzzyQuery. There you pass in the whole term
> (with prefix) and define how many characters are the prefix.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>> -----Original Message-----
>> From: Lukas Österreicher [mailto:lukas.oesterreicher@austria.real.com]
>> Sent: Monday, April 19, 2010 5:00 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Combining PrefixQuery and FuzzyQuery
>>
>> Well, what would this look like in code?
>> Currently I have the prefix query like this:
>>
>> BooleanQuery bQuery = new BooleanQuery();
>> PrefixQuery prefixQuery = new PrefixQuery(new Term("item.name", termText));
>> bQuery.add(prefixQuery, Occur.MUST);
>>
>> I don't see any class named PrefixTerm.
>> I'd appreciate it if you could show me how it is done in Java code.
>>
>> Lukas
>>
>> On 19.04.10 16:48, "Uwe Schindler" wrote:
>>
>>> How about a fuzzy query with a prefix term? It's configurable.
>>>
>>>> -----Original Message-----
>>>> From: Lukas Österreicher [mailto:lukas.oesterreicher@austria.real.com]
>>>> Sent: Monday, April 19, 2010 4:43 PM
>>>> To: java-user@lucene.apache.org
>>>> Subject: Combining PrefixQuery and FuzzyQuery
>>>>
>>>> Hello.
>>>>
>>>> Is it possible to combine PrefixQuery and FuzzyQuery?
>>>> The search on a term should be fuzzy but also match results that
>>>> just begin with that token (or an approximation of that token).
>>>>
>>>> If it is possible, can you give me an example of how to achieve this?
>>>>
>>>> Currently I only use the PrefixQuery and performance is OK.
>>>> Would performance with such a combination be much worse?
>>>>
>>>> I would not even need a complete fuzzy search. It would suffice
>>>> to have the matching done without regard to case (this I already
>>>> have by using a modified WhitespaceTokenizer which filters to
>>>> lower case) and to have accented characters match as well,
>>>> so e would match é and è.
>>>>
>>>> Finally, I would like to know how much sorting on a string field
>>>> that is not too long (containing a track or album title) affects
>>>> performance compared to not providing any sorting parameters.
>>>>
>>>> Thanks in advance,
>>>> Lukas
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
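
To make the accent-insensitive part of the question quoted above concrete, here is a minimal sketch in plain Java, with no Lucene classes involved. It folds case and accents and then does a per-token prefix check. The class and method names (AccentFoldingPrefixMatch, fold, matchesPrefix) are made up for illustration and are not part of any library. In a real index the same folding would presumably have to happen in the analyzer at both index and query time (Lucene's ASCIIFoldingFilter is, as far as I know, the usual candidate), so treat this only as an illustration of the matching rule, not as the way Lucene does it.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not a Lucene class: illustrates case- and
// accent-insensitive prefix matching on whitespace-separated tokens.
public class AccentFoldingPrefixMatch {

    /** Strips combining accent marks and lower-cases the text. */
    public static String fold(String text) {
        // NFD decomposes an accented letter into base letter + combining mark(s).
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are dropped here.
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    /** True if any token of the title starts with the (folded) prefix. */
    public static boolean matchesPrefix(String title, String prefix) {
        String foldedPrefix = fold(prefix);
        for (String token : fold(title).trim().split("\\s+")) {
            if (token.startsWith(foldedPrefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The three titles from the mail plus the accented variant (\u00e9 is é).
        String[] titles = {"On The Run", "The Final Cut", "Us And Them", "Us and Th\u00e9m"};
        for (String title : titles) {
            System.out.println(title + " -> " + matchesPrefix(title, "the"));
        }
    }
}
```

With the prefix "the", all four titles are reported as matches, including the accented one, which is the behaviour asked for. Note that this is exact prefix matching after folding; it does not add any fuzziness on top.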