From: Clemens Wyss
To: "java-user@lucene.apache.org"
Date: Tue, 3 May 2011 11:10:05 +0200
Subject: RE: "fuzzy prefix" search

Unfortunately lowercasing doesn't help.
Also, doesn't the FuzzyQuery ignore casing?

> -----Original Message-----
> From: Ian Lea [mailto:ian.lea@gmail.com]
> Sent: Tuesday, 3 May 2011 11:06
> To: java-user@lucene.apache.org
> Subject: Re: "fuzzy prefix" search
>
> Mer != mer. The latter will be what is indexed because StandardAnalyzer
> calls LowerCaseFilter.
>
> --
> Ian.
>
>
> On Tue, May 3, 2011 at 9:56 AM, Clemens Wyss
> wrote:
> > Sorry for coming back to my issue. Can anybody explain why my "simple"
> > unit test below fails? Any hint/help appreciated.
> >
> > Directory directory = new RAMDirectory();
> > IndexWriter indexWriter = new IndexWriter( directory,
> >     new StandardAnalyzer( Version.LUCENE_31 ),
> >     IndexWriter.MaxFieldLength.UNLIMITED );
> > Document document = new Document();
> > document.add( new Field( "test", "Merlot", Field.Store.YES, Field.Index.ANALYZED ) );
> > indexWriter.addDocument( document );
> > IndexReader indexReader = indexWriter.getReader();
> > IndexSearcher searcher = new IndexSearcher( indexReader );
> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f, 0, 10 );
> > // or Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f);
> > TopDocs result = searcher.search( q, 10 );
> > Assert.assertEquals( 1, result.totalHits );
> >
> > - Clemens
> >
> >> -----Original Message-----
> >> From: Clemens Wyss [mailto:clemensdev@mysign.ch]
> >> Sent: Monday, 2 May 2011 23:01
> >> To: java-user@lucene.apache.org
> >> Subject: RE: "fuzzy prefix" search
> >>
> >> Is it the combination of FuzzyQuery and Term which makes the search
> >> go for "word boundaries"?
> >>
> >> > -----Original Message-----
> >> > From: Clemens Wyss [mailto:clemensdev@mysign.ch]
> >> > Sent: Monday, 2 May 2011 14:13
> >> > To: java-user@lucene.apache.org
> >> > Subject: RE: "fuzzy prefix" search
> >> >
> >> > I tried this too, but unfortunately I only get hits when the search
> >> > term is at least as long as the word to be looked up.
> >> >
> >> > E.g.:
> >> > ...
> >> > Directory directory = new RAMDirectory();
> >> > IndexWriter indexWriter = new IndexWriter( directory,
> >> >     IndexManager.getIndexingAnalyzer( LOCALE_DE ),
> >> >     IndexWriter.MaxFieldLength.UNLIMITED );
> >> >
> >> > Document document = new Document();
> >> > document.add( new Field( "test", "Merlot", Field.Store.YES, Field.Index.ANALYZED ) );
> >> > indexWriter.addDocument( document );
> >> >
> >> > IndexReader indexReader = indexWriter.getReader();
> >> > IndexSearcher searcher = new IndexSearcher( indexReader );
> >> >
> >> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.6f, 1 );
> >> > TopDocs result = searcher.search( q, 10 );
> >> > Assert.assertEquals( 1, result.totalHits );
> >> > ...
> >> >
> >> > > -----Original Message-----
> >> > > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> >> > > Sent: Monday, 2 May 2011 13:50
> >> > > To: java-user@lucene.apache.org
> >> > > Subject: RE: "fuzzy prefix" search
> >> > >
> >> > > Hi,
> >> > >
> >> > > You can pass an integer to FuzzyQuery which defines the number of
> >> > > characters that are seen as prefix. So all terms must match this
> >> > > prefix and the rest of each term is matched using fuzzy.
> >> > >
> >> > > Uwe
> >> > >
> >> > > -----
> >> > > Uwe Schindler
> >> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> >> > > http://www.thetaphi.de
> >> > > eMail: uwe@thetaphi.de
> >> > >
> >> > > > -----Original Message-----
> >> > > > From: Clemens Wyss [mailto:clemensdev@mysign.ch]
> >> > > > Sent: Monday, May 02, 2011 1:47 PM
> >> > > > To: java-user@lucene.apache.org
> >> > > > Subject: "fuzzy prefix" search
> >> > > >
> >> > > > I'd like to search fuzzily but not on a full term.
> >> > > > E.g.
> >> > > > I have a text "Merlot del Ticino"
> >> > > > I'd like
> >> > > > "mer", "merr", "melo", ... to match.
> >> > > >
> >> > > > If I use FuzzyQuery only "merlot", "merlott" hit. What
> >> > > > Query-combination should I use?
> >> > > >
> >> > > > Thx
> >> > > > Clemens
> >> > > >
> >> > > > ---------------------------------------------------------------------
> >> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
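
The arithmetic behind the failing assertion can be checked without an index. The sketch below is plain Java with no Lucene dependency, and every class and method name in it is made up for illustration. Its similarity() method assumes the Lucene 3.x scoring rule, 1 - editDistance / (prefixLength + min(remaining query length, remaining term length)). That rule is recalled from the 3.x sources and is not stated anywhere in this thread, so treat it as an assumption. If it holds, "mer" against the indexed term "merlot" needs 3 edits over a shorter length of 3, which scores 0 and stays below both the 0.5 and the 0.6 threshold, with or without lowercasing and with a prefix length of 0 or 1. The prefixEditDistance() method is one possible reading of the "fuzzy prefix" idea from the original question: it compares the query against every prefix of the term and keeps the smallest distance. It is not something FuzzyQuery does.

```java
final class FuzzyPrefixSketch {

    /** Final DP row: row[j] is the Levenshtein distance between a and b.substring(0, j). */
    static int[] lastRow(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev;
    }

    /** Plain Levenshtein distance between the two full strings. */
    static int levenshtein(String a, String b) {
        return lastRow(a, b)[b.length()];
    }

    /**
     * ASSUMPTION: Lucene 3.x style score,
     * 1 - distance / (prefixLength + min(remaining lengths)).
     * Simplified: a term that does not share the required prefix scores 0.
     */
    static float similarity(String query, String term, int prefixLength) {
        if (query.length() < prefixLength
                || !term.startsWith(query.substring(0, prefixLength))) {
            return 0f;
        }
        String q = query.substring(prefixLength);
        String t = term.substring(prefixLength);
        int denominator = prefixLength + Math.min(q.length(), t.length());
        if (denominator == 0) {
            return q.equals(t) ? 1f : 0f; // avoid dividing by zero on empty input
        }
        return 1f - (float) levenshtein(q, t) / denominator;
    }

    /** Smallest edit distance between the query and any prefix of the term. */
    static int prefixEditDistance(String query, String term) {
        int best = Integer.MAX_VALUE;
        for (int d : lastRow(query, term)) {
            best = Math.min(best, d);
        }
        return best;
    }

    public static void main(String[] args) {
        String indexed = "merlot"; // what StandardAnalyzer would index for "Merlot"
        System.out.println(similarity("mer", indexed, 0));     // below the 0.5 cutoff
        System.out.println(similarity("mer", indexed, 1));     // below the 0.6 cutoff
        System.out.println(similarity("merlott", indexed, 0)); // above both cutoffs
        for (String q : new String[] {"mer", "merr", "melo"}) {
            System.out.println(q + " -> " + prefixEditDistance(q, indexed));
        }
    }
}
```

With prefixEditDistance(), "mer", "merr" and "melo" all land within one edit of some prefix of "merlot", so a cutoff of 1 would accept the three examples from the original question. How to express that as a Lucene query is a separate matter that this sketch does not address.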