lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: "fuzzy prefix" search
Date Tue, 03 May 2011 09:44:29 GMT
Then why not do that?  Add a PrefixQuery and a FuzzyQuery to a
BooleanQuery and use that.


--
Ian.


On Tue, May 3, 2011 at 10:25 AM, Clemens Wyss <clemensdev@mysign.ch> wrote:
>>PrefixQuery
> I'd like the combination of prefix and fuzzy ;-) because people could also type "menlo"
or "märl" and in any of these cases I'd like to get a hit on Merlot (for suggesting Merlot)
>
>> -----Ursprüngliche Nachricht-----
>> Von: Ian Lea [mailto:ian.lea@gmail.com]
>> Gesendet: Dienstag, 3. Mai 2011 11:22
>> An: java-user@lucene.apache.org
>> Betreff: Re: "fuzzy prefix" search
>>
>> I'd assumed that FuzzyQuery wouldn't ignore case but I could be wrong.
>>  What would be the edit distance between "mer" and "merlot"? Would it be
>> less that 1.5 which I reckon would be the value of length(term)*0.5 as
>> detailed in the javadocs?  Seems unlikely, but I don't really know anything
>> about the Levenshtein (edit distance) algorithm as used by FuzzyQuery.
>> Wouldn't a PrefixQuery be more appropriate here?
>>
>>
>> --
>> Ian.
>>
>> On Tue, May 3, 2011 at 10:10 AM, Clemens Wyss <clemensdev@mysign.ch>
>> wrote:
>> > Unfortunately lowercasing doesn't help.
>> > Also, doesn't the FuzzyQuery ignore casing?
>> >
>> >> -----Ursprüngliche Nachricht-----
>> >> Von: Ian Lea [mailto:ian.lea@gmail.com]
>> >> Gesendet: Dienstag, 3. Mai 2011 11:06
>> >> An: java-user@lucene.apache.org
>> >> Betreff: Re: "fuzzy prefix" search
>> >>
>> >> Mer != mer.  The latter will be what is indexed because
>> >> StandardAnalyzer calls LowerCaseFilter.
>> >>
>> >> --
>> >> Ian.
>> >>
>> >>
>> >> On Tue, May 3, 2011 at 9:56 AM, Clemens Wyss
>> <clemensdev@mysign.ch>
>> >> wrote:
>> >> > Sorry for coming back to my issue. Can anybody explain why my
>> "simple"
>> >> unit test below fails? Any hint/help appreciated.
>> >> >
>> >> > Directory directory = new RAMDirectory(); IndexWriter indexWriter =
>> >> > new IndexWriter( directory, new StandardAnalyzer(
>> Version.LUCENE_31
>> >> > ), IndexWriter.MaxFieldLength.UNLIMITED ); Document document =
>> new
>> >> > Document(); document.add( new Field( "test", "Merlot",
>> >> > Field.Store.YES, Field.Index.ANALYZED ) ); indexWriter.addDocument(
>> >> > document ); IndexReader indexReader = indexWriter.getReader();
>> >> > IndexSearcher searcher = new IndexSearcher( indexReader ); Query q
>> >> > = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f, 0, 10 ); // or
>> >> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.5f); TopDocs
>> >> > result = searcher.search( q, 10 ); Assert.assertEquals( 1,
>> >> > result.totalHits );
>> >> >
>> >> > - Clemens
>> >> >
>> >> >> -----Ursprüngliche Nachricht-----
>> >> >> Von: Clemens Wyss [mailto:clemensdev@mysign.ch]
>> >> >> Gesendet: Montag, 2. Mai 2011 23:01
>> >> >> An: java-user@lucene.apache.org
>> >> >> Betreff: AW: "fuzzy prefix" search
>> >> >>
>> >> >> Is it the combination of FuzzyQuery and Term which makes the
>> >> >> search to go for "word boundaries"?
>> >> >>
>> >> >> > -----Ursprüngliche Nachricht-----
>> >> >> > Von: Clemens Wyss [mailto:clemensdev@mysign.ch]
>> >> >> > Gesendet: Montag, 2. Mai 2011 14:13
>> >> >> > An: java-user@lucene.apache.org
>> >> >> > Betreff: AW: "fuzzy prefix" search
>> >> >> >
>> >> >> > I tried this too, but unfortunately I only get hits when the
>> >> >> > search term is a least as long as the word to be looked up.
>> >> >> >
>> >> >> > E.g.:
>> >> >> > ...
>> >> >> > Directory directory = new RAMDirectory(); IndexWriter
>> >> >> > indexWriter = new IndexWriter( directory,
>> >> >> > IndexManager.getIndexingAnalyzer(
>> >> >> LOCALE_DE ),
>> >> >> >             IndexWriter.MaxFieldLength.UNLIMITED );
>> >> >> >
>> >> >> > Document document = new Document(); document.add( new Field(
>> >> >> > "test", "Merlot",
>> >> >> >             Field.Store.YES, Field.Index.ANALYZED )
);
>> >> >> indexWriter.addDocument(
>> >> >> > document );
>> >> >> >
>> >> >> > IndexReader indexReader = indexWriter.getReader(); IndexSearcher
>> >> >> > searcher = new IndexSearcher( indexReader );
>> >> >> >
>> >> >> > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.6f,
1 );
>> >> >> > TopDocs result = searcher.search( q, 10 ); Assert.assertEquals(
>> >> >> > 1,
>> >> >> result.totalHits ); ...
>> >> >> >
>> >> >> > > -----Ursprüngliche Nachricht-----
>> >> >> > > Von: Uwe Schindler [mailto:uwe@thetaphi.de]
>> >> >> > > Gesendet: Montag, 2. Mai 2011 13:50
>> >> >> > > An: java-user@lucene.apache.org
>> >> >> > > Betreff: RE: "fuzzy prefix" search
>> >> >> > >
>> >> >> > > Hi,
>> >> >> > >
>> >> >> > > You can pass an integer to FuzzyQuery which defines the
number
>> >> >> > > of characters that are seen as prefix. So all terms must
match
>> >> >> > > this prefix and the rest of each term is matched using
fuzzy.
>> >> >> > >
>> >> >> > > Uwe
>> >> >> > >
>> >> >> > > -----
>> >> >> > > Uwe Schindler
>> >> >> > > H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>> >> >> > > eMail: uwe@thetaphi.de
>> >> >> > >
>> >> >> > > > -----Original Message-----
>> >> >> > > > From: Clemens Wyss [mailto:clemensdev@mysign.ch]
>> >> >> > > > Sent: Monday, May 02, 2011 1:47 PM
>> >> >> > > > To: java-user@lucene.apache.org
>> >> >> > > > Subject: "fuzzy prefix" search
>> >> >> > > >
>> >> >> > > > I'd like to search fuzzily but not on a full term.
>> >> >> > > > E.g.
>> >> >> > > > I have a text "Merlot del Ticino"
>> >> >> > > > I'd like
>> >> >> > > > "mer", "merr", "melo", ... to match.
>> >> >> > > >
>> >> >> > > > If I use FuzzyQuery only "merlot,  "merlott" hit.
What
>> >> >> > > > Query-combination should I use?
>> >> >> > > >
>> >> >> > > > Thx
>> >> >> > > > Clemens
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > ------------------------------------------------------------
>> >> >> > > > ---
>> >> >> > > > ---
>> >> >> > > > --
>> >> >> > > > - To unsubscribe, e-mail:
>> >> >> > > > java-user-unsubscribe@lucene.apache.org
>> >> >> > > > For additional commands, e-mail:
>> >> >> > > > java-user-help@lucene.apache.org
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > > --------------------------------------------------------------
>> >> >> > > ---
>> >> >> > > ---
>> >> >> > > - To unsubscribe, e-mail:
>> >> >> > > java-user-unsubscribe@lucene.apache.org
>> >> >> > > For additional commands, e-mail:
>> >> >> > > java-user-help@lucene.apache.org
>> >> >> >
>> >> >> >
>> >> >> > ----------------------------------------------------------------
>> >> >> > ---
>> >> >> > -- To unsubscribe, e-mail:
>> >> >> > java-user-unsubscribe@lucene.apache.org
>> >> >> > For additional commands, e-mail:
>> >> >> > java-user-help@lucene.apache.org
>> >> >>
>> >> >>
>> >> >> ------------------------------------------------------------------
>> >> >> --- To unsubscribe, e-mail:
>> >> >> java-user-unsubscribe@lucene.apache.org
>> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >> >
>> >> >
>> >> > -------------------------------------------------------------------
>> >> > -- To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >> >
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message