lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Biedermann,S.,Fa. Post Direkt" <S.Biederm...@postdirekt.de>
Subject AW: "fuzzy prefix" search
Date Tue, 03 May 2011 11:22:21 GMT
I don't know.

But changing it now would cause trouble in many applications...

For our applications we reimplemented fuzzy query so that we can pass along a org.apache.lucene.search.spell.StringDistance
instance that holds the similarity algorithm of choice.


--

Sven


-----Ursprüngliche Nachricht-----
Von: Clemens Wyss [mailto:clemensdev@mysign.ch] 
Gesendet: Dienstag, 3. Mai 2011 12:12
An: java-user@lucene.apache.org
Betreff: AW: "fuzzy prefix" search

Is this calculation intended or a bug?

> -----Ursprüngliche Nachricht-----
> Von: Biedermann,S.,Fa. Post Direkt [mailto:S.Biedermann@postdirekt.de]
> Gesendet: Dienstag, 3. Mai 2011 12:00
> An: java-user@lucene.apache.org
> Betreff: AW: "fuzzy prefix" search
> 
> I had a look into the 3.0 implementation
> 
> The calculation of the similarity is
> 
> 	1 - (edit distance / min (string 1 length, string 2 length)
> 
> As opposed to the levenstein in spellchecker
> 
> 	1 - (edit distance / max (string 1 length, string 2 length)
> 
> 
> So, the similarity is 1 - ( 3 / min(3,6)) = 0.
> 
> --
> Sven
> 
> 
> -----Ursprüngliche Nachricht-----
> Von: Biedermann,S.,Fa. Post Direkt [mailto:S.Biedermann@postdirekt.de]
> Gesendet: Dienstag, 3. Mai 2011 11:43
> An: java-user@lucene.apache.org
> Betreff: AW: "fuzzy prefix" search
> 
> Have you tried
> 
> Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.499f);
> 
> 
> Sven
> 
> 
> -----Ursprüngliche Nachricht-----
> Von: Clemens Wyss [mailto:clemensdev@mysign.ch]
> Gesendet: Dienstag, 3. Mai 2011 10:57
> An: java-user@lucene.apache.org
> Betreff: AW: "fuzzy prefix" search
> 
> Sorry for coming back to my issue. Can anybody explain why my "simple" unit
> test below fails? Any hint/help appreciated.
> 
> Directory directory = new RAMDirectory(); IndexWriter indexWriter = new
> IndexWriter( directory, new StandardAnalyzer( Version.LUCENE_31 ),
> IndexWriter.MaxFieldLength.UNLIMITED );
> Document document = new Document();
> document.add( new Field( "test", "Merlot", Field.Store.YES,
> Field.Index.ANALYZED ) ); indexWriter.addDocument( document );
> IndexReader indexReader = indexWriter.getReader(); IndexSearcher
> searcher = new IndexSearcher( indexReader ); Query q = new FuzzyQuery(
> new Term( "test", "Mer" ), 0.5f, 0, 10 ); // or Query q = new FuzzyQuery( new
> Term( "test", "Mer" ), 0.5f); TopDocs result = searcher.search( q, 10 );
> Assert.assertEquals( 1, result.totalHits );
> 
> - Clemens
> 
> > -----Ursprüngliche Nachricht-----
> > Von: Clemens Wyss [mailto:clemensdev@mysign.ch]
> > Gesendet: Montag, 2. Mai 2011 23:01
> > An: java-user@lucene.apache.org
> > Betreff: AW: "fuzzy prefix" search
> >
> > Is it the combination of FuzzyQuery and Term which makes the search to
> > go for "word boundaries"?
> >
> > > -----Ursprüngliche Nachricht-----
> > > Von: Clemens Wyss [mailto:clemensdev@mysign.ch]
> > > Gesendet: Montag, 2. Mai 2011 14:13
> > > An: java-user@lucene.apache.org
> > > Betreff: AW: "fuzzy prefix" search
> > >
> > > I tried this too, but unfortunately I only get hits when the search
> > > term is a least as long as the word to be looked up.
> > >
> > > E.g.:
> > > ...
> > > Directory directory = new RAMDirectory(); IndexWriter indexWriter =
> > > new IndexWriter( directory, IndexManager.getIndexingAnalyzer(
> > LOCALE_DE ),
> > > 		IndexWriter.MaxFieldLength.UNLIMITED );
> > >
> > > Document document = new Document();
> > > document.add( new Field( "test", "Merlot",
> > > 		Field.Store.YES, Field.Index.ANALYZED ) );
> > indexWriter.addDocument(
> > > document );
> > >
> > > IndexReader indexReader = indexWriter.getReader(); IndexSearcher
> > > searcher = new IndexSearcher( indexReader );
> > >
> > > Query q = new FuzzyQuery( new Term( "test", "Mer" ), 0.6f, 1 );
> > > TopDocs result = searcher.search( q, 10 ); Assert.assertEquals( 1,
> > result.totalHits ); ...
> > >
> > > > -----Ursprüngliche Nachricht-----
> > > > Von: Uwe Schindler [mailto:uwe@thetaphi.de]
> > > > Gesendet: Montag, 2. Mai 2011 13:50
> > > > An: java-user@lucene.apache.org
> > > > Betreff: RE: "fuzzy prefix" search
> > > >
> > > > Hi,
> > > >
> > > > You can pass an integer to FuzzyQuery which defines the number of
> > > > characters that are seen as prefix. So all terms must match this
> > > > prefix and the rest of each term is matched using fuzzy.
> > > >
> > > > Uwe
> > > >
> > > > -----
> > > > Uwe Schindler
> > > > H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > > > eMail: uwe@thetaphi.de
> > > >
> > > > > -----Original Message-----
> > > > > From: Clemens Wyss [mailto:clemensdev@mysign.ch]
> > > > > Sent: Monday, May 02, 2011 1:47 PM
> > > > > To: java-user@lucene.apache.org
> > > > > Subject: "fuzzy prefix" search
> > > > >
> > > > > I'd like to search fuzzily but not on a full term.
> > > > > E.g.
> > > > > I have a text "Merlot del Ticino"
> > > > > I'd like
> > > > > "mer", "merr", "melo", ... to match.
> > > > >
> > > > > If I use FuzzyQuery only "merlot,  "merlott" hit. What
> > > > > Query-combination should I use?
> > > > >
> > > > > Thx
> > > > > Clemens
> > > > >
> > > > >
> > > > > ----------------------------------------------------------------
> > > > > --
> > > > > --
> > > > > - To unsubscribe, e-mail:
> > > > > java-user-unsubscribe@lucene.apache.org
> > > > > For additional commands, e-mail:
> > > > > java-user-help@lucene.apache.org
> > > >
> > > >
> > > >
> > > > ------------------------------------------------------------------
> > > > --
> > > > - To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> > > --------------------------------------------------------------------
> > > - To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message