lucene-java-user mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Funny results with Fuzzy
Date Tue, 25 Oct 2005 16:13:50 GMT
I'd be more inclined to guess that kylie->klyie falls
below the 0.5f similarity threshold you pass.
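
To put a rough number on that guess, here is a small self-contained
sketch in plain Java with no Lucene on the classpath. It rests on two
assumptions you should check against the FuzzyTermEnum source of the
version you run: that the score is 1 - (edit distance / length of the
shorter string), both measured after the shared prefix is stripped, and
that a term only matches when its score is strictly greater than the
threshold. The class and method names are made up for illustration.

```java
public class FuzzySimilaritySketch {

    // Plain Levenshtein distance: insert, delete and substitute each cost 1,
    // so swapping two adjacent letters ("ly" -> "yl") costs 2, not 1.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // ASSUMPTION (not taken from the Lucene source): the first prefixLength
    // characters must match exactly, and the remainder is scored as
    // 1 - distance / (length of the shorter remainder).
    static float similarity(String query, String indexed, int prefixLength) {
        if (query.length() < prefixLength || indexed.length() < prefixLength) {
            return 0f;
        }
        if (!indexed.startsWith(query.substring(0, prefixLength))) {
            return 0f;
        }
        String q = query.substring(prefixLength);
        String t = indexed.substring(prefixLength);
        int shorter = Math.min(q.length(), t.length());
        if (shorter == 0) {
            return q.length() == t.length() ? 1f : 0f;
        }
        return 1f - (float) editDistance(q, t) / shorter;
    }

    public static void main(String[] args) {
        float sim = similarity("klyie", "kylie", 1);
        // ASSUMPTION: a term matches only if its score is strictly above the threshold.
        System.out.println("similarity = " + sim
                + ", passes 0.5f threshold: " + (sim > 0.5f));
    }
}
```

Under those assumptions "klyie" against "kylie" with a prefix length of
1 scores exactly 0.5, which would not clear a strictly-greater-than test
at 0.5f. The same pair with a prefix length of 0 scores 0.6, so a
slightly lower threshold or a shorter prefix may be worth trying.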

Try printing out the result of
fuzzyQuery.rewrite(indexReader).toString();

This will rewrite the fuzzyQuery to a BooleanQuery
which explicitly lists the TermQuery objects that the
fuzzyQuery has found potential matches for in your
index.







--- Rob Young <bubblenut@gmail.com> wrote:

> Rob Young wrote:
> 
> > mark harwood wrote:
> >
> >> It comes down to your choice of analyzer.
> >>
> >> Don't forget your "all" field is broken down into
> >> discrete terms by your choice of analyzer.
> >>
> >> Most often, you will want to use the same analyzer at
> >> query-time with the query parser to make sure the
> >> user's input matches the stored document terms.
> >> If you get it wrong (say, using an analyzer that
> >> doesn't lower-case) you'll find nothing:
> >>  Kylie != kylie
> >>
> >> If you aren't using a query parser and are manually
> >> constructing FuzzyQuery objects programmatically, the
> >> same logic applies (pseudocode):
> >>  new FuzzyQuery(new Term("Kylie")) != "kylie"
> >>
> > Thanks for the help. I'm using the StandardAnalyzer to do the indexing
> > (which lower-cases everything) and I'm lowercasing my search string
> > before I build the Term, so that shouldn't be an issue. Are there other
> > factors in this vein that may cause problems, considering that this is
> > an alphabetic string which shouldn't be in any stop word list?
> >
> > Indexed as "kylie minogue: kyliefever2002 (live in manchester)"
> > Searched with "klyie"
> > Using new FuzzyQuery(new Term("all", "klyie"), 0.5f, 1);
> >
> > No matches!  I don't get it :(
> 
> I've even gone as far as to run the search term through the
> StandardAnalyzer, grab the tokens and rebuild the search string (even
> though it's only one valid token). I'm clutching at straws now, but
> could it be because I build a BooleanQuery (OR) even if there is only one
> search term? In this case it's basically like this:
>
> BooleanQuery query = new BooleanQuery();
> // add(query, required, prohibited): both false makes this an optional (OR) clause
> query.add(new FuzzyQuery(new Term("all", "klyie"), 0.5f, 1), false, false);
>
> Hits hits = is.search(query);
> 



		

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

