lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Fuzzy query on capital letters does not match documents
Date Thu, 27 Feb 2014 17:36:22 GMT
Be careful when combining very short terms with fuzzy queries. The
fractional similarity is converted to a whole number of edits, and for a
short term the rounding can yield zero edits, which makes the match exact
rather than fuzzy.

What terms does your index have? XV, Xv, xV, xv? XV~0.7 may only match XV.
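
To illustrate the rounding, here is a minimal standalone sketch of the
conversion. It is modeled on my understanding of FuzzyQuery.floatToEdits in
Lucene 4.x and does not call Lucene itself; the class name FuzzyEdits and the
cap of 2 edits are assumptions made for this example, so check the behaviour
against the Lucene version you actually run.

```java
public class FuzzyEdits {
    // Assumption: fuzzy matching is capped at 2 edits, as in Lucene 4.x.
    static final int MAX_EDITS = 2;

    /**
     * Converts a fractional similarity such as 0.7 into a whole number of
     * edits for a term of the given length. Values >= 1 are treated as an
     * explicit edit count. The cast to int truncates the fractional part.
     */
    public static int floatToEdits(float minimumSimilarity, int termLength) {
        if (minimumSimilarity >= 1f) {
            return (int) Math.min(minimumSimilarity, MAX_EDITS);
        }
        if (minimumSimilarity == 0f) {
            return 0; // zero means exact match, not unlimited edits
        }
        return Math.min((int) ((1d - minimumSimilarity) * termLength), MAX_EDITS);
    }

    public static void main(String[] args) {
        // The terms from the query in the original message.
        String[] terms = {"portant", "création", "mention", "rugby", "XV"};
        for (String term : terms) {
            int length = term.codePointCount(0, term.length());
            System.out.println(term + "~0.7 -> " + floatToEdits(0.7f, length) + " edit(s)");
        }
    }
}
```

Under these assumptions a two-letter term such as XV gets (1 - 0.7) * 2 = 0.6,
which truncates to 0 edits, while the longer terms in the query still get 1 or
2 edits. With 0 edits, XV~0.7 only matches a term that is exactly the query
term, including case.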

-- Jack Krupansky

-----Original Message----- 
From: G.Long
Sent: Thursday, February 27, 2014 12:15 PM
To: java-user@lucene.apache.org
Subject: Fuzzy query on capital letters does not match documents

Hi :)

In my Lucene index, there are documents with a title field. Values of
this field are indexed with a whitespace analyzer. When I search for
documents, I create a boolean query which includes fuzzy queries for the
title. The final query looks like: +tnc_title:portant~0.7
+tnc_title:création~0.7 +tnc_title:mention~0.7 +tnc_title:rugby~0.7
+tnc_title:XV~0.7

One of the documents in the index has all these words in its title but
the query does not return any results. If I remove the +tnc_title:XV~0.7
part, the document is found.

Is there any known issue with upper case letters and fuzzy queries?

Regards,

Gary



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org 



