lucene-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 18014] New: - Fuzzy searches are case sensitive
Date Fri, 14 Mar 2003 20:13:44 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18014>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18014

Fuzzy searches are case sensitive

           Summary: Fuzzy searches are case sensitive
           Product: Lucene
           Version: 1.2
          Platform: All
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Search
        AssignedTo: lucene-dev@jakarta.apache.org
        ReportedBy: cormac@siderean.com


I've found that fuzzy search terms are case sensitive. For example, "Adagio" is calculated
as having a Levenshtein distance of 1 from "adagio". Worse, "ADAGIO" has a distance of 6
from "adagio", so it would not be returned as a result when searching for 'adagio~'.
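
To make the numbers above concrete, here is a minimal standalone sketch of a
case-sensitive Levenshtein distance. It is not the code in FuzzyTermEnum; the
class name FuzzyCaseDemo and the two-argument editDistance helper are made up
for illustration, and Locale.ROOT is my own choice for the lowercasing step.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Lucene.
class FuzzyCaseDemo {

    /** Two-row dynamic-programming Levenshtein distance; compares chars exactly. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b costs j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                // 'A' and 'a' are different chars, so a case change costs 1.
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(editDistance("adagio", "Adagio")); // 1
        System.out.println(editDistance("adagio", "ADAGIO")); // 6
        // Lowercasing one side first removes the case-only differences.
        System.out.println(editDistance("adagio", "ADAGIO".toLowerCase(Locale.ROOT))); // 0
    }
}
```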

The patch is trivial: it lowercases both the search term and each enumerated term before
computing the edit distance. Here it is:

*** lucene-1.2\src\java\org\apache\lucene\search\FuzzyTermEnum.java	Sun Jun 09 13:47:54 2002
--- patched\src\java\org\apache\lucene\search\FuzzyTermEnum.java	Fri Mar 14 11:37:20 2003
***************
*** 77,83 ****
          super(reader, term);
          searchTerm = term;
          field = searchTerm.field();
!         text = searchTerm.text();
          textlen = text.length();
          setEnum(reader.terms(new Term(searchTerm.field(), "")));
      }
--- 77,83 ----
          super(reader, term);
          searchTerm = term;
          field = searchTerm.field();
!         text = searchTerm.text().toLowerCase();
          textlen = text.length();
          setEnum(reader.terms(new Term(searchTerm.field(), "")));
      }
***************
*** 88,94 ****
       */
      final protected boolean termCompare(Term term) {
          if (field == term.field()) {
!             String target = term.text();
              int targetlen = target.length();
              int dist = editDistance(text, target, textlen, targetlen);
              distance = 1 - ((double)dist / (double)Math.min(textlen, targetlen));
--- 88,94 ----
       */
      final protected boolean termCompare(Term term) {
          if (field == term.field()) {
!             String target = term.text().toLowerCase();
              int targetlen = target.length();
              int dist = editDistance(text, target, textlen, targetlen);
              distance = 1 - ((double)dist / (double)Math.min(textlen, targetlen));
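
For reference, a small sketch of how the similarity expression in the last
context line of the patch behaves for the "adagio" example. Only the formula
is taken from the diff. The class name FuzzySimilarityDemo, the helper names,
and the cutoff value of 0.5 are assumptions for illustration; the patch does
not show what cutoff FuzzyTermEnum applies.

```java
// Hypothetical demo class, not part of Lucene.
class FuzzySimilarityDemo {

    /** Same expression as in the patch context: 1 - dist / min(textlen, targetlen). */
    static double similarity(int dist, int textlen, int targetlen) {
        return 1 - ((double) dist / (double) Math.min(textlen, targetlen));
    }

    /** Assumed acceptance rule: keep a term whose similarity exceeds the cutoff. */
    static boolean accepted(double similarity, double cutoff) {
        return similarity > cutoff;
    }

    public static void main(String[] args) {
        double cutoff = 0.5; // assumed value, not taken from the patch

        double mixedCase = similarity(1, 6, 6); // "adagio" vs "Adagio", about 0.83
        double upperCase = similarity(6, 6, 6); // "adagio" vs "ADAGIO", exactly 0.0
        double lowered   = similarity(0, 6, 6); // after lowercasing both, exactly 1.0

        System.out.println(mixedCase + " accepted=" + accepted(mixedCase, cutoff));
        System.out.println(upperCase + " accepted=" + accepted(upperCase, cutoff));
        System.out.println(lowered + " accepted=" + accepted(lowered, cutoff));
    }
}
```

With any positive cutoff, the unpatched comparison rejects "ADAGIO" because
its similarity is 0, while the lowercased comparison scores it as an exact
match.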

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

