lucene-dev mailing list archives

From "Xiaozheng Ma" <Xiaozheng...@redwood.com>
Subject [PATCH]multiple wildcards ? at the end of search pattern return incorrect hits
Date Wed, 10 Nov 2004 16:06:27 GMT

Hi all,

I sent a patch for wildcard search a couple of days ago (it was my
first time sending anything to the list) and have seen no response so
far, so I am not sure whether any of you received it. From what I have
seen over the past two days, you usually respond to issues promptly.

The problem is that a search on "ca??" returns hits such as 'cat' and
'CA', while the user only wants four-letter words that start with CA,
such as 'card' and 'cash'. This happens only when there are multiple
'?' characters at the end of the search pattern. The solution is to
check whether the word being matched against the search pattern ends
while there is still a '?' left in the pattern. If so, the match should
return false.
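
To illustrate the intended behaviour, here is a minimal standalone
sketch. It is not the actual WildcardTermEnum code: the class name
WildcardSketch and the method matches() are made up for this example,
and only the two wildcard constants mirror the names used in Lucene. It
assumes '?' must consume exactly one character and '*' matches zero or
more characters.

```java
// Hypothetical illustration only; not the Lucene implementation.
class WildcardSketch {
    static final char WILDCARD_STRING = '*'; // zero or more characters
    static final char WILDCARD_CHAR = '?';   // exactly one character

    static boolean matches(String pattern, String text) {
        return matches(pattern, 0, text, 0);
    }

    private static boolean matches(String pattern, int p, String text, int t) {
        while (p < pattern.length()) {
            char c = pattern.charAt(p);
            if (c == WILDCARD_STRING) {
                // Let '*' absorb 0..n characters and try the rest of the pattern.
                for (int i = t; i <= text.length(); i++) {
                    if (matches(pattern, p + 1, text, i)) {
                        return true;
                    }
                }
                return false;
            }
            if (t >= text.length()) {
                // The word ended while a '?' (or a literal) is still left
                // in the pattern, so "cat" must not match "ca??".
                return false;
            }
            if (c != WILDCARD_CHAR && c != text.charAt(t)) {
                return false;
            }
            p++;
            t++;
        }
        // The pattern is used up; the word must be used up as well.
        return t == text.length();
    }

    public static void main(String[] args) {
        System.out.println(matches("ca??", "cat"));  // false
        System.out.println(matches("ca??", "card")); // true
        System.out.println(matches("ca??", "ca"));   // false
    }
}
```

The check on the remaining word length is the part that corresponds to
the fix described above: a trailing '?' is never allowed to match
"nothing".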

The patch file is attached and here is the text copy:
------------------------------------------------------------------------
--- WildcardTermEnum.org	2004-05-11 11:42:10.000000000 -0400
+++ WildcardTermEnum.java	2004-11-08 14:35:14.823610500 -0500
@@ -132,6 +132,10 @@
             }
             else
             {
+	      //to prevent "cat" matches "ca??"
+	      if(wildchar == WILDCARD_CHAR){
+		return false;
+	      }	      
               // Look at the next character
               wildcardSearchPos++;
             } 

------------------------------------------------------------------------
Thanks!

Xiaozheng
