lucene-dev mailing list archives

From "Sale, Doug" <ds...@us.britannica.com>
Subject RE: joker * problem
Date Tue, 04 Feb 2003 15:10:20 GMT
this came up on the list a day or two ago...

i believe someone said that wildcard queries of the form
"<partial-term>*" are not run through the analyzer, so the partial term
is never lowercased the way the indexed terms were.  methinks this is
probably because they don't want any stemming to be done on the partial
term...  what you really need in this case is the same analyzer used for
indexing, but without plural or suffix (porter) stemming, and without
stripping the wildcard chars.  or, for a dirty hack, lowercase the query
string before parsing it ("query.toLowerCase()").
anyway, i believe this is a "feature".  interesting problem - anyone?
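
a rough sketch of a slightly less blunt version of that hack, in case it helps: instead of lowercasing the whole query string (which would also lowercase operators such as AND/OR), lowercase only the whitespace-separated tokens that contain a wildcard. the class and method names below are made up for illustration and are not part of Lucene. it assumes the analyzer used at index time lowercases terms (as StandardAnalyzer does), and it does not try to handle quoted phrases or "field:term" prefixes.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
final class WildcardLowercaser {
    private WildcardLowercaser() {}

    // Lowercases only the space-separated tokens that contain '*' or '?',
    // leaving other tokens (including operators such as AND/OR) untouched.
    static String lowercaseWildcardTerms(String query) {
        StringBuilder out = new StringBuilder(query.length());
        // A limit of -1 keeps empty tokens, so the original spacing survives.
        String[] tokens = query.split(" ", -1);
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            String token = tokens[i];
            boolean hasWildcard = token.indexOf('*') >= 0 || token.indexOf('?') >= 0;
            out.append(hasWildcard ? token.toLowerCase(Locale.ROOT) : token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: one* AND Two
        System.out.println(lowercaseWildcardTerms("One* AND Two"));
    }
}
```

the result would then be handed to the parser in place of the raw string, e.g. QueryParser.parse(WildcardLowercaser.lowercaseWildcardTerms("One*"), "txt", new StandardAnalyzer()), so the wildcard term has the same case as the indexed "onetwo".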

-doug

> -----Original Message-----
> From: Ralph Schaer [mailto:ralphschaer@yahoo.com]
> Sent: Tuesday, February 04, 2003 1:33 AM
> To: lucene-dev@jakarta.apache.org
> Subject: joker * problem
> 
> 
> Hello
> I found a problem with the joker * and lower/uppercase search 
> strings. (latest nightly build)
> Here's the index
> IndexWriter writer = new IndexWriter("c:\\temp\\ix", new 
> StandardAnalyzer(), true);
> Document doc = new Document();
> doc.add(Field.UnStored("txt", "Onetwo"));
> doc.add(Field.UnStored("txt", "two three"));
> doc.add(Field.UnIndexed("id", "1"));
> writer.addDocument(doc);
> writer.optimize();
> writer.close();    
> Searcher searcher = new IndexSearcher("c:\\temp\\ix");
> 
> Without the joker I can enter the search string lower or uppercase. 
> Both queries find the document:
> Query query = QueryParser.parse("onetwo", "txt", new 
> StandardAnalyzer());
> Query query = QueryParser.parse("Onetwo", "txt", new 
> StandardAnalyzer());
> 
> But with the joker * the uppercase version does not find the document:
> Query query = QueryParser.parse("one*", "txt", new 
> StandardAnalyzer());  <-- document found
> Query query = QueryParser.parse("One*", "txt", new 
> StandardAnalyzer());  <-- no document found
> 
> Regards
> Ralph
> 
> 
> 
