lucene-java-user mailing list archives

From: Brent Schneeman <bschnee...@yahoo.com>
Subject: Case Sensitive and InSensitive Queries
Date: Sat, 25 Oct 2003 00:59:49 GMT

Hi All:

I've only been looking at Lucene for about a week now. I'm using 1.3
RC2. 

I am searching a moderately sized repository of documents containing
assembly language source code, and I'm trying to implement both
case-sensitive and case-insensitive queries (that's what the users want).

I'm using the WhitespaceAnalyzer to index the files, thereby preserving
the case of the tokens in the index. I'm now trying to query that
index.

The Analyzer used for my QueryParser is either a standard
WhitespaceAnalyzer (for case-sensitive queries) or a
"LowerCaseWhitespaceAnalyzer" (for case-insensitive queries), whose
underlying tokenizer simply extends WhitespaceTokenizer and overrides
normalize() to lowercase each character.
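
To make the two query-side behaviours concrete, here is a rough sketch in
plain Java. It does not use the Lucene classes; the class name
CaseTokenizerSketch and the tokenize method are made up for illustration,
and the boolean flag stands in for the choice between the two analyzers.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for the two analyzers; not Lucene code.
class CaseTokenizerSketch {

    // Splits text on whitespace. When lowerCase is true, every character is
    // lowercased as it is collected, mimicking a per-character normalize step.
    static List<String> tokenize(String text, boolean lowerCase) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(lowerCase ? Character.toLowerCase(c) : c);
            }
        }
        // Flush the last token when the text does not end in whitespace.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "MOV  AX, bx";
        System.out.println(tokenize(line, false)); // case preserved
        System.out.println(tokenize(line, true));  // case folded
    }
}
```

Note that punctuation stays attached to the token ("AX," keeps its comma),
since only whitespace separates tokens.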

I'm also setting the QueryParser's lowercaseWildcardTerms attribute
appropriately.

For case-sensitive searches, I use a standard IndexSearcher, whose
IndexReader returns terms directly from the index (with their case
preserved).

For case-insensitive searches, I believe I need to lowercase the terms
coming from the index as well. Is this what FilterIndexReader is for?
Should I extend it and override its terms(...) methods to return
lowercased terms?
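
Here is a small sketch of what I mean by lowercasing the index terms at
search time. It is plain Java with a TreeMap standing in for the term
dictionary, not the FilterIndexReader API; the names
CaseFoldedLookupSketch, buildIndex, searchExact and searchIgnoreCase are
all hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative stand-in for a case-preserved index; not Lucene code.
class CaseFoldedLookupSketch {

    // Builds a case-preserved term dictionary: term -> ids of documents containing it.
    static Map<String, Set<Integer>> buildIndex(String[] docs) {
        Map<String, Set<Integer>> index = new TreeMap<>();
        for (int id = 0; id < docs.length; id++) {
            for (String token : docs[id].trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // an all-whitespace document yields no terms
                }
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return index;
    }

    // Case-sensitive lookup: the query term must match an index term exactly.
    static Set<Integer> searchExact(Map<String, Set<Integer>> index, String term) {
        return index.getOrDefault(term, new TreeSet<>());
    }

    // Case-insensitive lookup: lowercase both sides while scanning the terms,
    // and merge the postings of every index term that folds to the same string.
    static Set<Integer> searchIgnoreCase(Map<String, Set<Integer>> index, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<String, Set<Integer>> entry : index.entrySet()) {
            if (entry.getKey().toLowerCase(Locale.ROOT).equals(wanted)) {
                hits.addAll(entry.getValue());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] docs = { "MOV AX BX", "mov ax bx", "Mov CX DX" };
        Map<String, Set<Integer>> index = buildIndex(docs);
        System.out.println(searchExact(index, "mov"));
        System.out.println(searchIgnoreCase(index, "MOV"));
    }
}
```

The sketch also shows one wrinkle: several distinct index terms ("MOV",
"mov", "Mov") fold to the same lowercased term, so their document lists
have to be merged rather than just renamed.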

Or am I just on the wrong path altogether?

Help?

Thanks
-Brent Schneeman
