lucene-java-user mailing list archives

From Anna Hunecke <A.Hune...@topdesk.com>
Subject Re: Case insensitive Keyword Analyser
Date Mon, 17 Oct 2011 06:51:02 GMT
Hi Jamir,

You can build the Analyzer you need by chaining a tokenizer with one or more token filters,
so that each filter processes the output of the previous stage. In your case, I would just
write my own Analyzer class like this:

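import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.KeywordTokenizer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.util.Version;

// Emits the whole field value as one lower-cased token.
final class LowerCaseKeywordAnalyzer extends Analyzer {

	@Override
	public TokenStream tokenStream(String fieldName, Reader reader) {
		// KeywordTokenizer keeps the entire input as a single token.
		TokenStream tokenStream = new KeywordTokenizer(reader);
		// LowerCaseFilter then lower-cases that token.
		tokenStream = new LowerCaseFilter(Version.LUCENE_34, tokenStream);
		return tokenStream;
	}
}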
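
To see why this helps, here is a small plain-Java simulation of the matching behaviour.
It is only a sketch and does not use Lucene: the class KeywordMatchSketch and its methods
are made-up names, and wordTerms is a rough stand-in for a word-splitting analyzer such as
StandardAnalyzer, not its real tokenization rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java simulation only; none of these names are Lucene APIs.
final class KeywordMatchSketch {

    // Mimics KeywordTokenizer + LowerCaseFilter:
    // the whole value becomes a single lower-cased term.
    static List<String> keywordLowerCaseTerms(String value) {
        List<String> terms = new ArrayList<String>();
        terms.add(value.toLowerCase(Locale.ROOT));
        return terms;
    }

    // Rough stand-in for a word-splitting analyzer (assumption):
    // one lower-cased term per run of letters or digits.
    static List<String> wordTerms(String value) {
        List<String> terms = new ArrayList<String>();
        for (String word : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                terms.add(word);
            }
        }
        return terms;
    }

    // A wildcard query such as jose* matches a value
    // if any of its indexed terms starts with the prefix.
    static boolean matchesPrefix(List<String> terms, String prefix) {
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String name : Arrays.asList("Jose Sam", "joseph", "S. Jose", "B. jose")) {
            System.out.println(name
                    + " | keyword+lowercase: " + matchesPrefix(keywordLowerCaseTerms(name), "jose")
                    + " | word-split: " + matchesPrefix(wordTerms(name), "jose"));
        }
    }
}
```

With the whole name indexed as one lower-cased term, jose* matches Jose Sam and joseph
but not S. Jose, while the word-splitting variant also matches S. Jose.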

Best,
Anna


-----Original Message-----
From: Jamir Shaikh [mailto:shaikhjamir@gmail.com]
Sent: Saturday, 15 October 2011 02:22
To: java-user@lucene.apache.org
Subject: Case insensitive Keyword Analyser

Hi Guys,

Use Case:
Field: Name
Data:
    Jose
    Jose Sam
    jose
    jose jacob
    jose
    joseph
    josef
    S. Jose
    B. jose
    etc.

There is a field (Name) that I want to index.
I will be searching this field with wildcard queries, e.g. jose*

Search: Jose* (should return all names starting with jose, regardless of case)

Solutions tried:

1. Using Standard Analyser.
Problem with Standard Analyser:
In addition to the correct results, it returns names like S. Jose and B. jose,
which do not start with Jose.

2. Using Keyword Analyser.
Problem with Keyword Analyser:
Keyword Analyser is case sensitive, so it misses names like Jose and Jose Sam.
This happens because a search for Jose* is changed to jose* (all lower case),
while the indexed terms keep their original capitalisation.



So is there an analyser available that takes care of this use case?
What I am looking for is a case-insensitive Keyword Analyser.
Otherwise, let me know if there is another approach to handle this use case.


Thanks,
Jamir

