lucene-solr-user mailing list archives

From Nivedita <>
Subject Solr pattern tokenizer
Date Mon, 02 Feb 2015 09:56:04 GMT

I want to tokenize a query like "CHQ PAID-INWARD TRAN-HDFC LTD" in such a way
that the results include only documents containing HDFC LTD, and not documents containing HDFC MF.

How can I do this?
I have already applied the tokenizers and filters below:

 <fieldType name="text_general" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
                generateNumberParts="1" catenateWords="0" catenateNumbers="0"
                catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
                maxGramSize="25" side="front"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
        <filter class="solr.TrimFilterFactory"/>
      </analyzer>
 </fieldType>

Please help.
