lucene-solr-user mailing list archives

From Vishal Swaroop <vishal....@gmail.com>
Subject Re: Ignore whitesapce, underscore using KeywordTokenizer... EdgeNGramFilter
Date Wed, 21 Jan 2015 21:40:23 GMT
I tried adding *PatternReplaceFilterFactory* to the index analyzer, but it is
not working.

Example itemName data:
- "ABC E12": if the user types "ABCE", the suggestion should be "ABC E12"
- "ABCE_12": if the user types "ABCE1", the suggestion should be "ABCE_12"

<field name="itemName" type="text_general_edge_ngram" indexed="true"
       stored="true" multiValued="false" />

<fieldType name="text_general_edge_ngram" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Newly added filter: removes runs of whitespace -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)"
            replacement="" replace="all" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
            maxGramSize="15" side="front"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>
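
One possible direction, offered as a sketch under assumptions rather than a confirmed fix: the pattern "(\s+)" matches whitespace only, so underscores are left in place, and the query analyzer's LowerCaseTokenizerFactory splits at non-letters, so a query such as "ABCE1" would lose its digit. The variant below assumes a character class that also covers underscores and applies the same normalization at query time with KeywordTokenizerFactory. The field type name text_edge_ngram_normalized is hypothetical, and the pattern "[\s_]+" would need extending if other special characters should be ignored.

```xml
<!-- Hypothetical field type name; not part of the original schema. -->
<fieldType name="text_edge_ngram_normalized" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Assumption: strip underscores as well as whitespace. -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="[\s_]+"
            replacement="" replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
            maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <!-- Same normalization as index time, but without n-grams. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[\s_]+"
            replacement="" replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this chain, both "ABC E12" and "ABCE_12" would be indexed as prefixes of "abce12", while the stored value returned for the suggestion stays unchanged. Any change to the index analyzer only takes effect for documents indexed after the change, so existing data would need to be reindexed.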

On Wed, Jan 21, 2015 at 3:31 PM, Alvaro Cabrerizo <toporniz@gmail.com>
wrote:

> Hi,
>
> Not sure, but I think that the PatternReplaceFilterFactory or
> the PatternReplaceCharFilterFactory could help you delete those
> characters.
>
> Regards.
> On Jan 21, 2015 7:59 PM, "Vishal Swaroop" <vishal.rec@gmail.com> wrote:
>
> > I am trying to implement type-ahead suggestions for a single field, which
> > should ignore whitespace, underscores, or special characters in
> > autosuggest.
> >
> > It works as suggested by Alex using KeywordTokenizerFactory, but how do I
> > ignore whitespace, underscores...
> >
> > Example itemName data can be :
> > "ABC E12" : if user types "ABCE" suggestion should be "ABC E12"
> > "ABCE_12" : if user types "ABCE1" suggestion should be "ABCE_12"
> >
> > Schema.xml
> > <field name="itemName" type="text_general_edge_ngram" indexed="true"
> >        stored="true" multiValued="false" />
> >
> > <fieldType name="text_general_edge_ngram" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
> >             maxGramSize="15" side="front"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
> >   </analyzer>
> > </fieldType>
> >
>
