lucene-solr-user mailing list archives

From Steve Rowe <sar...@gmail.com>
Subject Re: How to use the StandardTokenizer with currency
Date Wed, 30 Nov 2016 20:08:37 GMT
Hi Vinay,

You should be able to use a char filter to convert “$” characters into something that
will survive tokenization, and then a token filter to convert it back.

Something like this (untested):

  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\$"
                replacement="__dollar__"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="__dollar__"
            replacement="\$"
            replace="all"/>
  </analyzer>
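
To illustrate the round trip, here is a rough Python sketch of the same three stages. It is not Solr code: the regex in `tokenize` is only a crude stand-in for the StandardTokenizer (an assumption made for illustration), and the function names are hypothetical.

```python
import re

PLACEHOLDER = "__dollar__"  # same placeholder as in the analyzer config above


def char_filter(text):
    # Stage 1 (char filter): rewrite "$" before tokenization sees it.
    return text.replace("$", PLACEHOLDER)


def tokenize(text):
    # Stage 2 (tokenizer): crude approximation only, NOT the real
    # StandardTokenizer. Keeps runs of word characters (letters, digits,
    # underscore) and allows "." or "," between digit groups; everything
    # else, including "$", is dropped.
    return re.findall(r"\w+(?:[.,]\d+)*", text)


def token_filter(tokens):
    # Stage 3 (token filter): restore "$" inside each surviving token.
    return [t.replace(PLACEHOLDER, "$") for t in tokens]


def analyze(text):
    return token_filter(tokenize(char_filter(text)))


if __name__ == "__main__":
    print(tokenize("Price is $100.50 today"))  # "$" is lost here
    print(analyze("Price is $100.50 today"))   # "$" survives the round trip
```

The placeholder is made of letters and underscores so that it stays attached to the digits that follow it; whether the real tokenizer keeps it attached in every case is something to check in the Solr admin Analysis screen.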

--
Steve
www.lucidworks.com

> On Nov 30, 2016, at 1:58 PM, Vinay B, <vybe3142@gmail.com> wrote:
> 
> Prior discussion at
> http://stackoverflow.com/questions/40877567/using-standardtokenizerfactory-with-currency
> 
> I'd like to keep the other aspects of the StandardTokenizer's behavior,
> but I'm wondering whether what I want boils down to being able to
> instruct the StandardTokenizer not to discard the $ symbol. Or is there
> another way? I'm hoping that this is possible with configuration rather
> than code changes.
> 
> Thanks

