lucene-general mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: charFilter
Date Thu, 13 Sep 2012 15:15:39 GMT
On Thu, Sep 13, 2012 at 6:43 AM, Osullivan L. <L.Osullivan@swansea.ac.uk> wrote:
>
> In my schema I have:
>
>     <fieldType name="LCNormalized" class="solr.TextField" sortMissingLast="true" omitNorms="true">
>         <analyzer>
>           <charFilter class="com.test.solr.analysis.LukesTestCharFilterFactory"/>
>           <tokenizer class="solr.KeywordTokenizerFactory"/>
>         </analyzer>
>     </fieldType>
>

The main use of a CharFilter is to alter the text before the tokenizer
runs at all: because the tokenizer only ever sees the altered text, you
can use this to change how the text gets tokenized.

In your example the only tokenizer is KeywordTokenizer, which emits the
whole input as a single token, so there is no tokenization to influence
and I don't think a CharFilter is the easiest way to do what you want.
I think you should instead use a TokenFilter that does your
transformation, and put it after KeywordTokenizer.

This should be significantly easier to write, as you don't need to deal
with offset corrections or any of that. You just change the term text.
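
As a rough sketch, the analyzer might then look something like the
following. The class name LukesTestFilterFactory is hypothetical: it
stands for whatever TokenFilterFactory you write to wrap your
TokenFilter, and the rest is copied from your schema.

    <fieldType name="LCNormalized" class="solr.TextField" sortMissingLast="true" omitNorms="true">
        <analyzer>
          <tokenizer class="solr.KeywordTokenizerFactory"/>
          <!-- hypothetical factory; runs after the tokenizer, on the single keyword token -->
          <filter class="com.test.solr.analysis.LukesTestFilterFactory"/>
        </analyzer>
    </fieldType>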

-- 
lucidworks.com
