lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: charfilter doesn't do anything
Date Thu, 05 Sep 2013 18:41:53 GMT
On 9/5/2013 10:03 AM, Andreas Owen wrote:
> I would like to filter/replace a word during indexing, but it doesn't do
> anything and I don't get an error.
> 
> in schema.xml i have the following:
> 
> <field name="text_html" type="text_cutHtml" indexed="true" stored="true" multiValued="true"/>
> 
> <fieldType name="text_cutHtml" class="solr.TextField">
> 	<analyzer>
> 	  <!--  <tokenizer class="solr.StandardTokenizerFactory"/> -->
> 	  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="Zahlungsverkehr" replacement="ASDFGHJK" />
> 	  <tokenizer class="solr.KeywordTokenizerFactory"/>
> 	</analyzer>
>    </fieldType>
> 
> My second question: where can I specify that the expression is multiline,
> the way JavaScript lets me put /m at the end of the pattern?

I don't know about your second question.  I don't know if that will be
possible, but I'll leave that to someone who's more expert than I.
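
One thing that may be worth testing, offered as an untested sketch: if the
char filter compiles its pattern with java.util.regex (an assumption here),
then Java's embedded flags could stand in for JavaScript's trailing /m.
The flag goes at the start of the pattern, for example (?m) for multiline
anchors or (?s) to let . match line breaks. The anchored pattern below is
only an illustration, not something taken from the original schema.

```xml
<!-- Sketch: assumes the pattern attribute accepts Java regex embedded flags. -->
<!-- (?m) makes ^ and $ match at line boundaries, like /m in JavaScript. -->
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="(?m)^Zahlungsverkehr$"
            replacement="ASDFGHJK"/>
```

The analysis screen should show whether the flag is honored on your version.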

As for the first question, here's what I have.  Did you reindex?  That
will be required, because changes to index-time analysis only apply to
documents indexed after the change.

http://wiki.apache.org/solr/HowToReindex

Assuming that you did reindex, are you trying to search for ASDFGHJK in
a field that contains more than just "Zahlungsverkehr"?  The keyword
tokenizer might not do what you expect - it tokenizes the entire input
string as a single token, which means that you won't be able to search
for single words in a multi-word field without wildcards, which are
pretty slow.
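
If single-word searches are what you want, a minimal sketch of the
alternative would keep your char filter but switch to a tokenizer that
splits the text into words. The type name text_cutHtml_words is
hypothetical, and the standard tokenizer is just one possible choice.

```xml
<!-- Sketch only: text_cutHtml_words is a made-up name for illustration. -->
<fieldType name="text_cutHtml_words" class="solr.TextField">
  <analyzer>
    <!-- Char filters run on the raw text before the tokenizer sees it. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="Zahlungsverkehr"
                replacement="ASDFGHJK"/>
    <!-- Splits the input into word tokens instead of one large token. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
```

With a type like this, a query for ASDFGHJK should match a single token
rather than needing the whole field value or a wildcard.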

Note that both the pattern and replacement are case sensitive.  This is
how regex works.  You haven't used a lowercase filter, which means that
you won't be able to search for asdfghjk.
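
A sketch of how a lowercase filter could be added to your existing type,
assuming you want case-insensitive matching on the tokens. The char filter
itself stays case sensitive, so the pattern still has to match the input
text exactly as written.

```xml
<fieldType name="text_cutHtml" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="Zahlungsverkehr"
                replacement="ASDFGHJK"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Token filters run after the tokenizer; this lowercases each token. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because there is only one analyzer element, the same chain runs at query
time, so both asdfghjk and ASDFGHJK would be lowercased before matching.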

Use the analysis tab in the UI on your core to see what Solr does to
your field text.

Thanks,
Shawn

