lucene-solr-user mailing list archives

Subject StopFilterFactory attribute format in schema.xml
Date Sun, 09 Sep 2012 18:41:53 GMT

What is the effect of the format attribute for StopFilterFactory, e.g. format="snowball"?

Solr ships with a schema.xml that contains a lot of good examples. The file is in example/solr/conf/schema.xml
and defines a <fieldType> for German text:
  <!-- German -->
  <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/>
      <filter class="solr.GermanNormalizationFilterFactory"/>
      <filter class="solr.GermanLightStemFilterFactory"/>
      <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> -->
      <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> -->
    </analyzer>
  </fieldType>

The StopFilterFactory is configured with format="snowball". What is this attribute for?

I grabbed the Solr 4.0-BETA source with Maven and had a look at the StopFilter and StopFilterFactory classes,
but I could not find the format attribute being handled anywhere. Am I missing something here?
