lucene-solr-user mailing list archives

From sy...@web.de
Subject StopFilterFactory attribute format in schema.xml
Date Sun, 09 Sep 2012 18:41:53 GMT
Hi,

what is the effect of the format attribute for StopFilterFactory? E.g. format="snowball"?

Solr ships with a schema.xml that contains a lot of good examples. The file is in example/solr/conf/schema.xml
and defines a <fieldType> for German text:
  <!-- German -->
  <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer> 
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/>
      <filter class="solr.GermanNormalizationFilterFactory"/>
      <filter class="solr.GermanLightStemFilterFactory"/>
      <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> -->
      <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> -->
    </analyzer>
  </fieldType>
The StopFilterFactory is configured with format="snowball". What is this attribute good for?
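
As far as I can tell, the attribute may describe the layout of the words file rather than any stemming behaviour. This is only an assumption on my side: stop lists published by the Snowball project use a vertical bar to start a comment and may put several whitespace-separated words on one line. An illustrative fragment in that layout (made up for this mail, not copied from lang/stopwords_de.txt) would look like this:

   | illustrative fragment, assuming the Snowball stop list layout
   | everything after a vertical bar is a comment

  aber           |  but
  alle allem     |  two words on one line

A plain one-word-per-line reader would take each full line as a single entry, so the two layouts would need different parsing.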

I grabbed the Solr 4.0-BETA source with Maven and had a look at the classes StopFilter and StopFilterFactory:
  <dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr</artifactId>
    <version>4.0.0-BETA</version>
    <type>java-source</type>
  </dependency>
But I cannot find any place in these classes where a format attribute is handled. Am I missing something here?
