lucene-solr-user mailing list archives

From "Yonik Seeley" <yo...@apache.org>
Subject Re: How configure SnowballAnalyzer to language Spanish
Date Tue, 28 Nov 2006 18:10:42 GMT
Hi Iris,

An "Analyzer" is just a tokenizer followed by a series of token filters.
Stick with the TextField that you defined below and you should be fine.
I'm not sure how the Spanish stemmer works, or whether it expects to
see accented characters. If it does, you may want to move
ISOLatin1AccentFilterFactory after the stemmer.
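
As a rough sketch of that reordering, assuming the Spanish stemmer does
expect accented input (not verified here), the field type from the quoted
message could look like the following. The factory names and attributes
are the ones from the original schema; only the position of
ISOLatin1AccentFilterFactory changes.

```xml
<!-- Sketch only: same filters as the original text_es type, with the
     accent filter moved after the stemmer. Assumes the Spanish stemmer
     wants to see accented characters. -->
<fieldtype name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stem first, while accents are still present -->
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
    <!-- Then fold remaining accented characters to unaccented forms -->
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldtype>
```

Since a single <analyzer> element with no type attribute is applied at
both index and query time, accents are stripped on both sides, so an
accented query term and its unaccented form should reduce to the same
indexed term.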

-Yonik

On 11/27/06, Iris Soto <msoto@agssa.net> wrote:
> Hello,
>
> I am trying to configure Solr to index a Spanish site and I am hitting
> some problems.
> I have a basic install running on Tomcat.
>
> In my schema.xml file I have the following:
>
> <fieldtype name="text_es" class="solr.TextField" positionIncrementGap="100">
>     <analyzer>
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.ISOLatin1AccentFilterFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
>     </analyzer>
> </fieldtype>
>
> The Solr wiki mentions the class
> org.apache.lucene.analysis.snowball.SnowballAnalyzer. How can I specify
> the language when using it like this?
> <analyzer class="org.apache.lucene.analysis.snowball.SnowballAnalyzer">
>
> I want ISOLatin1AccentFilterFactory to strip accents from characters
> such as á, é, ñ..., but this doesn't work for queries, which should
> still find words that contain those accented forms.
> Is this configuration correct? How can I configure the analyzer for
> Spanish?
>
> Thanks & Regards,
>
>
> --
> Iris Soto
