lucene-solr-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Greek and English text into the same field
Date Fri, 18 Mar 2011 11:24:11 GMT
On Thu, Mar 17, 2011 at 9:18 PM, abiratsis <abiratsis@gmail.com> wrote:
>
> Basically I don't know what the best approach is for handling a multilingual
> case like mine, e.g. should I create a separate index for each language?
>

In this particular case (Greek, English), the two languages use totally
distinct characters, so their terms will never conflate with each other,
their stemmers will never mess with the other language's text, etc.

So I would:
a. switch from LowerCaseFilter to GreekLowerCaseFilter. It lowercases
English the same way, don't worry.
b. add a Greek stopwords file to your StopFilter. StopFilterFactory can
take multiple file arguments; just separate them with a comma.
c. add the Greek stemmer right after the Porter stemmer.

Then your field works fine for both Greek and English.
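
As a rough sketch, steps a-c might look like this in schema.xml. The field
type name, the tokenizer, and the stopword file names are placeholders, not
taken from the original configuration, and this assumes a Solr release that
ships GreekStemFilterFactory:

```xml
<!-- Hypothetical field type: the name and stopword file names are placeholders. -->
<fieldType name="text_el_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Tokenizer is an assumption; keep whatever your field already uses. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- (a) used in place of solr.LowerCaseFilterFactory -->
    <filter class="solr.GreekLowerCaseFilterFactory"/>
    <!-- (b) English and Greek stopword files, separated by a comma -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt,stopwords_el.txt"/>
    <!-- (c) Greek stemmer placed right after the Porter stemmer -->
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.GreekStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The Greek stemmer expects input that has already been through
GreekLowerCaseFilter, so the lowercase filter should stay ahead of it in the
chain.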
