lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: Analyzer enquiry
Date Mon, 14 Mar 2011 12:44:27 GMT
I don't understand what you're saying here. If you put a stemmer in the
constructor, you *are* using it. If you don't specify any stemmer at all, you
still have to define different analyzers to use different stop word lists.

Can you restate your question?

Best
Erick

On Mon, Mar 14, 2011 at 8:21 AM, Vasiliki Gkouta <vgkouta@csd.auth.gr> wrote:
> Thanks a lot for your help Erick! About the fields you mentioned: If I don't
> use stemmers, except for the constructor argument related to the stop words,
> is there anything else that I have to modify?
>
> Thanks,
> Vicky
>
>
> Quoting Erick Erickson <erickerickson@gmail.com>:
>
>> StandardAnalyzer works well for most European languages. The problem will
>> be stemming. Applying stemming via English rules to non-English languages
>> produces...er...interesting results.
>>
>> You can go ahead and create language-specific fields for each language and
>> use StandardAnalyzer with the appropriate stopwords and stemming with
>> each,
>> this is a common approach.. The Snowball stemmer takes a language
>> parameter...
>>
>> You need to use specific analyzers for Chinese Japanese Korean (CJK)
>> documents
>> though.
>>
>> Hope that helps
>> Erick
>>
>> On Sun, Mar 13, 2011 at 7:23 PM, Vasiliki Gkouta <vgkouta@csd.auth.gr>
>> wrote:
>>>
>>> Hello everybody,
>>>
>>> I have an enquiry about StandardAnalyzer. Can I use it for other
>>> languages
>>> except from English? I give the right list of stop words at
>>> initialization.
>>> Is there anything else inside the class that is by default set in
>>> English?
>>> I've found the Analyzers for other languages too but they where seem to
>>> be
>>> deprecated.. Moreover I use english and other languages, all together in
>>> my
>>> project so I would like to ask if there is a way to use either the same
>>> class analyzer for all of them, or analyzers of the same functionality
>>> for
>>> all the languages. Thanks in advance!
>>>
>>> Best regards,
>>> Vicky
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message