lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Analyzer enquiry
Date Mon, 14 Mar 2011 18:08:21 GMT
Nope, that should do it.

Best
Erick
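
The setup confirmed above (one analyzer per language, each built with its own stop word set and no stemming) can be illustrated with a minimal plain-Java sketch. It deliberately does not use the Lucene API: the class `StopWordSketch`, the method `analyze`, and the tiny stop word lists are hypothetical stand-ins, and the tokenization is a simple split on non-letter, non-digit characters rather than StandardAnalyzer's actual tokenizer. It only shows why the stop word set is the one language-dependent input when no stemmer is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class StopWordSketch {
    // Hypothetical, heavily shortened stop word lists keyed by language code.
    static final Map<String, Set<String>> STOP_WORDS = Map.of(
        "en", Set.of("the", "and", "is", "a", "of"),
        "de", Set.of("der", "die", "das", "und", "ist"));

    // Lowercases the text, splits it on anything that is not a letter or digit,
    // and drops tokens found in the stop word set chosen for the language.
    static List<String> analyze(String language, String text) {
        Set<String> stopWords = STOP_WORDS.getOrDefault(language, Set.of());
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("en", "The cat and the dog"));    // [cat, dog]
        System.out.println(analyze("de", "Der Hund und die Katze")); // [hund, katze]
    }
}
```

Running German text through the "en" entry leaves "der", "und", and "die" in the output, which is the reason for keeping a separate analyzer (and typically a separate field) per language.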

On Mon, Mar 14, 2011 at 9:35 AM, Vasiliki Gkouta <vgkouta@csd.auth.gr> wrote:
> Sorry for the confusion. I have two analyzers (both StandardAnalyzer) and use
> no stemmers. I passed a German stop word set to the constructor of one
> analyzer and an English stop word set to the other. My question was whether
> I have to call any other function on the German analyzer for it to be
> correct.
>
> Thank you.
>
>
> Quoting Erick Erickson <erickerickson@gmail.com>:
>
>> I don't understand what you're saying here. If you put a stemmer in the
>> constructor, you *are* using it. If you don't specify any stemmer at all,
>> you still have to define different analyzers to use different stop word
>> lists.
>>
>> Can you restate your question?
>>
>> Best
>> Erick
>>
>> On Mon, Mar 14, 2011 at 8:21 AM, Vasiliki Gkouta <vgkouta@csd.auth.gr> wrote:
>>>
>>> Thanks a lot for your help Erick! About the fields you mentioned: if I
>>> don't use stemmers, is there anything else that I have to modify apart
>>> from the constructor argument for the stop words?
>>>
>>> Thanks,
>>> Vicky
>>>
>>>
>>> Quoting Erick Erickson <erickerickson@gmail.com>:
>>>
>>>> StandardAnalyzer works well for most European languages. The problem
>>>> will be stemming. Applying stemming via English rules to non-English
>>>> languages produces...er...interesting results.
>>>>
>>>> You can go ahead and create language-specific fields for each language
>>>> and use StandardAnalyzer with the appropriate stopwords and stemming
>>>> with each; this is a common approach. The Snowball stemmer takes a
>>>> language parameter...
>>>>
>>>> You need to use specific analyzers for Chinese, Japanese, and Korean
>>>> (CJK) documents though.
>>>>
>>>> Hope that helps
>>>> Erick
>>>>
>>>> On Sun, Mar 13, 2011 at 7:23 PM, Vasiliki Gkouta <vgkouta@csd.auth.gr> wrote:
>>>>>
>>>>> Hello everybody,
>>>>>
>>>>> I have an enquiry about StandardAnalyzer. Can I use it for languages
>>>>> other than English? I give the right list of stop words at
>>>>> initialization. Is there anything else inside the class that is set to
>>>>> English by default? I've found the Analyzers for other languages too,
>>>>> but they seem to be deprecated. Moreover, I use English and other
>>>>> languages all together in my project, so I would like to ask if there
>>>>> is a way to use either the same analyzer class for all of them, or
>>>>> analyzers with the same functionality for all the languages. Thanks in
>>>>> advance!
>>>>>
>>>>> Best regards,
>>>>> Vicky
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org