lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vasiliki Gkouta <vgko...@csd.auth.gr>
Subject Re: Analyzer enquiry
Date Mon, 14 Mar 2011 22:33:26 GMT
Thank you for your help!

Best Regards,
Vicky

Quoting Erick Erickson <erickerickson@gmail.com>:

> Nope, that should do it.
>
> Best
> Erick
>
> On Mon, Mar 14, 2011 at 9:35 AM, Vasiliki Gkouta <vgkouta@csd.auth.gr> wrote:
>> Sorry for the confusion. I have two analyzers(of StandardAnalyzer) and use
>> no stemmers. At the one analyzer I passed a german stop words set to the
>> constructor and at the other one I passed an english stop words set. My
>> question was if I have to call any other function of the german analyzer for
>> it to be corrent.
>>
>> Thank you.
>>
>>
>> Quoting Erick Erickson <erickerickson@gmail.com>:
>>
>>> I don't understand what you're saying here. If you put a stemmer in the
>>> constructor, you *are* using it. If you don't specify any stemmer at all,
>>> you
>>> still have to define different analyzers to use different stop word lists.
>>>
>>> Can you restate your question?
>>>
>>> Best
>>> Erick
>>>
>>> On Mon, Mar 14, 2011 at 8:21 AM, Vasiliki Gkouta <vgkouta@csd.auth.gr>
>>> wrote:
>>>>
>>>> Thanks a lot for your help Erick! About the fields you mentioned: If I
>>>> don't
>>>> use stemmers, except for the constructor argument related to the stop
>>>> words,
>>>> is there anything else that I have to modify?
>>>>
>>>> Thanks,
>>>> Vicky
>>>>
>>>>
>>>> Quoting Erick Erickson <erickerickson@gmail.com>:
>>>>
>>>>> StandardAnalyzer works well for most European languages. The problem
>>>>> will
>>>>> be stemming. Applying stemming via English rules to non-English
>>>>> languages
>>>>> produces...er...interesting results.
>>>>>
>>>>> You can go ahead and create language-specific fields for each language
>>>>> and
>>>>> use StandardAnalyzer with the appropriate stopwords and stemming with
>>>>> each,
>>>>> this is a common approach.. The Snowball stemmer takes a language
>>>>> parameter...
>>>>>
>>>>> You need to use specific analyzers for Chinese Japanese Korean (CJK)
>>>>> documents
>>>>> though.
>>>>>
>>>>> Hope that helps
>>>>> Erick
>>>>>
>>>>> On Sun, Mar 13, 2011 at 7:23 PM, Vasiliki Gkouta <vgkouta@csd.auth.gr>
>>>>> wrote:
>>>>>>
>>>>>> Hello everybody,
>>>>>>
>>>>>> I have an enquiry about StandardAnalyzer. Can I use it for other
>>>>>> languages
>>>>>> except from English? I give the right list of stop words at
>>>>>> initialization.
>>>>>> Is there anything else inside the class that is by default set in
>>>>>> English?
>>>>>> I've found the Analyzers for other languages too but they where seem
to
>>>>>> be
>>>>>> deprecated.. Moreover I use english and other languages, all together
>>>>>> in
>>>>>> my
>>>>>> project so I would like to ask if there is a way to use either the
same
>>>>>> class analyzer for all of them, or analyzers of the same functionality
>>>>>> for
>>>>>> all the languages. Thanks in advance!
>>>>>>
>>>>>> Best regards,
>>>>>> Vicky
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message