lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: arabic analyzer
Date Mon, 03 Aug 2009 16:39:54 GMT
Walid, thanks for your feedback.

FYI, I created an issue with some minor improvements (such as handling
the lam-lam prefix) to the Arabic analyzer:
http://issues.apache.org/jira/browse/LUCENE-1758

I also tried to improve the stopword list, but your Arabic is surely
much better than mine. If you are interested, have a look; perhaps you
could double-check it :)
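
For anyone following the thread, here is a rough sketch of what handling
a lam-lam prefix in a light stemmer can look like. This is not the code
from LUCENE-1758 and it does not use any Lucene classes: the class name
ArabicPrefixSketch, the prefix list, and the minimum stem length are all
assumptions made for illustration.

```java
class ArabicPrefixSketch {
    // Hypothetical prefix list, longer prefixes first so they match before shorter ones.
    private static final String[] PREFIXES = {
        "\u0648\u0627\u0644", // وال (wa + al)
        "\u0628\u0627\u0644", // بال (bi + al)
        "\u0643\u0627\u0644", // كال (ka + al)
        "\u0641\u0627\u0644", // فال (fa + al)
        "\u0627\u0644",       // ال  (definite article)
        "\u0644\u0644"        // لل  (lam-lam: li + al)
    };

    // Assumed guard: never strip a prefix if fewer than 2 letters would remain.
    private static final int MIN_STEM_LENGTH = 2;

    /** Strips at most one prefix from the word, or returns it unchanged. */
    public static String stripPrefix(String word) {
        for (String prefix : PREFIXES) {
            int remaining = word.length() - prefix.length();
            if (word.startsWith(prefix) && remaining >= MIN_STEM_LENGTH) {
                return word.substring(prefix.length());
            }
        }
        return word;
    }

    public static void main(String[] args) {
        String withArticle = "\u0627\u0644\u0643\u062A\u0627\u0628"; // الكتاب
        String withLamLam = "\u0644\u0644\u0643\u062A\u0627\u0628";  // للكتاب
        // With the lam-lam entry in the list, both forms reduce to the same stem.
        System.out.println(stripPrefix(withArticle).equals(stripPrefix(withLamLam)));
    }
}
```

Without the lam-lam entry, the second form would keep its prefix and the
two words would be indexed as different terms.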


On Mon, Aug 3, 2009 at 12:05 PM, walid <walid.bakkar@elementn.com> wrote:
> Hello Robert,
>
> You are so right: plurals formed with prefixes and suffixes are working.
> Plurals formed by inserting a "و" are not (باب and أبواب).
>
> The few words I had tested were all of the "insert" type and not the
> prefix/suffix type.
>
> thank you :)
>
> -walid
>
> On Sun, 2009-08-02 at 15:08 -0400, Robert Muir wrote:
>> > the fact is, plural (as an example) is not supported, and that is one of
>> > the most common things that a person doing some search will expect to
>>
>> Walid, I'm not sure this is true. Many plurals are supported
>> (though certainly not exceptional cases or broken plurals).
>> This is no different from the other language analyzers in Lucene, even
>> the English stemmers: the most common forms are grouped together, and
>> that's about all you can say :)
>>
>> Maybe in the future we can improve it, though. For your particular
>> concern, we could add simple dictionary mappings for at least the most
>> common broken plurals, something like that.
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
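
As a rough illustration of the dictionary-mapping idea quoted above,
something like the following could run before the light stemmer. It is
only a sketch and not part of the Arabic analyzer: the class name
BrokenPluralSketch, the two dictionary entries, and the suffix list are
assumptions, and it assumes the lookup sees the token before any hamza
normalization.

```java
import java.util.HashMap;
import java.util.Map;

class BrokenPluralSketch {
    // Hypothetical hand-built table: broken plural -> singular.
    private static final Map<String, String> BROKEN_PLURALS = new HashMap<>();
    static {
        BROKEN_PLURALS.put("\u0623\u0628\u0648\u0627\u0628", "\u0628\u0627\u0628"); // أبواب -> باب
        BROKEN_PLURALS.put("\u0631\u062C\u0627\u0644", "\u0631\u062C\u0644");       // رجال -> رجل
    }

    // Assumed regular plural suffixes, handled by plain suffix stripping.
    private static final String[] PLURAL_SUFFIXES = {
        "\u0627\u062A", // ات
        "\u0648\u0646", // ون
        "\u064A\u0646"  // ين
    };

    /** Dictionary lookup first; otherwise fall back to stripping one suffix. */
    public static String normalize(String word) {
        String singular = BROKEN_PLURALS.get(word);
        if (singular != null) {
            return singular;
        }
        for (String suffix : PLURAL_SUFFIXES) {
            int remaining = word.length() - suffix.length();
            // Keep at least 2 letters so short words are not destroyed.
            if (word.endsWith(suffix) && remaining >= 2) {
                return word.substring(0, remaining);
            }
        }
        return word;
    }

    public static void main(String[] args) {
        String doors = "\u0623\u0628\u0648\u0627\u0628"; // أبواب
        String door = "\u0628\u0627\u0628";              // باب
        // The broken plural and its singular now map to the same term.
        System.out.println(normalize(doors).equals(normalize(door)));
    }
}
```

The table only covers what someone has put into it, so it would need to
be limited to the most common broken plurals, as suggested above.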



-- 
Robert Muir
rcmuir@gmail.com


