lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Question wrt Lucene analyzer for different language
Date Thu, 14 May 2009 15:30:02 GMT
> Thanks for the quick answer. :-)
> 
> So  can I say, for ArabicAnalyzer, generally it can tokenize the mixed
> content with Arabic and English? :-)
> 
> I am not really familiar with Arabic language. What do you mean for
> "change
> Arabic tokens"? Does Arabic has something like upper/lower case as English
> does?

For the arabic anayzer this works, because you can detect the "language"
easy from the used characters. But then it stems only the Arabic part. The
English one is simply untouched.

But a analyzer that should automatically detect English, French, German on
the index and search side (see my email before) is almost impossible to
create.

Uwe


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message