lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: How international languages are supported in Lucene
Date Thu, 05 Jun 2008 15:52:57 GMT
Hi Michael,

That's a pretty open ended question and, I'm assuming, by  
"international languages" you mean non-English :-).  You might get  
some mileage out of http://wiki.apache.org/lucene-java/IndexingOtherLanguages 
   but it is a bit out of date (namely the sandbox references).   
Lucene indexes non-English languages just like it does English.  You  
need to figure out what Analyzer you need (have a look in the contrib/ 
Analyzers code/javadocs for many existing languages) and then pretty  
much everything else is the same.  Namely, the same principals apply  
(what to store, index, etc.), as they do in English.

Did you have something specific in mind?  i.e. how to handle Chinese  
or some specific language?  Lastly, if you do have a language in mind,  
try searching the mail archives for the name of that language.

HTH,
Grant

On Jun 5, 2008, at 11:32 AM, Michael Siu wrote:

> Would someone tell me how Lucene supports indexing and searching  
> documents
> that contain international languages? What do I need to do in  
> additions to
> using the StandardAnalyzer?
>
>
>
> Thanks.
>
>
>
>
>
>
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ








---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message