lucene-java-user mailing list archives

From Teruhiko Kurosaka <K...@basistech.com>
Subject RE: Language Detection for Analysis?
Date Mon, 10 Aug 2009 18:59:46 GMT
A shameless self-promotion:
http://basistech.com/language-identification/
No, it's not free. Sorry.

We have Lucene-compatible Tokenizers for those languages too:
http://basistech.com/lucene/How-to-build-a-multilingual-search-engine.pdf

Contact me if you have questions.
-kuro  

> -----Original Message-----
> From: Bradford Stephens [mailto:bradfordstephens@gmail.com] 
> Sent: Thursday, August 06, 2009 12:46 PM
> To: solr-user@lucene.apache.org; java-user@lucene.apache.org
> Subject: Language Detection for Analysis?
> 
> Hey there,
> 
> We're trying to add foreign language support into our new 
> search engine -- languages like Arabic, Farsi, and Urdu (that 
> don't work with standard analyzers). But our data source 
> doesn't tell us which languages we're actually collecting -- 
> we just get blocks of text. Has anyone here worked on 
> language detection so we can figure out what analyzers to 
> use? Are there commercial solutions?
> 
> Much appreciated!


