lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Language Detection for Analysis?
Date Fri, 07 Aug 2009 03:51:48 GMT
Bradford,

If I may:

Have a look at http://www.sematext.com/products/language-identifier/index.html
And/or http://www.sematext.com/products/multilingual-indexer/index.html

 Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Bradford Stephens <bradfordstephens@gmail.com>
> To: solr-user@lucene.apache.org; java-user@lucene.apache.org
> Sent: Thursday, August 6, 2009 3:46:21 PM
> Subject: Language Detection for Analysis?
> 
> Hey there,
> 
> We're trying to add foreign language support into our new search
> engine -- languages like Arabic, Farsi, and Urdu (that don't work with
> standard analyzers). But our data source doesn't tell us which
> languages we're actually collecting -- we just get blocks of text. Has
> anyone here worked on language detection so we can figure out what
> analyzers to use? Are there commercial solutions?
> 
> Much appreciated!
> 
> -- 
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
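
For readers who want a feel for the problem raised in the question, here is a minimal sketch of a character-based guesser for the three languages mentioned (Arabic, Farsi, Urdu). It is not how the products linked above work, and it is not a Lucene API. The class name ScriptLanguageGuesser, the method guess, and the two letter sets are assumptions made for illustration. The heuristic relies only on the fact that Farsi and Urdu add letters to the basic Arabic alphabet; real text (loanwords, names, short snippets) will defeat it, so statistical detectors are the usual choice in practice.

```java
import java.util.Set;

// Hypothetical helper, not part of Lucene: guesses among Arabic, Farsi and Urdu
// by looking for letters that the basic Arabic alphabet does not use.
public class ScriptLanguageGuesser {

    // Assumption: these letters are treated as Urdu markers (tteh, ddal, rreh,
    // noon ghunna, yeh barree). Other languages also use some of them.
    private static final Set<Integer> URDU_MARKERS =
            Set.of(0x0679, 0x0688, 0x0691, 0x06BA, 0x06D2);

    // Assumption: these letters (peh, tcheh, jeh, gaf) appear in Farsi and Urdu
    // but not in standard Arabic.
    private static final Set<Integer> PERSIAN_MARKERS =
            Set.of(0x067E, 0x0686, 0x0698, 0x06AF);

    // Returns "ur", "fa", "ar", or "unknown" when no Arabic-script letters are found.
    public static String guess(String text) {
        int arabicScript = 0;
        int urdu = 0;
        int persian = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.UnicodeScript.of(cp) != Character.UnicodeScript.ARABIC) {
                continue; // skip Latin text, digits, punctuation, whitespace
            }
            arabicScript++;
            if (URDU_MARKERS.contains(cp)) {
                urdu++;
            } else if (PERSIAN_MARKERS.contains(cp)) {
                persian++;
            }
        }
        if (arabicScript == 0) {
            return "unknown";
        }
        if (urdu > 0) {
            return "ur"; // checked first, because Urdu also uses the Persian letters
        }
        if (persian > 0) {
            return "fa";
        }
        return "ar";
    }
}
```

The returned code could then be used to pick an analyzer per document. Checking the Urdu markers before the Persian ones matters because Urdu text normally contains both sets.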



