lucene-solr-user mailing list archives

From "Allison, Timothy B." <>
Subject RE: Automatic Language Identification
Date Fri, 01 Jul 2016 12:33:24 GMT
+1 to langdetect

In Tika 2.0, we're going to remove our own language detection code and allow users to select
Optimaize (a fork of langdetect), MIT Lincoln Lab’s Text.jl library, or Yalder.
The first two are now available in Tika 1.13.

-----Original Message-----
From: Markus Jelsma [] 
Sent: Wednesday, June 22, 2016 8:27 AM
To:; solr-user <>
Subject: RE: Automatic Language Identification


I recommend using the langdetect language detector; it supports many more languages and has
much higher precision than Tika's detector.
