uima-user mailing list archives

From Niels Ott <n...@sfs.uni-tuebingen.de>
Subject Re: Language recognition
Date Mon, 08 Dec 2008 11:31:27 GMT
Torsten Zesch wrote:
> you could use TextCat
> http://odur.let.rug.nl/~vannoord/TextCat/

This works quite well, but it is a bit slow.

If you simply want to know whether a document is written in a given 
language or not, the laziest way is to use a spell checker and compute 
the percentage of "correctly spelled" words.
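
A rough sketch of that spell-checker idea in Python, in case it helps. 
It is only an illustration: the tiny ENGLISH_WORDS set is a hypothetical 
stand-in for a real spell checker dictionary, and the function names and 
the 0.5 threshold are assumptions of mine that you would have to tune on 
your own data.

```python
import re

# Hypothetical stand-in for a real spell checker dictionary.
# In practice this would be a full word list for the target language.
ENGLISH_WORDS = {
    "the", "a", "is", "this", "document", "written",
    "in", "english", "and", "of", "to", "it",
}


def known_word_ratio(text, dictionary):
    """Return the fraction of words in `text` that appear in `dictionary`."""
    # Runs of letters only; digits, underscores and punctuation are skipped.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    known = sum(1 for word in words if word in dictionary)
    return known / len(words)


def looks_like_language(text, dictionary, threshold=0.5):
    """Guess whether `text` is in the dictionary's language.

    The threshold is an assumed value, not a recommended one.
    """
    return known_word_ratio(text, dictionary) >= threshold
```

Calling looks_like_language(text, ENGLISH_WORDS) then gives a yes/no 
answer for one language. Short texts and texts full of proper names 
will score poorly, so the threshold should not be set too high.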

Best,

    Niels

-- 
Niels Ott
Computational Linguist (B.A.)
http://www.drni.de/niels/
