opennlp-dev mailing list archives

From James Kosin <james.ko...@gmail.com>
Subject Re: Corpora for OpenNLP
Date Thu, 19 Apr 2012 04:29:58 GMT
On 4/16/2012 12:54 PM, Prahalad Deshpande wrote:
> Hello,
>
> I have just started reading up on the OpenNLP project. Was curious to know
> if there is a set of corpora available for client libraries to use similar
> to the NLTK (www.nltk.org) project?
>
> Thank you
> -Prahalad
>
Prahalad,

We use many of the same corpora, and some additional ones.
Unfortunately, they are copyrighted texts, so commercial usage and
free distribution are discouraged or prohibited.

We are working on other routes to make freely distributable corpora
available for use in any application.

You can find example models here (for sample and research usage
only):
    http://opennlp.sourceforge.net/models-1.5/

The sources used for some of these models are copyrighted texts (mostly
news articles), so we can't freely distribute them. Some are the same
data that other projects use; we just don't distribute the sources
ourselves, but we can give links to where to get them.

James
