opennlp-dev mailing list archives

From "Mattmann, Chris A (3980)" <>
Subject Re: Question about OpenNLP and comparison to e.g., NLTK, Stanford NER, etc.
Date Thu, 12 Nov 2015 22:27:07 GMT
Yes, I think it’s critical that we also distribute models and have
e.g., things like brew packages and so forth, so they are easy
to install. Imagine:

# brew install opennlp --with-models

I’ll start working on that.


Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA

-----Original Message-----
From: Joern Kottmann <>
Reply-To: <>
Date: Thursday, November 12, 2015 at 5:22 PM
To: <>
Subject: Re: Question about OpenNLP and comparison to e.g., NLTK, Stanford
NER, etc.

>On Thu, 2015-11-12 at 15:43 +0000, Russ, Daniel (NIH/CIT) [E] wrote:
>> 1) I use the old SourceForge models.  I find that the sources of error
>> in my analysis are usually not due to mistakes in sentence detection or
>> POS tagging.  I don’t have the annotated data or the time/money to
>> build custom models.  Yes, the text I analyze is quite different from
>> the (WSJ? or whatever corpus was used to build the models), but it is
>> good enough. 
>That is interesting, I wasn't aware that those are still useful.
>It really depends on the component as well, I was mostly thinking about
>the name finder models when I wrote that.
>Do you only use the Sentence Detector, Tokenizer and POS tagger?
>You could use OntoNotes (almost for free) to train models. Maybe we
>should look into distributing models trained on OntoNotes.
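For context, using one of those pre-trained models follows OpenNLP's standard one-model-per-component API. A minimal sketch with the 1.x sentence detector; the model filename `en-sent.bin` is just an example placeholder for whichever SourceForge model file was downloaded:

```java
// Load a pre-trained sentence model and split a string into sentences.
// "en-sent.bin" is an assumed local path to a downloaded model file.
import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;

public class SentenceDemo {
    public static void main(String[] args) throws Exception {
        try (InputStream in = new FileInputStream("en-sent.bin")) {
            // Deserialize the model, then wrap it in a maxent-based detector.
            SentenceModel model = new SentenceModel(in);
            SentenceDetectorME detector = new SentenceDetectorME(model);
            String[] sentences = detector.sentDetect(
                "OpenNLP ships several components. Each needs its own model.");
            for (String s : sentences) {
                System.out.println(s);
            }
        }
    }
}
```

The tokenizer, POS tagger, and name finder follow the same pattern: load a serialized model, construct the corresponding `*ME` class around it.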
