opennlp-dev mailing list archives

From Jörn Kottmann
Subject Re: Stemmer
Date Thu, 18 Aug 2011 10:32:21 GMT
On 8/18/11 12:24 PM, Olivier Grisel wrote:
> Is this better or cover more languages than what's already provided by
> Apache Lucene? Maybe it should better be contributed to the Lucene
> project and make it easy to use the generic, battle tested Lucene
> analyzers / tokenizers infrastructure to generate features in OpenNLP.

None of the OpenNLP APIs are designed to work on token streams. Instead,
a user usually has to provide an entire sentence at once, so the Lucene
analyzer / tokenizer infrastructure is not a nice fit.
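
To illustrate the mismatch, here is a minimal Java sketch. All names in it
(ApiStyles, toyStem, labelPositions, bufferSentence) are hypothetical and
are not part of the OpenNLP or Lucene APIs, and the stemming rule is a toy.
It only shows that a per-token step can run on a stream, while a step that
depends on the whole sentence forces the stream to be buffered first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in OpenNLP or Lucene.
final class ApiStyles {

    // Per-token step: a stream pipeline can apply this to each token as it arrives.
    static String toyStem(String token) {
        // Toy rule (assumption for the example): strip a trailing "s" from longer tokens.
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Sentence-at-once step: each label depends on where the token sits
    // relative to the end of the sentence, so the full array is required.
    static String[] labelPositions(String[] sentence) {
        String[] labels = new String[sentence.length];
        for (int i = 0; i < sentence.length; i++) {
            if (i == 0) {
                labels[i] = "FIRST";
            } else if (i == sentence.length - 1) {
                labels[i] = "LAST";
            } else {
                labels[i] = "MID";
            }
        }
        return labels;
    }

    // Bridging a token stream to a sentence-at-once step means consuming
    // and buffering every token before the sentence-level call can be made.
    static String[] bufferSentence(Iterator<String> tokenStream) {
        List<String> buffer = new ArrayList<>();
        while (tokenStream.hasNext()) {
            buffer.add(toyStem(tokenStream.next()));
        }
        return buffer.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Iterator<String> stream = Arrays.asList("dogs", "chase", "cats").iterator();
        String[] sentence = bufferSentence(stream);
        System.out.println(Arrays.toString(sentence));
        System.out.println(Arrays.toString(labelPositions(sentence)));
    }
}
```

In this sketch the stemming step is happy with one token at a time, but the
labelling step cannot start until the last token has been seen, which is
the situation most sentence-level components are in.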

And since we are an NLP library, I believe it is absolutely fine to
implement our own stemming here.

