opennlp-dev mailing list archives

From Olivier Grisel <olivier.gri...@ensta.org>
Subject Re: Fixes that decrease existing model performance
Date Tue, 17 May 2011 10:29:50 GMT
2011/5/17 Jörn Kottmann <kottmann@gmail.com>:
> Hi all,
>
> I was wondering if we can do bug fixes which slightly decrease
> the performance of existing models?
>
> In this case I am speaking about OPENNLP-172, which fixes the handling
> of lower-case sequences in the token class feature. The feature detects a
> lower-case sequence only when it contains nothing but A to Z, but other
> languages have more letters, such as the German umlauts.
>
> This fix will decrease the recall of the existing Spanish person NER model
> by 2%. Should we apply it anyway for the next release?
>
> After retraining the recall goes up by 6%.

I am +1 for fixing bugs and providing retrained models for the next release.
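
To illustrate the kind of problem described in the quoted message, here is a
minimal sketch comparing an ASCII-only lower-case check with a Unicode-aware
one. It is not the actual OpenNLP code or the OPENNLP-172 patch; the class and
method names are hypothetical and only assume the behaviour described above.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not taken from OpenNLP.
public class LowerCaseCheckSketch {

    // ASCII-only pattern: accepts tokens made solely of the letters a to z.
    private static final Pattern ASCII_LOWER = Pattern.compile("[a-z]+");

    // Assumed "before" behaviour: letters outside a to z make the check fail.
    static boolean isAllLowerCaseAscii(String token) {
        return ASCII_LOWER.matcher(token).matches();
    }

    // Assumed "after" behaviour: any Unicode lower-case letter is accepted.
    static boolean isAllLowerCaseUnicode(String token) {
        if (token.isEmpty()) {
            return false;
        }
        return token.codePoints().allMatch(Character::isLowerCase);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its encoding:
        // \u00e4 is a-umlaut, \u00f1 is n-tilde.
        String[] tokens = {"haus", "h\u00e4user", "ni\u00f1o", "Madrid"};
        for (String token : tokens) {
            System.out.println(token
                    + " ascii=" + isAllLowerCaseAscii(token)
                    + " unicode=" + isAllLowerCaseUnicode(token));
        }
    }
}
```

A model trained with the first check has learned feature values that change
once the second check is used, which would explain why recall drops until the
model is retrained.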

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
