ctakes-dev mailing list archives

From Roberto Costumero Moreno <roberto.costum...@upm.es>
Subject Re: Training new models
Date Thu, 21 Nov 2013 10:00:07 GMT
I added the serialize call, but because the output is binary I couldn’t compare the actual results
with the examples.

I also created the Sentence Detector that way because I couldn’t find any proper documentation
on how the settings are managed or, more likely, I didn’t understand it.

Could you shed more light on how to use the settings so I can make the changes properly?

Thanks,

--
Roberto Costumero Moreno
Laboratorio de Minería de Datos y Simulación (MIDAS)
Centro de Tecnología Biomédica
Universidad Politecnica de Madrid
roberto.costumero@upm.es
Tlf: +34 91 336 4664

On 21/11/2013, at 09:57, Jörn Kottmann <kottmann@gmail.com> wrote:

> On 11/20/2013 09:53 PM, Chen, Pei wrote:
>> Re: https://issues.apache.org/jira/browse/CTAKES-268
>> Joern- could you confirm- I think in the latest OpenNLP versions, you can simply call
>> something like SentenceModel.serialize(outputstream) now to save the models?
> 
> Yes, exactly, this is how a model in OpenNLP should be serialized.
> 
> The code proposed in the Jira issue to save the model should really not be used.
> First, the API to instantiate a Sentence Detector from a model serialized in this way
> is deprecated and will be removed in the next version. Second, it creates a Sentence
> Detector which uses default settings, so if non-default settings (e.g. more EOS chars)
> were used during training, the settings don't match.
> 
> Jörn
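
For reference, the approach described in the quoted reply might look roughly like the sketch below. It assumes opennlp-tools is on the classpath; the class name SentenceModelRoundTrip, the helper names save and load, and the command-line arguments are made up for illustration and are not taken from the cTAKES code or the Jira issue. Since the serialized file is binary, one way to check it is to load it back and run it on sample text rather than comparing the bytes.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;

// Hypothetical helper class, for illustration only.
public class SentenceModelRoundTrip {

    // Writes an already trained model to a file via SentenceModel.serialize.
    static void save(SentenceModel model, Path target) throws IOException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            model.serialize(out);
        }
    }

    // Reads a serialized model back and builds a detector from it, so the
    // detector is configured from the model file rather than from defaults.
    static SentenceDetectorME load(Path source) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(source))) {
            return new SentenceDetectorME(new SentenceModel(in));
        }
    }

    // Usage (assumed arguments): <model file> <text to split>
    public static void main(String[] args) throws IOException {
        SentenceDetectorME detector = load(Paths.get(args[0]));
        for (String sentence : detector.sentDetect(args[1])) {
            System.out.println(sentence);
        }
    }
}
```

Running the loaded detector on the example sentences and comparing the printed output is a plain-text alternative to inspecting the binary model file.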

