ctakes-dev mailing list archives

From Roberto Costumero Moreno <roberto.costum...@upm.es>
Subject Re: Training new models
Date Mon, 25 Nov 2013 13:17:48 GMT
Just updated info and patch file in https://issues.apache.org/jira/browse/CTAKES-268

Please, take a look.

--
Roberto Costumero Moreno
Laboratorio de Minería de Datos y Simulación (MIDAS)
Centro de Tecnología Biomédica
Universidad Politecnica de Madrid
roberto.costumero@upm.es
Tlf: +34 91 336 4664

On 21/11/2013, at 11:00, Roberto Costumero Moreno <roberto.costumero@upm.es> wrote:

> I added the serialize call, but since the output is binary I couldn't compare the
> actual results with the examples.
> 
> Also, I created the Sentence Detector that way because I couldn't find any proper
> documentation on how the settings are managed or, more likely, I didn't understand it.
> 
> Could you shed more light on how to use the settings so I can make the changes properly?
> 
> Thanks,
> 
> --
> Roberto Costumero Moreno
> Laboratorio de Minería de Datos y Simulación (MIDAS)
> Centro de Tecnología Biomédica
> Universidad Politecnica de Madrid
> roberto.costumero@upm.es
> Tlf: +34 91 336 4664
> 
> On 21/11/2013, at 09:57, Jörn Kottmann <kottmann@gmail.com> wrote:
> 
>> On 11/20/2013 09:53 PM, Chen, Pei wrote:
>>> Re: https://issues.apache.org/jira/browse/CTAKES-268
>>> Joern, could you confirm? I think in the latest OpenNLP versions you can simply
>>> call something like SentenceModel.serialize(outputstream) now to save the models?
>> 
>> Yes, exactly, this is how a model in OpenNLP should be serialized.
>> 
>> The code proposed in the Jira issue to save the model should really not be used.
>> First, the API for instantiating a Sentence Detector from a model serialized this
>> way is deprecated and will be removed in the next version. Second, it creates a
>> Sentence Detector that uses default settings, so if non-default settings (e.g. more
>> EOS chars) were used during training, the settings won't match.
>> 
>> Jörn
> 
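
On comparing the binary output mentioned above: serialized OpenNLP models are, to my knowledge, zip packages, so two model files can be compared entry by entry instead of byte for byte. The sketch below is only an illustration under that assumption and uses nothing but the JDK. The class name ModelPackageDiff and its method names are hypothetical, and the entry names in the comments (a manifest plus a model artifact) are assumptions about the package layout, not something stated in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: lists which entries differ between two zip-packaged model files.
public class ModelPackageDiff {

    /** Reads every entry of a zip package into a sorted name -> bytes map. */
    static Map<String, byte[]> readEntries(InputStream in) throws IOException {
        Map<String, byte[]> entries = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                // read() returns -1 once the current entry is exhausted
                while ((n = zip.read(buffer)) != -1) {
                    content.write(buffer, 0, n);
                }
                entries.put(entry.getName(), content.toByteArray());
            }
        }
        return entries;
    }

    /** Returns the names of entries that are missing on one side or whose bytes differ. */
    static TreeSet<String> differingEntries(Map<String, byte[]> a, Map<String, byte[]> b) {
        TreeSet<String> names = new TreeSet<>(a.keySet());
        names.addAll(b.keySet());
        TreeSet<String> differing = new TreeSet<>();
        for (String name : names) {
            // A missing entry yields null, which never equals a non-null array
            if (!Arrays.equals(a.get(name), b.get(name))) {
                differing.add(name);
            }
        }
        return differing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ModelPackageDiff <model-a> <model-b>");
            return;
        }
        Map<String, byte[]> first;
        Map<String, byte[]> second;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            first = readEntries(in);
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[1]))) {
            second = readEntries(in);
        }
        // Entry names depend on the package; a manifest and a model artifact are assumed here
        for (String name : differingEntries(first, second)) {
            System.out.println("differs: " + name);
        }
    }
}
```

If the manifest entry records training-time details, it may differ between two runs even when the model artifact itself is identical, so a difference reported only for the manifest does not necessarily mean the trained models differ.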

