opennlp-dev mailing list archives

From Jairo Sarabia <jairo.sara...@appstylus.com>
Subject Re: Training tool doubt
Date Mon, 13 Feb 2012 17:13:29 GMT
Hello,

Responding to Jörn's questions:

- What training data do I use?
I created my own data file:
I wrote each Spanish sentence on a separate line, following Spanish grammar.
For example, I started a new sentence, on a new line, after each . , ? ! or :
character.
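
The splitting rule described above could be sketched roughly as follows. This is only an illustration: `split_segments` and `to_training_lines` are hypothetical helper names, and the regular expression is an assumption about how such a split might be done, not the script actually used to build es-sent.train.

```python
import re

# Assumption: a break is placed after each of the characters . , ? ! :
# whenever that character is followed by whitespace.
_BREAK = re.compile(r"(?<=[.,?!:])\s+")


def split_segments(text):
    """Return the non-empty segments of `text`, one per intended line."""
    return [part.strip() for part in _BREAK.split(text) if part.strip()]


def to_training_lines(text):
    """Join the segments with newlines, giving one segment per line."""
    return "\n".join(split_segments(text)) + "\n"


if __name__ == "__main__":
    print(to_training_lines("Hola, ¿cómo estás? Bien."), end="")
```

With this rule, the sample text is written out as three lines (`Hola,` then `¿cómo estás?` then `Bien.`), so a break after a comma puts a fragment on its own line rather than a complete sentence.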

- Which settings did I use?
I followed the steps at
https://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Sentence_Detector

Specifically I ran:

bin/opennlp SentenceDetectorTrainer -encoding UTF-8 -lang es -data
~/Documentos/es-sent.train -model es-Sent.bin -iterations 1000

for training, and

bin/opennlp SentenceDetectorEvaluator -encoding UTF-8 -model es-Sent.bin
-data ~/Documentos/es-sent.train

for evaluation.
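
For reference, precision and recall figures like the ones reported can be read as in the sketch below. It assumes that sentences are compared as exact (start, end) character spans; that is an assumption for illustration and not a description of what SentenceDetectorEvaluator does internally. The function name `precision_recall` and the sample spans are hypothetical.

```python
def precision_recall(reference, predicted):
    """Score predicted sentence spans against reference spans.

    Both arguments are iterables of (start, end) tuples. A predicted span
    counts as correct only if it matches a reference span exactly.
    """
    ref, pred = set(reference), set(predicted)
    hits = len(ref & pred)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


if __name__ == "__main__":
    reference = [(0, 10), (11, 25)]   # made-up gold sentence spans
    predicted = [(0, 10), (11, 20)]   # made-up detector output
    print(precision_recall(reference, predicted))  # (0.5, 0.5)
```

In this made-up case one of the two predicted spans is correct and one of the two reference spans is found, so both values come out at 0.5.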

I would be grateful if you could help me and tell me what I can improve.

Cheers,

Jairo


2012/2/11 Jörn Kottmann <kottmann@gmail.com>

> These numbers indicate that it doesn't work at all.
>
> Can you tell us a bit more?
>
> - What training data do you use?
> - Which settings did you use?
>
> Jörn
>
>
> On 02/10/2012 07:02 PM, Jairo Sarabia wrote:
>
>> Hello all!,
>>
>> I've trained a Spanish sentence detector with the Training Tool's
>> SentenceDetectorTrainer function.
>> But when I evaluate it with the EvaluatorTool's SentenceDetectorEvaluator,
>> it only returns 0.5 for precision and recall.
>> How can I improve the precision and the evaluation results in general?
>>
>> Thanks in advance!,
>>
>> Jairo Sarabia
>>
>>
>
