opennlp-users mailing list archives

From Nicolas Hernandez <nicolas.hernan...@gmail.com>
Subject UIMA TokenizerTrainer component : the model file is not created
Date Wed, 15 Jun 2011 14:46:51 GMT
Hello

Has anyone already used the UIMA TokenizerTrainer component? I am a
bit confused, since it does not create any model file.

On stdout I got this:
Indexing events using cutoff of 5
	Computing event counts...

done. 69669 events
	Indexing...  done.
Sorting and merging events... done. Reduced 69669 events to 16467.
Done indexing.
Incorporating indexed data for training...
done.
	Number of Event Tokens: 16467
	    Number of Outcomes: 1
	  Number of Predicates: 5624
...done.
Computing model parameters...
Performing 100 iterations.
  1:  .. loglikelihood=0.0	1.0
  2:  .. loglikelihood=0.0	1.0

This looks like a problem I had when I trained the model on the
command line without using the '<SPLIT>' tag. The command-line case
differs, though, since there I also got the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: The
maxent model is not compatible!

I solved that problem by adding the tag, as mentioned in the post
"maxent model is not compatible with Tokenizer training" (Fri, 13 May,
09:33):
 http://mail-archives.apache.org/mod_mbox/incubator-opennlp-users/201105.mbox/browser
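
To make the suspected cause easier to check, here is a minimal sketch that
counts how many training lines carry a '<SPLIT>' marker. It assumes the
training data is plain text with one sentence per line and '<SPLIT>' marking
token boundaries that are not indicated by whitespace. SplitTagCheck and
countSplitLines are hypothetical names for illustration; they are not part of
OpenNLP or of its UIMA integration.

```java
import java.util.List;

// Hypothetical helper, not part of OpenNLP: counts the training lines
// that contain at least one <SPLIT> marker.
class SplitTagCheck {
    static final String SPLIT_TAG = "<SPLIT>";

    // Returns the number of lines that contain the marker at least once.
    static long countSplitLines(List<String> lines) {
        return lines.stream()
                    .filter(line -> line.contains(SPLIT_TAG))
                    .count();
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only.
        List<String> sample = List.of(
            "Hello<SPLIT>, this is a test<SPLIT>.",
            "No split marker here");

        long withTag = countSplitLines(sample);
        System.out.println(withTag + " of " + sample.size()
            + " lines contain " + SPLIT_TAG);

        if (withTag == 0) {
            // Assumption: with no markers at all, training may see a
            // single outcome, as in the log above.
            System.out.println("Warning: no " + SPLIT_TAG + " markers found");
        }
    }
}
```

If the count is zero for the real training data, that would be consistent
with the "Number of Outcomes: 1" line in the log above, although it does not
prove that the UIMA component fails for the same reason.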

Does anyone know whether this is the same problem? If so, how can the
'<SPLIT>' tag be specified in the UIMA version? As far as I understand
its role, it is important to give the user the possibility of setting
it.

More generally, I am interested in any feedback from people who have
successfully built models with the UIMA OpenNLP *Trainer components.
So far I have also had some trouble with the SentenceTrainer, and I
have not tested the others.

/Nicolas


-- 
nicolas.hernandez@univ-nantes.fr
#
http://enicolashernandez.blogspot.com
http://www.univ-nantes.fr/hernandez-n
#
Laboratoire LINA-TALN CNRS UMR 6241
tel. +33 (0)2 51 12 58 55
#
Université de Nantes - Institut Universitaire de Technologie -
Département Informatique
tel. +33 (0)2 40 30 60 67
