opennlp-users mailing list archives

From Jörn Kottmann <kottm...@gmail.com>
Subject Re: will version 1.5.3 be announced here?
Date Wed, 04 Apr 2012 12:29:38 GMT
On 04/04/2012 02:21 PM, Jim - FooBar(); wrote:
> Hmm...it is not that simple! what about the maxent model? Let's say you
>
> --create a class that extends Evaluator  (as all evaluators do)
> --allow the constructor of that class to take variable number of 
> arguments (of type TokenNameFinder) so we can use them later on
> --pretty much copy the code from TokenNameFinderEvaluator and paste it 
> in the new class
> --make sure the code asks both (or however many) name-finders before 
> it classifies a prediction as right or wrong.
>
> Up to here everything is quite straight forward...now the problems 
> begin!!!
> The Dictionary has to be evaluated on separate data than the 
> model...That is because the Dictionary can only deal with  the 
> "default tag" and nothing else! A quick workaround would be to retrain 
> the maxent model with "default" tags so i can evaluate it on the 
> dictionary's test-set but then what about multiple types??? 

We could pass the DictionaryNameFinder the tag it should output; then 
it works with the data you give to the
evaluator.
You would run your tests twice, once with the maxent name finder and 
once with the dictionary name finder.
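
To illustrate why the output tag matters, here is a minimal, self-contained
sketch that does not use OpenNLP at all. Every name in it
(TypedDictionarySketch, TypedSpan, TypedDictionaryFinder, fMeasure) is
hypothetical and only stands in for the real classes; it assumes
single-token dictionary entries and exact matching on both span and type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these classes are part of OpenNLP.
class TypedDictionarySketch {

    // A found name: token range [start, end) plus a type label.
    static final class TypedSpan {
        final int start;
        final int end;
        final String type;

        TypedSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TypedSpan)) return false;
            TypedSpan s = (TypedSpan) o;
            return start == s.start && end == s.end && type.equals(s.type);
        }

        @Override
        public int hashCode() {
            return 31 * (31 * start + end) + type.hashCode();
        }
    }

    // Looks up single tokens in a set and labels every match
    // with the type passed to the constructor.
    static final class TypedDictionaryFinder {
        private final Set<String> entries;
        private final String type;

        TypedDictionaryFinder(Set<String> entries, String type) {
            this.entries = new HashSet<>(entries);
            this.type = type;
        }

        List<TypedSpan> find(String[] tokens) {
            List<TypedSpan> spans = new ArrayList<>();
            for (int i = 0; i < tokens.length; i++) {
                if (entries.contains(tokens[i])) {
                    spans.add(new TypedSpan(i, i + 1, type));
                }
            }
            return spans;
        }
    }

    // F-measure of predictions against the reference spans;
    // a prediction counts only if range AND type both match.
    static double fMeasure(List<TypedSpan> reference, List<TypedSpan> predicted) {
        Set<TypedSpan> ref = new HashSet<>(reference);
        int hits = 0;
        for (TypedSpan p : predicted) {
            if (ref.contains(p)) hits++;
        }
        if (hits == 0) return 0.0;
        double precision = (double) hits / predicted.size();
        double recall = (double) hits / reference.size();
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        String[] tokens = {"Alice", "met", "Bob"};
        List<TypedSpan> gold = Arrays.asList(
                new TypedSpan(0, 1, "person"), new TypedSpan(2, 3, "person"));
        Set<String> dict = new HashSet<>(Arrays.asList("Alice", "Bob"));

        // Same dictionary, same data; only the output tag differs.
        double asDefault = fMeasure(gold,
                new TypedDictionaryFinder(dict, "default").find(tokens));
        double asPerson = fMeasure(gold,
                new TypedDictionaryFinder(dict, "person").find(tokens));
        System.out.println("tagged 'default': " + asDefault);
        System.out.println("tagged 'person':  " + asPerson);
    }
}
```

With the "default" tag none of the spans match the typed test data, while
the same dictionary tagged "person" matches all of them, so the same test
set can serve both finders once the tag is configurable.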

Why do you want to extend the evaluators? It should work out of the box.

I would do it like this:
ObjectStream<NameSample> samples = ...;

// create NameFinderME
NameFinderME maxentNameFinder = ...;

TokenNameFinderEvaluator maxentEval =
    new TokenNameFinderEvaluator(maxentNameFinder);
maxentEval.evaluate(samples);
maxentEval.getFMeasure(); // here are your maxent results

samples.reset(); // rewind so the second run sees the same samples

DictionaryNameFinder dictNameFinder = ...;
TokenNameFinderEvaluator dictEval =
    new TokenNameFinderEvaluator(dictNameFinder);
dictEval.evaluate(samples);
dictEval.getFMeasure(); // here are your dictionary name finder results

Doesn't something more or less like this work?

Jörn
