Date: Wed, 04 Apr 2012 14:29:38 +0200
From: Jörn Kottmann
To: users@opennlp.apache.org
Subject: Re: will version 1.5.3 be announced here?

On 04/04/2012 02:21 PM, Jim - FooBar(); wrote:
> Hmm...it is not that simple! what about the maxent model? Let's say you
>
> --create a class that extends Evaluator (as all evaluators do)
> --allow the constructor of that class to take variable number of
> arguments (of type TokenNameFinder) so we can use them later on
> --pretty much copy the code from TokenNameFinderEvaluator and paste it
> in the new class
> --make sure the code asks both (or however many) name-finders before
> it classifies a prediction as right or wrong.
>
> Up to here everything is quite straight forward...now the problems
> begin!!!
> The Dictionary has to be evaluated on separate data than the
> model...That is because the Dictionary can only deal with the
> "default tag" and nothing else!
> A quick workaround would be to retrain
> the maxent model with "default" tags so i can evaluate it on the
> dictionary's test-set but then what about multiple types???

We could pass the DictionaryNameFinder a tag that it should output; then
it works with the data you give to the evaluator.

You would run your tests twice, once with the maxent name finder and once
with the dictionary name finder.

Why do you want to extend the evaluators? It should work out of the box.

I would do it like this:

    ObjectStream<NameSample> samples = ...;

    // create NameFinderME
    NameFinderME maxentNameFinder = ...;
    TokenNameFinderEvaluator maxentEval =
        new TokenNameFinderEvaluator(maxentNameFinder);
    maxentEval.evaluate(samples);
    maxentEval.getFMeasure(); // here are your maxent results

    // rewind the stream so the second evaluator sees the same samples
    samples.reset();

    DictionaryNameFinder dictNameFinder = ...;
    TokenNameFinderEvaluator dictEval =
        new TokenNameFinderEvaluator(dictNameFinder);
    dictEval.evaluate(samples);
    dictEval.getFMeasure(); // here are your dictionary name finder results

Doesn't something more or less like this work?

Jörn
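To make the "pass a tag to the dictionary name finder" idea concrete, here is a minimal, self-contained sketch. It does not use OpenNLP: TypedSpan, TypedDictionaryFinder and SpanScore are hypothetical stand-ins invented for illustration, the lookup handles single tokens only, and the score is a plain exact-match F1 that may differ in detail from what TokenNameFinderEvaluator computes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a typed name span (not an OpenNLP class).
final class TypedSpan {
    final int start;   // index of the first token in the span
    final int end;     // index one past the last token
    final String type; // entity type, e.g. "drug" or "default"

    TypedSpan(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TypedSpan)) return false;
        TypedSpan other = (TypedSpan) o;
        return start == other.start && end == other.end && type.equals(other.type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(start, end, type);
    }
}

// Hypothetical dictionary finder whose output type is chosen by the caller.
final class TypedDictionaryFinder {
    private final Set<String> entries;
    private final String type;

    TypedDictionaryFinder(Set<String> entries, String type) {
        this.entries = entries;
        this.type = type;
    }

    // Single-token lookup only, to keep the sketch short.
    List<TypedSpan> find(String[] tokens) {
        List<TypedSpan> spans = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (entries.contains(tokens[i])) {
                spans.add(new TypedSpan(i, i + 1, type));
            }
        }
        return spans;
    }
}

// Exact-match F1: a prediction counts only if boundaries AND type agree.
final class SpanScore {
    static double fMeasure(List<TypedSpan> reference, List<TypedSpan> predicted) {
        Set<TypedSpan> ref = new HashSet<>(reference);
        int correct = 0;
        for (TypedSpan p : predicted) {
            if (ref.contains(p)) correct++;
        }
        if (correct == 0) return 0.0;
        double precision = (double) correct / predicted.size();
        double recall = (double) correct / reference.size();
        return 2 * precision * recall / (precision + recall);
    }
}
```

With the tokens "Aspirin helps Jim", reference spans (0,1,"drug") and (2,3,"person"), and a dictionary containing only "Aspirin", a finder constructed with the type "drug" scores precision 1.0 and recall 0.5. The same dictionary constructed with "default" scores 0, because the type no longer matches the test data. That mismatch is why letting the caller choose the output tag allows both finders to be evaluated on the same samples.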