To: user@uima.apache.org
From: Nick Kolivas
Subject: Re: UIMA Dictionary Annotator help
Date: Wed, 6 Mar 2013 09:47:48 +0000 (UTC)
References: <67BFB0CF-B1EC-428C-A8D7-B7834EDF5601@ukp.informatik.tu-darmstadt.de>

Richard Eckart de Castilho writes:

> Hello Nick,
>
> we have a much simpler DictionaryAnnotator in DKPro Core which might serve as a starting
> point for writing your own dictionary annotator:
>
> http://code.google.com/p/dkpro-core-asl/source/browse/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.dictionaryannotator-asl/src/main/java/de/tudarmstadt/ukp/dkpro/core/dictionaryannotator/DictionaryAnnotator.java
>
> Cheers,

Good morning Richard,

Thank you for the link! I had some second thoughts yesterday about how to proceed with my annotator, and I would like to share them with you.

What I need to do is match a recognised text against a database table containing names. For example, given the recognition "My name is Nick", I want my annotator to compare the recognised text with the database and try to find a match. If "Nick" exists in the database, we have a match, and the annotator returns a name annotation. This annotator will run at an early stage of the UIMA pipeline.

My problem is how to create it. Should I use a dictionary annotator, or a general annotator with some regex that just scans the database? Is it possible for the dictionary annotator to scan a database table as if it were a dictionary?

Thank you both for your answers.

With regards,
Nick
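
A minimal sketch of the matching step described above, in plain Java with no UIMA or JDBC dependency. It assumes the names are read from the database table once (for instance when the annotator is initialised) and held in memory, so the text is checked against a set rather than queried word by word. The class name `NameMatcher`, the `Span` type, and the single-word, case-insensitive matching are illustrative assumptions, not part of UIMA or DKPro Core.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: matches words of a text against names loaded from a database table.
final class NameMatcher {

    /** Character offsets of one match; an annotator could turn these into a name annotation. */
    static final class Span {
        final int begin;
        final int end;
        final String text;

        Span(int begin, int end, String text) {
            this.begin = begin;
            this.end = end;
            this.text = text;
        }
    }

    // A "word" here is a run of letters; this is a simplification, not a real tokenizer.
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    private final Set<String> names = new HashSet<>();

    /**
     * @param namesFromDatabase the contents of the name column, assumed to have been
     *                          read from the table beforehand (e.g. via JDBC)
     */
    NameMatcher(Iterable<String> namesFromDatabase) {
        for (String name : namesFromDatabase) {
            names.add(name.toLowerCase(Locale.ROOT));
        }
    }

    /** Returns the offsets of every word in the text that appears in the name set. */
    List<Span> find(String text) {
        List<Span> spans = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            if (names.contains(m.group().toLowerCase(Locale.ROOT))) {
                spans.add(new Span(m.start(), m.end(), m.group()));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // The list stands in for rows fetched from the database.
        NameMatcher matcher = new NameMatcher(Arrays.asList("Nick", "Maria"));
        for (Span s : matcher.find("My name is Nick")) {
            System.out.println(s.text + " [" + s.begin + "," + s.end + ")");
        }
    }
}
```

Inside a UIMA annotator, each returned span's `begin` and `end` would supply the offsets for a name annotation. Multi-word names would need a different lookup than this word-by-word check.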