Message-ID: <5cdd31570807161055v15cf224cg46a602d0e81a3bfd@mail.gmail.com>
Date: Wed, 16 Jul 2008 13:55:24 -0400
From: "Ahmed Abdeen Hamed"
To: uima-user@incubator.apache.org
Subject: Re: ConceptMapper POS
In-Reply-To: <487E3572.3030503@gmx.de>
References: <5cdd31570807161007u6cf8c65br7b0f25819e5fce86@mail.gmail.com>
 <8E64C9D1-45C1-42AC-BB43-35E54B78C5A0@park-slope.net>
 <5cdd31570807161049x7dac73dfh775276f4dc2e6598@mail.gmail.com>
 <487E3572.3030503@gmx.de>

That's awesome :) Thanks Thilo!

Ahmed

On Wed, Jul 16, 2008 at 1:52 PM, Thilo Goetz wrote:

> Ahmed Abdeen Hamed wrote:
>
>> Thanks Michael. I like the idea of attaching the POS to dictionary terms.
>> What POS tagger are you using then? Is it the Stanford or LingPipe? I
>> doubt that UIMA has a native POS-tagger.
>>
>> Ahmed
>
> Haha, but we do :-) Check out the Tagger project in the sandbox, an
> HMM-based POS tagger.
>
>> On Wed, Jul 16, 2008 at 1:24 PM, Michael Tanenblatt
>> <slothrop@park-slope.net> wrote:
>>
>>> ConceptMapper maps entries in the dictionary to new annotations using the
>>> AE descriptor parameters "AttributeList" and "FeatureList". From the
>>> comments in the descriptor:
>>>
>>> AttributeList: List of attribute names for XML dictionary entry record -
>>> must correspond to FeatureList
>>>
>>> FeatureList: List of feature names for CAS annotation - must correspond
>>> to AttributeList
>>>
>>> In other words, these are two parallel arrays mapping from the attributes
>>> in the dictionary entries to the new annotation features. So, if your
>>> dictionary entries had attributes named "POS_Tag", e.g.:
>>>
>>>
>>>
>>> and the resultant annotations had the feature "PartOfSpeechTag", the
>>> parameter "AttributeList" (an array) would have "POS_Tag" at the same
>>> position (array index) as the parameter "FeatureList" would have
>>> "PartOfSpeechTag".
>>>
>>> One key piece of information: ConceptMapper does not do any POS tagging,
>>> it only maps from the dictionary. In some cases, I have run a
>>> tokenizer/POS-tagger, then used this technique to unconditionally
>>> override the computed POS tag in the token using the
>>> TokenClassWriteBackFeatureNames parameter. This allows any attributes
>>> from the dictionary to be stuffed back into all of the matching tokens,
>>> which can sometimes be useful...
>>>
>>> TokenClassWriteBackFeatureNames: names of features that should be written
>>> back to a token, such as a POS tag
>>>
>>> On Jul 16, 2008, at 1:07 PM, Ahmed Abdeen Hamed wrote:
>>>
>>>> Hello,
>>>>
>>>> TokenAnnotation objects don't get fully populated with data after
>>>> annotation. For instance, the POS feature returns null when printing
>>>> out an annotation object. Apparently, this feature needs to be set
>>>> while doing the annotation. How does ConceptMapper do the POS tagging?
>>>> I appreciate any insights!
>>>>
>>>> Best wishes,
>>>> Ahmed
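
For readers following the thread, the parallel-array mapping Michael describes can be sketched roughly as below. This is plain Python for illustration only, not the UIMA or ConceptMapper API: the function names `map_entry_to_features` and `write_back`, the `canonical` attribute, and the tag values "NN" and "VB" are all hypothetical. Only the parameter roles (AttributeList, FeatureList, TokenClassWriteBackFeatureNames) and the names "POS_Tag" and "PartOfSpeechTag" come from the message.

```python
# Illustrative sketch only: plain Python, not the UIMA or ConceptMapper API.
# These two lists play the role of the "AttributeList" and "FeatureList"
# descriptor parameters; entries at the same index belong together.
attribute_list = ["POS_Tag"]          # attribute names in the dictionary entry
feature_list = ["PartOfSpeechTag"]    # feature names on the new annotation


def map_entry_to_features(entry_attributes, attribute_list, feature_list):
    """Copy dictionary-entry attributes to annotation features by array position."""
    if len(attribute_list) != len(feature_list):
        raise ValueError("AttributeList and FeatureList must have the same length")
    features = {}
    for attribute_name, feature_name in zip(attribute_list, feature_list):
        # Attributes missing from the entry are simply skipped.
        if attribute_name in entry_attributes:
            features[feature_name] = entry_attributes[attribute_name]
    return features


def write_back(token_features, annotation_features, write_back_names):
    """Unconditionally overwrite the named token features with dictionary values.

    Mimics the role of "TokenClassWriteBackFeatureNames" as described in the thread.
    """
    updated = dict(token_features)  # leave the caller's dict untouched
    for name in write_back_names:
        if name in annotation_features:
            updated[name] = annotation_features[name]
    return updated


# Hypothetical dictionary entry; "canonical", "NN", and "VB" are made-up values.
entry = {"canonical": "example term", "POS_Tag": "NN"}
annotation = map_entry_to_features(entry, attribute_list, feature_list)

# A token whose POS tag was computed by some tagger, then overridden.
token = write_back({"PartOfSpeechTag": "VB"}, annotation, ["PartOfSpeechTag"])
print(annotation)
print(token)
```

The point of the sketch is that nothing here computes a part of speech: the value is copied from the dictionary entry, and the write-back step replaces whatever an upstream tagger assigned to the matching token.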