uima-user mailing list archives

From Michael Tanenblatt <sloth...@park-slope.net>
Subject Re: ConceptMapper POS
Date Wed, 16 Jul 2008 18:26:23 GMT
I'm not sure where I got the OpenNLP annotator from, but a web search
should turn it up. The tagger in the sandbox that Thilo pointed out
is probably a good alternative. As for LanguageWare, it is a product,
not an open source project.


On Jul 16, 2008, at 2:20 PM, Ahmed Abdeen Hamed wrote:

> Can you point me to the source for those UIMA annotators? I would like to
> use one of them for a really simple task. Thanks again!
> Ahmed
>
> On Wed, Jul 16, 2008 at 1:56 PM, Michael Tanenblatt <slothrop@park-slope.net> wrote:
>
>> I have used the OpenNLP tagger as well as the IBM LanguageWare product,
>> both of which are available as UIMA annotators.
>>
>>
>>
>> On Jul 16, 2008, at 1:49 PM, Ahmed Abdeen Hamed wrote:
>>
>>> Thanks Michael. I like the idea of attaching the POS to dictionary terms.
>>> What POS tagger are you using then? Is it the Stanford or LingPipe tagger?
>>> I doubt that UIMA has a native POS tagger.
>>> Ahmed
>>>
>>> On Wed, Jul 16, 2008 at 1:24 PM, Michael Tanenblatt <slothrop@park-slope.net> wrote:
>>>
>>>> ConceptMapper maps entries in the dictionary to new annotations using the
>>>> AE descriptor parameters "AttributeList" and "FeatureList". From the
>>>> comments in the descriptor:
>>>>
>>>> AttributeList: List of attribute names for XML dictionary entry record -
>>>> must correspond to FeatureList
>>>>
>>>> FeatureList: List of feature names for CAS annotation - must correspond
>>>> to AttributeList
>>>>
>>>> In other words, these are two parallel arrays mapping from the attributes
>>>> in the dictionary entries to the new annotation features. So, if your
>>>> dictionary entries had an attribute named "POS_Tag", e.g.:
>>>>
>>>> <token canonical="abdomen, nos" POS_Tag="NN">
>>>> <variant base="abdomen, nos" />
>>>> <variant base="abdomen" />
>>>> </token>
>>>>
>>>> and the resultant annotations had the feature "PartOfSpeechTag", the
>>>> parameter "AttributeList" (an array) would have "POS_Tag" at the same
>>>> position (array index) as the parameter "FeatureList" would have
>>>> "PartOfSpeechTag".
>>>>
>>>> One key piece of information: ConceptMapper does not do any POS tagging;
>>>> it only maps from the dictionary. In some cases, I have run a
>>>> tokenizer/POS-tagger, then used this technique to unconditionally override
>>>> the computed POS tag in the token using the
>>>> TokenClassWriteBackFeatureNames parameter. This allows any attributes
>>>> from the dictionary to be stuffed back into all of the matching tokens,
>>>> which can sometimes be useful...
>>>>
>>>> TokenClassWriteBackFeatureNames: names of features that should be written
>>>> back to a token, such as a POS tag
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Jul 16, 2008, at 1:07 PM, Ahmed Abdeen Hamed wrote:
>>>>
>>>>> Hello,
>>>>> TokenAnnotation objects don't get fully populated with data after
>>>>> annotation. For instance, the POS feature returns null when printing out
>>>>> an annotation object. Apparently, this feature needs to be set while
>>>>> doing the annotation. How does ConceptMapper do the POS tagging? I
>>>>> appreciate any insights!
>>>>> Best wishes,
>>>>> Ahmed
>>>>>
>>>>>
>>>>
>>>>
>>
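
The parallel-array mapping described in the quoted message can be sketched
roughly as follows. This is not ConceptMapper code (ConceptMapper is a Java
UIMA annotator); it is a small Python illustration of the idea only. The
function names, the plain-dict "tokens", and the "CanonicalForm" feature name
are assumptions made up for the example. Only "AttributeList", "FeatureList",
"TokenClassWriteBackFeatureNames", "POS_Tag", "PartOfSpeechTag", and the
dictionary entry itself come from the thread.

```python
import xml.etree.ElementTree as ET

# Dictionary entry taken from the message above.
DICTIONARY_ENTRY = """
<token canonical="abdomen, nos" POS_Tag="NN">
  <variant base="abdomen, nos" />
  <variant base="abdomen" />
</token>
"""

# Parallel arrays: index i of ATTRIBUTE_LIST pairs with index i of FEATURE_LIST.
# "CanonicalForm" is a hypothetical feature name used only for illustration.
ATTRIBUTE_LIST = ["canonical", "POS_Tag"]
FEATURE_LIST = ["CanonicalForm", "PartOfSpeechTag"]


def map_entry_to_features(entry_xml, attribute_list, feature_list):
    """Copy dictionary attributes into annotation features by array position."""
    if len(attribute_list) != len(feature_list):
        raise ValueError("AttributeList and FeatureList must have the same length")
    entry = ET.fromstring(entry_xml.strip())
    features = {}
    for attribute, feature in zip(attribute_list, feature_list):
        value = entry.get(attribute)
        if value is not None:
            features[feature] = value
    return features


def write_back(tokens, features, write_back_names):
    """Unconditionally overwrite the named features on every matching token."""
    for token in tokens:
        for name in write_back_names:
            if name in features:
                token[name] = features[name]
    return tokens


annotation = map_entry_to_features(DICTIONARY_ENTRY, ATTRIBUTE_LIST, FEATURE_LIST)

# A token whose POS tag was computed by some upstream tagger (value invented here).
tokens = [{"text": "abdomen", "PartOfSpeechTag": "JJ"}]
write_back(tokens, annotation, ["PartOfSpeechTag"])
```

In this sketch the upstream tag on the matching token is replaced by the
dictionary's "NN", which mirrors the unconditional override described for
TokenClassWriteBackFeatureNames. How ConceptMapper actually decides which
tokens match is not modeled here.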

