uima-user mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: ConceptMapper POS
Date Wed, 16 Jul 2008 18:35:39 GMT
Michael Tanenblatt wrote:
> Not sure where I got the OpenNLP annotator from, but you can probably 
> google it to find it. The tagger in the sandbox that Thilo pointed out 
> is probably a good alternative. As to LanguageWare, that is a product, 
> not an open source project.
Some parts of LanguageWare are available for a trial period from 
http://alphaworks.ibm.com/tech/lrw

-Marshall
>
>
> On Jul 16, 2008, at 2:20 PM, Ahmed Abdeen Hamed wrote:
>
>> Can you point me to the source for those UIMA annotators? I would like to
>> use one of them for a really simple task. Thanks again!
>> Ahmed
>>
>> On Wed, Jul 16, 2008 at 1:56 PM, Michael Tanenblatt 
>> <slothrop@park-slope.net>
>> wrote:
>>
>>> I have used the OpenNLP tagger as well as the IBM LanguageWare product,
>>> both of which are available as UIMA annotators.
>>>
>>>
>>>
>>> On Jul 16, 2008, at 1:49 PM, Ahmed Abdeen Hamed wrote:
>>>
>>>> Thanks Michael. I like the idea of attaching the POS to dictionary terms.
>>>> What POS tagger are you using then? Is it the Stanford tagger or LingPipe?
>>>> I doubt that UIMA has a native POS tagger.
>>>> Ahmed
>>>>
>>>> On Wed, Jul 16, 2008 at 1:24 PM, Michael Tanenblatt <
>>>> slothrop@park-slope.net>
>>>> wrote:
>>>>
>>>>> ConceptMapper maps entries in the dictionary to new annotations using the
>>>>> AE descriptor parameters "AttributeList" and "FeatureList". From the
>>>>> comments in the descriptor:
>>>>>
>>>>> AttributeList: List of attribute names for XML dictionary entry record -
>>>>> must correspond to FeatureList
>>>>>
>>>>> FeatureList: List of feature names for CAS annotation - must correspond to
>>>>> AttributeList
>>>>>
>>>>> In other words, these are two parallel arrays mapping from the attributes
>>>>> in the dictionary entries to the new annotation features. So, if your
>>>>> dictionary entries had an attribute named "POS_Tag", e.g.:
>>>>>
>>>>> <token canonical="abdomen, nos" POS_Tag="NN">
>>>>>   <variant base="abdomen, nos" />
>>>>>   <variant base="abdomen" />
>>>>> </token>
>>>>>
>>>>> and the resultant annotations had the feature "PartOfSpeechTag", the
>>>>> parameter "AttributeList" (an array) would have "POS_Tag" at the same
>>>>> position (array index) as the parameter "FeatureList" would have
>>>>> "PartOfSpeechTag".
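>>>>>
>>>>> As a rough sketch only (not copied from the shipped descriptor), the two
>>>>> parameters might be set along these lines. The nameValuePair/array layout
>>>>> is the usual UIMA descriptor syntax; the "canonical"/"DictCanon" pair is
>>>>> a hypothetical second entry, added just to show that positions line up:
>>>>>
>>>>> ```xml
>>>>> <configurationParameterSettings>
>>>>>   <nameValuePair>
>>>>>     <name>AttributeList</name>
>>>>>     <value>
>>>>>       <array>
>>>>>         <!-- index 0: dictionary attribute (pairing is hypothetical) -->
>>>>>         <string>canonical</string>
>>>>>         <!-- index 1: dictionary attribute from the example above -->
>>>>>         <string>POS_Tag</string>
>>>>>       </array>
>>>>>     </value>
>>>>>   </nameValuePair>
>>>>>   <nameValuePair>
>>>>>     <name>FeatureList</name>
>>>>>     <value>
>>>>>       <array>
>>>>>         <!-- index 0: hypothetical feature receiving "canonical" -->
>>>>>         <string>DictCanon</string>
>>>>>         <!-- index 1: feature receiving "POS_Tag" -->
>>>>>         <string>PartOfSpeechTag</string>
>>>>>       </array>
>>>>>     </value>
>>>>>   </nameValuePair>
>>>>> </configurationParameterSettings>
>>>>> ```
>>>>>
>>>>> Both arrays need the same length and order, since an attribute is paired
>>>>> with a feature purely by its position.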
>>>>>
>>>>> One key piece of information: ConceptMapper does not do any POS tagging;
>>>>> it only maps from the dictionary. In some cases, I have run a
>>>>> tokenizer/POS-tagger, then used this technique to unconditionally override
>>>>> the computed POS tag in the token using the TokenClassWriteBackFeatureNames
>>>>> parameter. This allows any attributes from the dictionary to be stuffed
>>>>> back into all of the matching tokens, which can sometimes be useful...
>>>>>
>>>>> TokenClassWriteBackFeatureNames: names of features that should be written
>>>>> back to a token, such as a POS tag
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Jul 16, 2008, at 1:07 PM, Ahmed Abdeen Hamed wrote:
>>>>>
>>>>>> Hello,
>>>>>> TokenAnnotation objects don't get fully populated with data after
>>>>>> annotation. For instance, the POS feature returns null when printing out
>>>>>> an annotation object. Apparently, this feature needs to be set while doing
>>>>>> the annotation. How does ConceptMapper do the POS tagging? I appreciate
>>>>>> any insights!
>>>>>> Best wishes,
>>>>>> Ahmed
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>
>
>
>

