uima-user mailing list archives

From "J. William Murdock" <b...@murdocks.org>
Subject Re: ConceptMapper: Dictionary application to certain parts of a document
Date Mon, 07 Jul 2008 18:45:43 GMT
Here is a solution that may be a bit inefficient, but fits well into the 
framework.  Produce an aggregate with three elements:

1) An annotator that takes the text you want to analyze (e.g., 
abstractAnnotation.getCoveredText()) and copies it into a new sofa.

2) ConceptMapper; your main aggregate should specify a sofa mapping to 
make it run on the sofa that the first annotator created.

3) An annotator that copies all of the annotations from the sofa that 
the first annotator created to the default sofa.  When copying the 
annotations, it adds the begin offset of the text that was analyzed 
(e.g., abstractAnnotation.getBegin()) to each annotation's begin and 
end offsets, so that they line up with the original document.
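
The offset arithmetic behind steps 1 and 3 can be sketched in plain Java. 
This is only an illustration under stated assumptions, not UIMA code: 
Span, coveredText, lookup and shiftToDocument are hypothetical names, 
the toy lookup merely stands in for ConceptMapper, and a plain String 
stands in for the new sofa.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SofaOffsetSketch {

    /** Hypothetical stand-in for an annotation: a label plus begin/end offsets. */
    static final class Span {
        final String label;
        final int begin;
        final int end;

        Span(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Step 1: copy the text covered by the region (what would go into the new sofa). */
    static String coveredText(String document, Span region) {
        return document.substring(region.begin, region.end);
    }

    /** Step 2 stand-in: toy dictionary lookup; offsets are relative to viewText. */
    static List<Span> lookup(String viewText, List<String> dictionary) {
        List<Span> hits = new ArrayList<>();
        for (String term : dictionary) {
            int from = 0;
            int at;
            while ((at = viewText.indexOf(term, from)) >= 0) {
                hits.add(new Span(term, at, at + term.length()));
                from = at + term.length();
            }
        }
        return hits;
    }

    /** Step 3: copy the hits back, adding the region's begin offset to begin and end. */
    static List<Span> shiftToDocument(List<Span> viewHits, Span region) {
        List<Span> shifted = new ArrayList<>();
        for (Span hit : viewHits) {
            shifted.add(new Span(hit.label, hit.begin + region.begin, hit.end + region.begin));
        }
        return shifted;
    }

    public static void main(String[] args) {
        String document = "protein notes | abstract: the protein binds";
        // Assumption: an earlier annotator has already marked the abstract region.
        int abstractBegin = document.indexOf("the protein");
        Span abstractRegion = new Span("abstract", abstractBegin, document.length());

        String viewText = coveredText(document, abstractRegion);
        List<Span> viewHits = lookup(viewText, Arrays.asList("protein"));
        for (Span hit : shiftToDocument(viewHits, abstractRegion)) {
            System.out.println(hit.label + " [" + hit.begin + ", " + hit.end + ")");
        }
    }
}
```

Because the lookup only ever sees the copied text, the "protein" in the 
title part is never matched, and the shift in step 3 is what makes the 
remaining match point at the right characters in the original document.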

Bill Murdock, PhD
UIMA User (but not a UIMA developer or official spokesperson)
IBM Watson Research Center
19 Skyline Dr., Hawthorne, NY  10532  USA

Ahmed Abdeen Hamed wrote:
> Hi David,
> Thank you for your response. I actually wrote annotators that find useful
> things. Is there a way you can get access to those annotators from your
> aggregate analysis engine that get produced by UIMAFramework? I could do a
> work around and only pass the text that I am interested in parsing. However,
> my solution is required to be within the UIMA framework.
> Thanks again!
> Ahmed
> On Mon, Jul 7, 2008 at 2:11 PM, David Buttler <buttler1@llnl.gov> wrote:
>> This seems very straight-forward to me.  My approach may not be the most
>> efficient, but I would
>> 1) write a wrapper around the ConceptMapper code so that you only pass it
>> spans of text that you would find useful. 2) write a post processing filter
>> that throws away any tag that occurs in a region of the text that you think
>> is inappropriate (e.g. if you do not want to tag a verb)
>> All of this would most easily be put into a single processing component so
>> you don't have unwanted annotations in your CAS
>> Dave
>> Ahmed Abdeen Hamed wrote:
>>> Hello,
>>> I have a quick question about the ConceptMapper project. How can I apply
>>> dictionary terms to a certain part of a document? For example, if you have
>>> documents that have titles and abstracts and you need only to find terms
>>> that appear in the abstract not the title, how do you do that? Also, if you
>>> would like to apply a filter such as detecting a certain POS like names vs
>>> verbs. How would you approach this problem? Are there examples that I can
>>> take a look at? Please let me know if you have an answer for me.
>>> Thanks in advance!
>>> Ahmed
