uima-user mailing list archives

From Frank Schilder <frank.schil...@thomsonreuters.com>
Subject Re: Entities
Date Wed, 04 Mar 2009 14:50:03 GMT
Hi Tommaso,

We defined such a type, called Element, but it is not a sub-type of
Annotation, because it doesn't contain any begin and end offset
information. Moreover, it contains an attribute that refers back to a list
of the annotations where this element is mentioned in the text.

Terry Heinze, Marc Light and Frank Schilder (2008). Experiences with UIMA
for online information extraction at Thomson Corporation. In Proceedings of
the LREC workshop "Towards Enhanced Interoperability for large HLT systems:
UIMA for NLP", Marrakesh, Morocco.

The paper can be found in the proceedings:

There are also slides available (go to slide 11 and 12):


> From: Tommaso Teofili <tommaso.teofili@gmail.com>
> Reply-To: <uima-user@incubator.apache.org>
> Date: Wed, 4 Mar 2009 12:21:45 +0100
> To: <uima-user@incubator.apache.org>
> Subject: Re: Entities
> Ok, thanks. Done, it works now.
> I think this could be an interesting predefined feature, as this usage is
> mentioned in the documentation too.
> What do you think about it?
> 2009/3/3 Marshall Schor <msa@schor.com>
>> There is no predefined Entity type in base UIMA; you will need to define
>> your own "entity" type.  Suppose it is called "EntityInstance", is a
>> subtype of Annotation, and includes a field called "id", which is some
>> unique ID for this entity (perhaps a String type).  Then, you can have
>> an annotator that runs at the end of your pipeline of annotators which
>> detects instances of entities (I'm assuming you have multiple annotators
>> that do this, of course).  This last annotator could get an iterator
>> over the index of all things of the "EntityInstance" type, and use a standard
>> Java hashmap to associate entity unique IDs with Java ArrayLists of
>> their "instances".  Then, you could make one new Feature Structure, say
>> of type "Entity", which could have features "uniqueID" and "instances",
>> and set the "instances" to a FeatureStructure Array of EntityInstances.
>> HTH. -Marshall
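
The grouping step Marshall describes can be sketched roughly as follows. This
is only an illustration in plain Java, not UIMA code: EntityInstance and Entity
here are hypothetical stand-in classes for the user-defined types he suggests,
and groupById is a made-up helper name. In a real annotator the instances would
come from the CAS annotation index and the "instances" feature would be set to
a FeatureStructure array rather than a java.util.List.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EntityGrouping {

    // Hypothetical stand-in for an "EntityInstance" annotation:
    // a text span (begin/end offsets) plus the unique id of its entity.
    static final class EntityInstance {
        final String id;
        final int begin;
        final int end;

        EntityInstance(String id, int begin, int end) {
            this.id = id;
            this.begin = begin;
            this.end = end;
        }
    }

    // Hypothetical stand-in for the "Entity" feature structure:
    // it has no offsets of its own, only references to its instances.
    static final class Entity {
        final String uniqueID;
        final List<EntityInstance> instances;

        Entity(String uniqueID, List<EntityInstance> instances) {
            this.uniqueID = uniqueID;
            this.instances = instances;
        }
    }

    // Associate each unique id with the list of its instances,
    // then build one Entity per id (first-seen order is preserved).
    static List<Entity> groupById(List<EntityInstance> all) {
        Map<String, List<EntityInstance>> byId = new LinkedHashMap<>();
        for (EntityInstance inst : all) {
            byId.computeIfAbsent(inst.id, k -> new ArrayList<>()).add(inst);
        }
        List<Entity> entities = new ArrayList<>();
        for (Map.Entry<String, List<EntityInstance>> e : byId.entrySet()) {
            entities.add(new Entity(e.getKey(), e.getValue()));
        }
        return entities;
    }

    public static void main(String[] args) {
        List<EntityInstance> found = new ArrayList<>();
        found.add(new EntityInstance("e1", 0, 5));
        found.add(new EntityInstance("e2", 10, 14));
        found.add(new EntityInstance("e1", 20, 22));
        for (Entity entity : groupById(found)) {
            System.out.println(entity.uniqueID + " -> "
                    + entity.instances.size() + " instance(s)");
        }
    }
}
```

Keeping Entity free of begin/end offsets mirrors the point that an entity is
not itself a span of text; only its instances are.
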
>> Tommaso Teofili wrote:
>>> Hello everybody,
>>> I am annotating a document text and I have now a lot of annotations.
>>> Many of those annotations refer to the same "entity", as described in the
>>> UIMA Overview & SDK Setup
>>> (http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/overview_and_setup/overview_and_setup.html#ugr.ovv.conceptual.metadata_in_cas).
>>> I expected to have a predefined Entity type in UIMA but I cannot find it;
>>> moreover, even when defining it myself, I can't find an appropriate range type
>>> for the "occurencies" feature to store the annotations related to that
>>> entity, as stated in the tutorial.
>>> Any suggestions?
>>> Thanks in advance,
>>> Tommaso
