uima-user mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: Entities
Date Wed, 04 Mar 2009 11:21:45 GMT
Ok, thanks. Done, it works now.
I think a predefined Entity type could be a useful addition to UIMA, since
this usage is mentioned in the documentation too.
What do you think?

2009/3/3 Marshall Schor <msa@schor.com>

> There is no predefined Entity type in base UIMA; you will need to define
> your own "entity" type.  Suppose it is called "EntityInstance", is a
> subtype of Annotation, and includes a feature called "id", which is some
> unique ID for this entity (perhaps a String type).  Then, you can add
> one more annotator that runs at the end of your pipeline, after the
> annotators that detect instances of entities (I'm assuming you have
> several of those, of course).  This last annotator could get an iterator
> over everything in the index of the "EntityInstance" type, and use a
> standard Java HashMap to map each unique entity ID to a Java ArrayList
> of its "instances".  Then, for each ID, you could make one new Feature
> Structure, say of type "Entity", with features "uniqueID" and
> "instances", and set "instances" to a FeatureStructure array of the
> EntityInstances.
> HTH. -Marshall
> Tommaso Teofili wrote:
> > Hello everybody,
> > I am annotating a document text and now have a lot of annotations.
> > Many of those annotations refer to the same "entity", as described in the
> > UIMA Overview & SDK Setup (
> > http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/html/overview_and_setup/overview_and_setup.html#ugr.ovv.conceptual.metadata_in_cas
> > ).
> > I expected to find a predefined Entity type in UIMA, but I cannot find one.
> > Moreover, even when defining it myself, I can't find an appropriate range
> > type for the "occurencies" feature to store the annotations related to that
> > entity, as stated in the tutorial.
> > Any suggestions?
> > Thanks in advance,
> > Tommaso
> >
> >
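
The grouping step Marshall describes can be sketched in plain Java. This is only an illustration under stated assumptions, not UIMA code: the EntityInstance and Entity classes below are hypothetical stand-ins for the user-defined UIMA types of the same names, and EntityGrouper is an invented helper name. A LinkedHashMap is used in place of a plain HashMap only so that entities come out in first-seen order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the user-defined "EntityInstance" annotation:
// a text span (begin/end offsets) plus the unique ID of the entity it refers to.
final class EntityInstance {
    final String id;
    final int begin;
    final int end;

    EntityInstance(String id, int begin, int end) {
        this.id = id;
        this.begin = begin;
        this.end = end;
    }
}

// Hypothetical stand-in for the "Entity" feature structure with the
// "uniqueID" and "instances" features suggested in the thread.
final class Entity {
    final String uniqueID;
    final List<EntityInstance> instances;

    Entity(String uniqueID, List<EntityInstance> instances) {
        this.uniqueID = uniqueID;
        this.instances = instances;
    }
}

// Invented helper name; mirrors what the final annotator in the pipeline would do.
final class EntityGrouper {
    // Walks all instances once, collects them per entity ID,
    // then builds one Entity per distinct ID.
    static List<Entity> group(Iterable<EntityInstance> allInstances) {
        Map<String, List<EntityInstance>> byId = new LinkedHashMap<>();
        for (EntityInstance instance : allInstances) {
            byId.computeIfAbsent(instance.id, k -> new ArrayList<>()).add(instance);
        }
        List<Entity> entities = new ArrayList<>();
        for (Map.Entry<String, List<EntityInstance>> entry : byId.entrySet()) {
            entities.add(new Entity(entry.getKey(), entry.getValue()));
        }
        return entities;
    }
}
```

In a real annotator the input would presumably come from iterating the CAS index for the EntityInstance type, and each Entity would be a feature structure added to the CAS, with "instances" declared as an array-of-feature-structures range type (for example an FSArray) rather than a Java List.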
