uima-user mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Unique Annotator value in CAS
Date Tue, 15 Apr 2008 13:44:51 GMT
Thilo Goetz wrote:
> Ashutosh Sharma wrote:
>> Hi All,
>> I have run the meeting annotator. Now I need to know how can I get 
>> the unique annotated value of meeting(if same value appears more than 
>> two times) in CAS consumer. Basically I am looking for the unique 
>> annotated values to put in the database. Is there any feature of UIMA 
>> to just fetch only unique values from the CAS.
>> Thanks & Regards,
>> Ashutosh Sharma
> Sorry, I at least don't understand what you mean.  Could
> you rephrase your question?  Give an example?
My guess is that a meeting detector found the same meeting twice in a 
document and annotated both occurrences, and he wants to know whether 
UIMA has a feature that would
a) detect that two annotations are "equal", and
b) provide some kind of iterator that fetches only one of them.

UIMA has some basic support for this kind of thing, in its "set" index.  
You can define an index with your own custom set of keys, as a "set" 
index.  This index will hold only "unique" instances of annotations.  
The uniqueness is defined by the keys being "equal".

In your case, for example, if two meetings count as "equal" when they 
have a) the same start and end date/time strings, and b) the same 
room-number, then you would create a set index using just these 3 
features (start-time, end-time, and room-number) as keys.
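
To make the "equal keys" idea concrete, here is a minimal plain-Java 
sketch of the same semantics. It does not use the UIMA API: 
MeetingRecord, SetIndexSketch, KEYS and unique are hypothetical names 
made up for illustration, and a TreeSet with a three-key comparator 
stands in for the set index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for a meeting annotation; not a UIMA type.
final class MeetingRecord {
    final String startTime;
    final String endTime;
    final String roomNumber;
    final int begin; // document offset; deliberately NOT one of the keys

    MeetingRecord(String startTime, String endTime, String roomNumber, int begin) {
        this.startTime = startTime;
        this.endTime = endTime;
        this.roomNumber = roomNumber;
        this.begin = begin;
    }
}

final class SetIndexSketch {
    // Two records are "equal" when all three key features compare equal.
    static final Comparator<MeetingRecord> KEYS =
        Comparator.comparing((MeetingRecord m) -> m.startTime)
                  .thenComparing(m -> m.endTime)
                  .thenComparing(m -> m.roomNumber);

    // Keeps the first record seen for each distinct key combination.
    static List<MeetingRecord> unique(List<MeetingRecord> all) {
        TreeSet<MeetingRecord> set = new TreeSet<>(KEYS);
        for (MeetingRecord m : all) {
            set.add(m); // no-op if a record with equal keys is already present
        }
        return new ArrayList<>(set);
    }
}
```

Passing in two records with identical start-time, end-time and 
room-number but different offsets yields a single entry, which is 
roughly what iterating over a set index with those three keys would 
give you.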

Of course, this notion of equality is too simple for actual use.  For 
example, it would treat dates and times that represent the same 
date/time but are written in different forms, such as 4/15/2008 and 
April 15, 2008, as not-equal.  The same is true of the "meeting 
annotator" - which is only intended as a teaching example ;-)
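
One possible way around the written-form problem is to store a 
normalized string in the key feature instead of the text as written.  
The following is only a sketch in plain Java: DateKeySketch and 
canonicalDate are hypothetical names, and the two accepted patterns 
(US-style M/d/yyyy and "Month d, yyyy") are assumptions chosen to 
cover the two forms mentioned above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class DateKeySketch {
    // Assumed input formats; a real annotator would likely need more.
    private static final List<DateTimeFormatter> FORMATS = Arrays.asList(
        DateTimeFormatter.ofPattern("M/d/uuuu", Locale.US),
        DateTimeFormatter.ofPattern("MMMM d, uuuu", Locale.US));

    // Returns an ISO-8601 date string (e.g. 2008-04-15) when the input
    // matches one of the assumed formats, otherwise the input unchanged.
    static String canonicalDate(String written) {
        String text = written.trim();
        for (DateTimeFormatter format : FORMATS) {
            try {
                return LocalDate.parse(text, format).toString();
            } catch (DateTimeParseException e) {
                // Not this format; try the next one.
            }
        }
        return written;
    }
}
```

With this, canonicalDate("4/15/2008") and canonicalDate("April 15, 2008") 
both give "2008-04-15", so a key built from the normalized value would 
compare as equal.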

