uima-user mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject UIMA Ruta: assign distant annotations to features
Date Wed, 14 Oct 2015 09:04:46 GMT
Hi,

there were some questions about assigning annotations to features
lately. This can be quite annoying right now. The problem is caused by
two missing language elements: variables for storing annotations, which
could be used to refer to annotations across rules, and explicit
references to match contexts (rule elements), which would facilitate the
assignment within rules (local variables). I hope I will be able to add
both language elements in UIMA Ruta 2.4.0.

Here is a short description of how I solve these use cases in UIMA Ruta
right now.

Let's assume we have the following example document:

Some text
A
Some text
More text
B
Some text
B

A and B shall be annotations of the types A and B, respectively. The
goal is to create an annotation of type C for each annotation B, with
the same offsets as B, and to assign the annotation A to a feature "a"
of that annotation C.

In this simple case, this could be done with GATHER:

A # @B{-> GATHER(C, "a"=1)};

This rule starts to match on an annotation of type B (rule element 3).
Then, it searches backwards for the closest annotation of type A in
front of it (rule element 1). If this succeeds, the GATHER action
creates a new annotation of type C with the offsets of the current
annotation B (rule element 3) and assigns the annotation matched by rule
element 1 to the feature "a" of the annotation C. Then, the rule
continues with the next annotation of type B. Thus, we get an annotation
C for each annotation B if there is an annotation A somewhere before it.
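
For reference, the rule above could be embedded in a complete script
roughly as follows. This is a minimal, untested sketch: the type
declarations, the feature type Annotation for "a", and the two helper
rules that mark the single-letter lines as A and B are assumptions made
for illustration and are not part of the original example.

```
// Assumed declarations; the example takes A, B and C as given.
DECLARE A, B;
DECLARE C (Annotation a);

// Assumption for illustration only: mark the lines "A" and "B".
W{REGEXP("A") -> MARK(A)};
W{REGEXP("B") -> MARK(B)};

// Start at B (rule element 3), look back for the closest A
// (rule element 1), create C on B and store that A in feature "a".
A # @B{-> GATHER(C, "a" = 1)};
```

On the example document, both annotations B should end up with an
annotation C whose feature "a" points to the single annotation A.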

Unfortunately, this is often not possible, e.g., because we need to
assign additional values to other features of the annotation C, which is
not supported by GATHER. Or we cannot directly use type A because the
actual annotation we want to assign is stored in a feature of A.

In order to work around the missing language elements that could be used
to solve this problem, I normally introduce a projection type: an
annotation that stores another annotation but is placed somewhere else
in the document.

DECLARE Projection (Annotation value);
A # @B{-> GATHER(Projection, "value"=1)};
B{-> CREATE(C, "x" = "String"), C.a=Projection.value};

The first line declares a new type Projection with a feature "value".
The first rule (second line) creates an annotation Projection at the
position of each B and stores the preceding annotation A in it. Then,
the last rule creates the annotation C, stores "String" in the feature
"x", and assigns the annotation stored in the feature "value" of the
annotation Projection that has the same offsets as B (the match context)
to the feature "a" of the annotation C.

This approach does not fit all use cases, but it can be modified to do
so in most cases, e.g., the CREATE and feature assignment can be
separated in order to use them in different BLOCKs, or a feature of A
instead of A itself is stored in Projection.
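
The first of these modifications, separating CREATE from the feature
assignment, might look like the following. This is a minimal, untested
sketch: the declarations of A, B and C and the feature type Annotation
for "a" are assumptions, and the Projection is filled through its
declared feature "value".

```
// Assumed declarations for the example types.
DECLARE A, B;
DECLARE C (Annotation a, STRING x);
DECLARE Projection (Annotation value);

// Step 1: put a Projection on each B that carries the preceding A.
A # @B{-> GATHER(Projection, "value" = 1)};

// Step 2: create C on each B and fill the simple feature.
B{-> CREATE(C, "x" = "String")};

// Step 3: later, possibly in another BLOCK, copy the stored
// annotation from the Projection with the same offsets as C.
C{-> C.a = Projection.value};
```

Since C, B and Projection share the same offsets, step 3 does not
depend on where the annotation A is located in the document.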

This problem can also be solved with other approaches in UIMA Ruta. Feel
free to share your approaches with us. If anyone thinks that this mail
was useful, then I would add it to the documentation.

Best,

Peter
