uima-user mailing list archives

From Mario Juric ...@unsilo.ai>
Subject Erratic block variable behaviour in Ruta
Date Sun, 29 Oct 2017 16:49:03 GMT
Hi Peter,

We encountered a problem with a Ruta rule behaving erratically in a multithreaded environment.
We isolated the problem to the following rule shown in pseudo form:
BLOCK(ForEach) EnclosingAnnotation.property=="something" {} {
    BOOLEAN ignore = false;
    EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
    EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
}
We identified about 1000 documents where “AnotherAnnotation” above should be created,
and we reprocessed them several times on EC2 using Oracle JDK build 1.8.0_151
with both Ruta 2.5 and UIMA 2.9 as well as Ruta 2.6.1 and UIMA 2.10.1. Over many runs of
the 1K documents, the share of inconsistent rule firings varied erratically from approximately
16% down to approximately 0.5%, but there were always inconsistencies in every run. Removing
the ignore condition of course made the issue disappear entirely, e.g.

BLOCK(ForEach) EnclosingAnnotation.property=="something" {} {
    EnclosingAnnotation.name=="Hello"{ -> CREATE(AnotherAnnotation, "name" = "World")};
}
We haven’t experienced the issue in a single-threaded environment yet, but we are not entirely
sure whether it is related to multithreading. The nature of the problem could point
in the direction of a thread-safety issue around shared data inside Ruta, but that is
just guessing. However, the workaround in our case was to rewrite the rule as follows:
BOOLEAN ignore = false;
BLOCK(ForEach) EnclosingAnnotation.property=="something" {-> ASSIGN(ignore, false)} {
    EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
    EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
}
I assume the BLOCK(ForEach) action happens for every occurrence, but I haven’t actually verified
that yet, since there is usually only one occurrence in this particular case. I was hoping
you might be able to shed some light on this, and on the problems we experienced with the variable
declaration inside the block.
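
The behaviour expected from both versions of the rule can be modelled outside Ruta. What follows is only a plain-Java sketch, not Ruta or UIMA code: the class BlockVariableSketch, its nested types and the method countCreated are hypothetical names introduced for illustration, and it assumes the flag is reset to false for every block occurrence.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BlockVariableSketch {

    /** Hypothetical stand-in for EnclosedAnnotation: only the two features the rule reads. */
    public static final class Enclosed {
        final String property;
        final String value;

        public Enclosed(String property, String value) {
            this.property = property;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for EnclosingAnnotation plus the annotations it covers. */
    public static final class Enclosing {
        final String property;
        final String name;
        final List<Enclosed> covered;

        public Enclosing(String property, String name, List<Enclosed> covered) {
            this.property = property;
            this.name = name;
            this.covered = covered;
        }
    }

    /** Counts how many AnotherAnnotation instances the rule is expected to create. */
    public static int countCreated(List<Enclosing> document) {
        int created = 0;
        for (Enclosing enclosing : document) {
            // Head of the block: EnclosingAnnotation.property == "something"
            if (!"something".equals(enclosing.property)) {
                continue;
            }
            // Assumption: the flag starts as false for every block occurrence.
            boolean ignore = false;
            for (Enclosed enclosed : enclosing.covered) {
                // First inner rule: an ignorable enclosed annotation sets the flag.
                if ("something else".equals(enclosed.property)
                        && "ignorable".equals(enclosed.value)) {
                    ignore = true;
                }
            }
            // Second inner rule: create only when the flag is still false.
            if ("Hello".equals(enclosing.name) && !ignore) {
                created++;
            }
        }
        return created;
    }

    public static void main(String[] args) {
        List<Enclosing> document = Arrays.asList(
                new Enclosing("something", "Hello",
                        Arrays.asList(new Enclosed("something else", "ignorable"))),
                new Enclosing("something", "Hello",
                        Collections.<Enclosed>emptyList()));
        System.out.println("created = " + countCreated(document)); // prints "created = 1"
    }
}
```

In this model the second occurrence still produces an annotation although the first one set the flag. If the flag leaked from one occurrence to the next, or between documents processed on different threads, the count would drop to 0, which is the kind of missed rule firing described above.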

Thanks
Mario