uima-user mailing list archives

From Mario Juric ...@unsilo.ai>
Subject Re: Erratic block variable behaviour in Ruta
Date Mon, 06 Nov 2017 17:26:48 GMT
Hi Peter,

Thanks for the explanation, and no problem with the delayed response. I’ll let you know
the outcome of the change as soon as possible, but I have a feeling that your suggestion
will work as expected.

Best
Mario

> On 6 Nov 2017, at 17:01 , Peter Klügl <peter.kluegl@averbis.com> wrote:
> 
> Hi Mario,
> 
> 
> sorry for the delayed response... I was travelling.
> 
> 
> First of all, there should be no multithreading issues in Ruta (in
> normal usage); at least, I am quite confident about that.
> 
> 
> My first guess would be that the problem is caused by the nature of
> variables and their initialization in Ruta.
> 
> The initialization of a variable with a value (e.g., BOOLEAN ignore =
> false;) does not reset its actual value during a loop like BLOCK, because
> variables are declared only once and are always global. The value in the
> declaration only defines the initial value, to which the variable is
> reset when the complete environment is reset (e.g., for a different CAS).
> The declaration is effectively ignored during the execution of the block.
> 
> So, you need to reset the value to false for each iteration in BLOCK. I
> wonder if your solution with the ASSIGN in the head rule of the block
> will work. The rule is applied in order to get a list of annotations
> (windows for the block), and so the action is already applied before the
> actual iteration starts.
> 
> Could you try something like this:
> 
> 
> BLOCK(ForEach) EnclosingAnnotation.property=="something" {} {
>     BOOLEAN ignore = false;
>     ASSIGN(ignore, false);
>     EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
>     EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
> }
> 
> 
> 
> Best,
> 
> Peter
> 
> 
> Am 29.10.2017 um 17:49 schrieb Mario Juric:
>> Hi Peter,
>> 
>> We encountered a problem with a Ruta rule behaving erratically in a
>> multithreaded environment. We isolated the problem to the following rule,
>> shown in pseudo form:
>> BLOCK(ForEach) EnclosingAnnotation.property=="something" {} {
>>    BOOLEAN ignore = false;
>>    EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
>>    EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
>> }
>> We identified about 1000 documents where “AnotherAnnotation” above should be
>> created, and we reprocessed them several times on EC2 using Oracle JDK build
>> 1.8.0_151 with both Ruta 2.5 and UIMA 2.9 as well as Ruta 2.6.1 and UIMA
>> 2.10.1. The rate of inconsistencies in rule firing over many runs of the 1K
>> documents varied erratically, from approximately 16% down to approximately
>> 0.5%, but there were always inconsistencies in every run. Removing the ignore
>> condition of course made the issue disappear entirely, e.g.:
>> 
>> BLOCK(ForEach) EnclosingAnnotation.property=="something" {} {
>>    EnclosingAnnotation.name=="Hello"{ -> CREATE(AnotherAnnotation, "name" = "World")};
>> }
>> We haven’t experienced the issue in a single-threaded environment yet, but we
>> are not entirely sure whether it is related to multithreading. The nature of
>> the problem could point towards thread-safety issues around shared data
>> inside Ruta, but that is just a guess. However, the workaround in our case
>> was to rewrite the rule as follows:
>> BOOLEAN ignore = false;
>> BLOCK(ForEach) EnclosingAnnotation.property=="something" {-> ASSIGN(ignore, false)} {
>>    EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
>>    EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
>> }
>> I assume the BLOCK(ForEach) action happens for every occurrence, but I haven’t
>> actually verified that yet, since there is usually only one occurrence in this
>> particular case. I was hoping you might be able to shed some light on this and
>> on the problems we experienced with the variable declaration inside the block.
>> 
>> Thanks
>> Mario
> 
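
The variable behaviour Peter describes in this thread (a declaration takes effect only once, variables are global, the initial value is restored only on a full environment reset, and a head-rule action runs before the iteration starts) can be illustrated with a small toy model. This is a sketch in Python, not Ruta code, and it makes no claim about how Ruta is implemented; the names Environment, run_block, reset_in and the two sample windows are all hypothetical and exist only for this illustration.

```python
class Environment:
    """Toy model of a script environment (hypothetical, not the Ruta API)."""

    def __init__(self):
        self.initial = {}  # initial values recorded by declarations
        self.values = {}   # current values, shared by everything (global)

    def declare(self, name, initial):
        # A declaration takes effect only once; later executions are ignored.
        if name not in self.initial:
            self.initial[name] = initial
            self.values[name] = initial

    def reset(self):
        # Models a full environment reset, e.g. for a different CAS.
        self.values = dict(self.initial)


def run_block(env, windows, reset_in):
    """Return the ids of windows where the CREATE-like step fires.

    reset_in is "none", "head" (reset while collecting windows) or "body"
    (reset at the start of every iteration).
    """
    env.declare("ignore", False)
    if reset_in == "head":
        # Assumption taken from the thread: the head rule's action is applied
        # for all matches before the actual iteration starts.
        for _ in windows:
            env.values["ignore"] = False
    created = []
    for w in windows:
        env.declare("ignore", False)      # no effect: already declared
        if reset_in == "body":
            env.values["ignore"] = False  # stands in for ASSIGN(ignore, false)
        if w["ignorable"]:
            env.values["ignore"] = True   # stands in for ASSIGN(ignore, true)
        if w["name"] == "Hello" and not env.values["ignore"]:
            created.append(w["id"])
    return created


# Two made-up windows: the first one sets the flag, the second should not see it.
windows = [
    {"id": 1, "name": "Hello", "ignorable": True},
    {"id": 2, "name": "Hello", "ignorable": False},
]

for mode in ("none", "head", "body"):
    print(mode, run_block(Environment(), windows, mode))
```

In this model only the "body" variant lets the second window fire, because the flag set in the first window is cleared at the start of each iteration. Whether real Ruta scripts behave the same way has to be checked against Ruta itself.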

