From: Mario Juric
To: user@uima.apache.org
Subject: Re: Erratic block variable behaviour in Ruta
Date: Mon, 6 Nov 2017 18:26:48 +0100

Hi Peter,

Thanks for the explanation, and no problem with the delayed response.
I'll let you know the outcome of the change as soon as possible, but I
have the feeling that your suggestion will work as expected.

Best
Mario

> On 6 Nov 2017, at 17:01, Peter Klügl wrote:
>
> Hi Mario,
>
> sorry for the delayed response... I was travelling.
>
> First of all, there should be no multithreading issues in Ruta (in
> normal usage); at least, I am quite confident about that.
>
> My first guess would be that the problem is caused by the nature of
> variables and their initialization in Ruta.
>
> Initializing a variable with a value (e.g., BOOLEAN ignore = false;)
> does not reset its value during a loop like BLOCK, because variables
> are declared only once and are always global. The value only defines
> the initial value to which the variable is reset when the complete
> environment is reset (e.g., for a different CAS). The declaration is
> actually ignored in the execution of the block.
>
> So, you need to reset the value to false for each iteration in BLOCK.
> I wonder if your solution with the ASSIGN in the head rule of the
> block will work. The head rule is applied first in order to get the
> list of annotations (the windows for the block), so its action has
> already been applied before the actual iteration starts.
>
> Could you try something like this:
>
> BLOCK(ForEach) EnclosingAnnotation.property=="something"{} {
>     BOOLEAN ignore = false;
>     ASSIGN(ignore, false);
>     EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
>     EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
> }
>
> Best,
>
> Peter
>
> On 29 Oct 2017, at 17:49, Mario Juric wrote:
>> Hi Peter,
>>
>> We encountered a problem with a Ruta rule behaving erratically in a
>> multithreaded environment. We isolated the problem to the following
>> rule, shown in pseudo form:
>>
>> BLOCK(ForEach) EnclosingAnnotation.property=="something"{} {
>>     BOOLEAN ignore = false;
>>     EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
>>     EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
>> }
>>
>> We identified about 1000 documents where "AnotherAnnotation" above
>> should be created, and we reprocessed them several times on EC2
>> using Oracle JDK build 1.8.0_151 with both Ruta 2.5 and UIMA 2.9 as
>> well as Ruta 2.6.1 and UIMA 2.10.1. The share of inconsistent rule
>> firings varied erratically over many runs of the 1K documents, from
>> approximately 16% down to approximately 0.5%, but there were
>> inconsistencies in every run. Removing the ignore condition, of
>> course, made the issue disappear entirely, e.g.
>>
>> BLOCK(ForEach) EnclosingAnnotation.property=="something"{} {
>>     EnclosingAnnotation.name=="Hello"{-> CREATE(AnotherAnnotation, "name" = "World")};
>> }
>>
>> We haven't experienced the issue in a single-threaded environment
>> yet, but we are not entirely sure whether it is related to
>> multithreading. The nature of the problem could point towards a
>> thread-safety issue around shared data inside Ruta, but that is just
>> guessing. However, the workaround in our case was to rewrite the
>> rule as follows:
>>
>> BOOLEAN ignore = false;
>> BLOCK(ForEach) EnclosingAnnotation.property=="something"{-> ASSIGN(ignore, false)} {
>>     EnclosedAnnotation.property=="something else"{FEATURE("value", "ignorable") -> ASSIGN(ignore, true)};
>>     EnclosingAnnotation.name=="Hello"{IF(ignore == false) -> CREATE(AnotherAnnotation, "name" = "World")};
>> }
>>
>> I assume the BLOCK(ForEach) action happens for every occurrence, but
>> I haven't actually verified that yet, since there is usually only
>> one occurrence in this particular case. I was hoping you might be
>> able to shed some light on this and on the problems we experienced
>> with the variable declaration inside the block.
>>
>> Thanks
>> Mario
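
The difference between a declaration with an initial value and an explicit ASSIGN, as described in Peter's reply, can be illustrated outside Ruta. The following is a toy Python model only: it is not Ruta code and says nothing about how the Ruta interpreter is implemented. `Window`, `run_block`, and the sample data are hypothetical names made up for the illustration. The only assumption taken from the thread is that the initial value is applied once and that nothing resets the variable between block iterations unless an explicit assignment does.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """One hypothetical block window (a match of the block's head rule)."""
    has_ignorable: bool  # an enclosed annotation with value "ignorable" exists
    name: str


def run_block(windows, reset_each_iteration):
    """Toy model of the block body; returns the names that led to a CREATE."""
    ignore = False  # initial value, applied once (models the declaration)
    created = []
    for window in windows:
        if reset_each_iteration:
            ignore = False  # models an explicit ASSIGN(ignore, false)
        if window.has_ignorable:
            ignore = True  # models ASSIGN(ignore, true)
        if window.name == "Hello" and not ignore:
            created.append(window.name)  # models the CREATE action
    return created


# The first window sets the flag; the second should not be affected by it.
windows = [Window(has_ignorable=True, name="Hello"),
           Window(has_ignorable=False, name="Hello")]

print(run_block(windows, reset_each_iteration=False))  # [] (stale flag leaks)
print(run_block(windows, reset_each_iteration=True))   # ['Hello']
```

In this model the second window loses its annotation only because the flag set by the first window is never cleared, which is the kind of order-dependent result the explicit reset is meant to rule out.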