Subject: Re: UIMA Ruta use of features in a block statement
To: user@uima.apache.org
From: Peter Klügl
Date: Wed, 31 May 2017 09:59:01 +0200

Hi,

On 30.05.2017 at 19:35, Sumit Madan wrote:
>
> We were not able to access the annotation within the BLOCK.
>
> ###
> STRING s;
> BOOLEAN a;
>
> // This is not working for us:
> BLOCK(forEACH) Lemma{}{
>     Lemma{->MATCHEDTEXT(s), ASSIGN(a,contains(s,"er"))};
>     Lemma{a ->Test1};
> }
>
> // This is working:
> BLOCK(forEACH) Lemma{}{
>     W{->MATCHEDTEXT(s), ASSIGN(a,contains(s,"er"))};
>     W{a ->Test2};
> }
>
> // This is also working:
> BLOCK(forEACH) Lemma{}{
>     Document{->MATCHEDTEXT(s), ASSIGN(a,contains(s,"er"))};
>     Document{a ->Test3};
> }
> ###
>

All three examples should work. Which Ruta version do you use? It looks
like a bug.

>>
>>> Or is the block statement not needed at all?
>> Yes, they are not needed at all, also not in similar use cases. If you
>> have several conditions with separate actions, you can increase the
>> speed using the FOREACH block.
>
> Alright, that makes sense.
>
>>
>>> I used it because I found it in the user guideline in relation to
>>> NER-task:
>>>
>>> STRING s;
>>> BOOLEAN a;
>>> BLOCK(forEACH) W{}{
>>>     W{->MATCHEDTEXT(s), ASSIGN(a,contains(s,"er"))};
>>>     W{a ->Test};
>>> }
>>> So, what's the advantage of the block statement here?
>> There is no advantage. I assume that it was necessary when the string
>> functions were introduced. However, the language evolved quite a bit
>> since then. This example should just illustrate some usage of the
>> function. Now, you can also use the function as an implicit condition.
>> Thus, you do not need a separate rule after the assign action. Thus, you
>> do not need the block to restrict the window in order to restrict the
>> usage of the global variable (which is also not required). It's now a
>> really bad example...
>>
>> Now I would write it as:
>>
>> W{contains(W.ct, "er") -> Test};
>
> We weren't aware that we can use contains() as a condition too.
> I think the documentation [1] doesn't describe that string functions
> with a boolean return value can be used as conditions. I think one can
> infer it from the source code (public class
> ContainsBooleanFunction extends BooleanFunctionExpression) under [2].
>
> [1] https://uima.apache.org/d/ruta-current/tools.ruta.book.html
> [2] https://github.com/apache/uima-ruta/blob/trunk/ruta-core-ext/src/main/java/org/apache/uima/ruta/string/bool/ContainsBooleanFunction.java
>

Thanks, I'll extend the documentation.

>>
>>> And one other question:
>>> We integrated Ruta into the UIMA pipeline, but in the debug mode,
>>> the ruta views "matched rules", "applied rules", "failed rules" etc.
>>> are all empty.
>>> How can we fix this?
>> Can you give me more information? Did you directly launch the ruta
>> script in a ruta debug launch configuration or did you include the AE in
>> a pipeline and launch that in a java debug launch config?
>
> We integrated the AE in our pipeline as we have our own type system.
> The steps we took to integrate Ruta in our environment:
> 1. Read the BasicEngine.xml as AnalysisEngineDescription.
> 2. Modify it and add the merged typesystem (Ruta typesystem + ours)
> 3. Modify some parameters (such as extensions, debugging, statistics)
> 4. Write the modified AnalysisEngineDescription as a temporary file
> 5. Create a new aggregate AE (AAED.xml) with a different view name
>    (sofaName) and use the temporary file as the AnalysisEngine
>    (descriptorLocation)
> 6. Produce AE, process CAS and destroy AE
>
> All these steps are included in an apply() method, which is very
> similar to Ruta::apply(CAS cas, String script, Map
> parameters). We just call our apply() method in our UIMA analysis
> engines if we need to extract information with Ruta.
>
>> In general,
>> you need to set the two debug config parameters of the RutaEngine to
>> true in order to add debug information to the CAS: DEBUG and
>> DEBUG_WITH_MATCHES
>
> They are activated and the CAS contains the debugging information but
> still the views are all empty. Do we have to do something else to fill
> the views as we are not in the workbench?
>

Hmm, I do not know, this all sounds correct. Did you switch the CAS
view in the CAS Editor?

Best,

Peter
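
P.S. A minimal sketch of how the first (non-working) example might be
written with the implicit condition instead of the BLOCK, following the
same pattern as the W rule above. It assumes that Lemma and Test1 are
declared or imported as in the quoted script and that the contains()
string function is available in the engine configuration. It is
untested against that type system:

```
// Hypothetical rewrite: contains() is evaluated for each Lemma annotation,
// so neither the BLOCK nor the variables s and a are needed.
Lemma{contains(Lemma.ct, "er") -> Test1};
```

If this rule matches while the BLOCK variant does not, that would
support the suspicion that the BLOCK behaviour is a bug.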