Subject: Re: UIMA Ruta: assign distant annotations to features
To: user@uima.apache.org
From: Peter Klügl
Date: Wed, 14 Oct 2015 12:03:32 +0200

Sorry, there was a copy/paste mistake:

A # @B{-> GATHER(Projection, "a"=1)};

should read

A # @B{-> GATHER(Projection, "value"=1)};

Best,

Peter

On 14.10.2015 at 11:04, Peter Klügl wrote:
> Hi,
>
> there were some questions about assignments of annotations to features
> lately. This can be quite annoying right now. The problem is caused by
> two missing language elements: variables for storing annotations, which
> can be used to refer to annotations across rules.
> Explicit references to match contexts (rule elements), which facilitate
> the assignment within rules (local variables). I hope I will be able to
> add both language elements in UIMA Ruta 2.4.0.
>
> Here is a short description of how I solve these use cases in UIMA Ruta
> right now.
>
> Let's assume we have some exemplary document:
>
> Some text
> A
> Some text
> More text
> B
> Some text
> B
>
> A and B shall be annotations of the types A and B. The goal is to create
> an annotation of type C for each annotation B with the same offsets as
> B, and to assign the annotation A to a feature "a" of the annotation C.
>
> In this simple case, this could be done with GATHER:
>
> A # @B{-> GATHER(C, "a"=1)};
>
> This rule starts to match on an annotation of type B (rule element 3).
> Then, it searches for the next annotation of the type A in front of it
> (rule element 1). If this succeeds, the GATHER action creates a new
> annotation of type C with the offsets of the current annotation B
> (rule element 3) and assigns the annotation matched by rule element 1 to
> the feature "a" of the annotation C. Then, the rule continues with the
> next annotation of type B. Thus, we get an annotation C for each
> annotation B if there is an annotation A somewhere before it.
>
> Unfortunately, this is often not possible, e.g., because we need to
> assign additional values to other features of the annotation C, which is
> not supported by GATHER. Or we cannot directly use type A because the
> actual annotation we want to assign is stored in a feature of A.
>
> In order to work around the missing language elements that could be used
> to solve this problem, I normally introduce a projection type: an
> annotation that stores an annotation but is placed somewhere else in the
> document.
>
> DECLARE Projection (Annotation value);
> A # @B{-> GATHER(Projection, "a"=1)};
> B{-> CREATE(C, "x" = "String"), C.a=Projection.value};
>
> The first line declares a new type with a feature "value".
> The first rule (second line) connects the annotation A with each
> position of B. Then, the last rule creates the annotation C, stores
> "String" in the feature "x", and assigns the annotation stored in the
> feature "value" of the Projection annotation that has the same offsets
> as B (the match context) to the feature "a" of the annotation C.
>
> This approach does not fit all use cases, but it can be modified to do
> so in most cases, e.g., the CREATE and the feature assignment can be
> separated in order to use them in different BLOCKs, or a feature of A
> instead of A itself is stored in Projection.
>
> This problem can also be solved with other approaches in UIMA Ruta. Feel
> free to share your approaches with us. If anyone thinks that this mail
> was useful, then I would add it to the documentation.
>
> Best,
>
> Peter
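
For readers who want to check their expectations against the description above, here is a minimal sketch in plain Python of the result the GATHER rule is described as producing: one C per B, carrying the nearest A in front of it. This is only a model, not UIMA Ruta or the UIMA API. The names Span, C, nearest_preceding and build_c, as well as the offsets, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for an annotation: a type name plus offsets."""
    kind: str
    begin: int
    end: int


@dataclass
class C:
    """Hypothetical stand-in for the target type C with its feature "a"."""
    begin: int
    end: int
    a: Span


def nearest_preceding(spans: List[Span], kind: str, anchor: Span) -> Optional[Span]:
    """Return the closest span of the given kind that ends before the anchor begins."""
    candidates = [s for s in spans if s.kind == kind and s.end <= anchor.begin]
    return max(candidates, key=lambda s: s.end, default=None)


def build_c(spans: List[Span]) -> List[C]:
    """Create one C per B that has an A somewhere in front of it."""
    result = []
    for b in sorted((s for s in spans if s.kind == "B"), key=lambda s: s.begin):
        a = nearest_preceding(spans, "A", b)
        if a is None:
            continue  # no A in front of this B, so nothing is created for it
        result.append(C(b.begin, b.end, a))
    return result


# Offsets are made up; the order mirrors the example document (A, then B, B).
spans = [Span("A", 10, 11), Span("B", 40, 41), Span("B", 52, 53)]
cs = build_c(spans)
```

With the example document, both B positions end up with a C that points to the same A, because that A is the closest one in front of each B.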