Date: Mon, 19 Nov 2012 14:01:31 +0100 (CET)
From: Alexander Klenner
To: user@uima.apache.org
Subject: Re: drools for annotation sequences

Hi Yasen, Roberto and others,

we ran into the same problem after one day of using Drools. An artificial index doesn't work for us, however: we have many Annotations and a complex TypeSystem, so we would have to touch all our AEs to get this index working, which is not possible for us.

What do you think about an ArrayList in which all Annotations of one Type are stored in their order of appearance (e.g. sorted by Begin)? Instead of only adding all Annotation classes to Drools, we now also add this ArrayList, which we create in our Drools AE. Does such an approach make sense? We can ask for their respective ordering by using indexOf(Object).

Best regards,
Alex

--
Alexander G. Klenner
Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven, D-53754 Sankt Augustin
Tel.: +49 - 2241 - 14 - 2736
E-mail: alexander.garvin.klenner@scai.fraunhofer.de
Internet: http://www.scai.fraunhofer.de

----- Original Message -----
From: "Roberto Franchini"
To: user@uima.apache.org, "Yasen Kiprov"
Sent: Monday, November 19, 2012 12:25:02 PM
Subject: Re: drools for annotation sequences

On Mon, Nov 19, 2012 at 10:06 AM, Yasen Kiprov wrote:
> Hello,
> [cut]
>
> when
> Token with text == Mr. and number i
> Token with capital letter and number i + 1
> ...
>
> But it doesn't look right.
>
> Does anyone have any idea how such patterns can be modeled with Drools?

We do it the same way. Every annotation emitted before the Drools grammars, and by the grammars themselves, has two additional features, posBegin and posEnd, where "pos" stands for "position". A single token has posBegin equal to posEnd, while a sentence has posEnd greater than posBegin.

So, in the "when" part of a rule, you can match a sequence of tokens (pseudo code):

    Token $t1 with text == Mr. and $posBegin = posBegin
    Token $t2 with ortho == capitalized and posBegin == $posBegin + 1

And in the "then" part, emit new annotations:

    then emit NE($t1.getStart, $t2.getEnd)

And yes, we adapted our type system to be used this way. Better still is to write a little Drools DSL that encapsulates frequently used "when" and "then" patterns.

Hope this helps,
cheers,
FRANK

--
Roberto Franchini
The impossible is inevitable.
http://www.celi.it
http://www.blogmeter.it
http://github.com/celi-uim
http://github.com/robfrank
Tel +39.011.562.71.15
jabber:ro.franchini@gmail.com skype:ro.franchini tw:@robfrankie
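
The two ideas in this thread (a list of annotations sorted by begin offset, queried with indexOf(Object), and a "position + 1" test for the neighbouring token) can be sketched in plain Java. This is only an illustration under stated assumptions, not code from either poster: it uses no UIMA or Drools classes, and the names PositionIndexSketch, Tok, sortedByBegin and findTitledNames are all hypothetical. In a real setup the list would be built in the Drools AE and inserted into the rule session alongside the annotations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PositionIndexSketch {

    /** Hypothetical stand-in for a token annotation: covered text plus character offsets. */
    static final class Tok {
        final String text;
        final int begin;
        final int end;

        Tok(String text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Returns a new list holding the tokens in order of appearance (sorted by begin offset). */
    static List<Tok> sortedByBegin(List<Tok> tokens) {
        List<Tok> sorted = new ArrayList<>(tokens);
        sorted.sort(Comparator.comparingInt((Tok t) -> t.begin));
        return sorted;
    }

    /**
     * Finds every "Mr." token that is directly followed by a capitalized token
     * and returns the {begin, end} character span covering both tokens.
     */
    static List<int[]> findTitledNames(List<Tok> sorted) {
        List<int[]> spans = new ArrayList<>();
        for (Tok t1 : sorted) {
            if (!t1.text.equals("Mr.")) {
                continue;
            }
            // The indexOf(Object) lookup from the thread; note it is a linear scan.
            int i = sorted.indexOf(t1);
            if (i + 1 >= sorted.size()) {
                continue; // "Mr." is the last token, so there is no successor
            }
            Tok t2 = sorted.get(i + 1);
            if (!t2.text.isEmpty() && Character.isUpperCase(t2.text.charAt(0))) {
                spans.add(new int[] {t1.begin, t2.end});
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // Tokens of "Yesterday Mr. Smith met mr. jones", deliberately out of order.
        List<Tok> tokens = Arrays.asList(
                new Tok("Smith", 14, 19),
                new Tok("Yesterday", 0, 9),
                new Tok("jones", 28, 33),
                new Tok("Mr.", 10, 13),
                new Tok("mr.", 24, 27),
                new Tok("met", 20, 23));

        for (int[] span : findTitledNames(sortedByBegin(tokens))) {
            System.out.println("NE span: " + span[0] + ".." + span[1]);
        }
    }
}
```

One thing to keep in mind with the list approach: indexOf(Object) walks the list on every call, so with many annotations per document a map from annotation to position (filled once while building the list) may be the cheaper lookup.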