From: Roberto Franchini
Date: Mon, 19 Nov 2012 12:25:02 +0100
Subject: Re: drools for annotation sequences
To: user@uima.apache.org, Yasen Kiprov

On Mon, Nov 19, 2012 at 10:06 AM, Yasen Kiprov wrote:
> Hello,
> [cut]
>
> when
>   Token with text == Mr. and number i
>   Token with capital letter and number i + 1
> ...
>
> But it doesn't look right.
>
> Does anyone have any idea how such patterns can be modeled with Drools?

We do it the same way. Every annotation, whether it is emitted before the Drools grammars run or by the grammars themselves, has two additional features, posBegin and posEnd, where "pos" stands for "position". A single token has posBegin equal to posEnd, while a sentence has posEnd greater than posBegin.

So, in the "when" part of a rule, you can match a sequence of tokens (pseudo code):

    Token $t1 with text == "Mr." and $posBegin = posBegin
    Token $t2 with ortho == capitalized and posBegin == $posBegin + 1

And in the "then" part, emit the new annotation:

    then emit NE($t1.getStart, $t2.getEnd)

And yes, we adapted our type system to be used this way. The best thing is to write a small Drools DSL that encapsulates the frequently used "when" and "then" patterns.

Hope this helps,
cheers,
FRANK

-- 
Roberto Franchini
The impossible is inevitable.
http://www.celi.it http://www.blogmeter.it
http://github.com/celi-uim http://github.com/robfrank
Tel +39.011.562.71.15
jabber:ro.franchini@gmail.com skype:ro.franchini tw:@robfrankie
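
For readers who want to try the idea without a rule engine, here is a minimal sketch of the position-based matching described in the message, written in plain Java rather than Drools. Everything in it is an assumption made for illustration: the Token, NamedEntity and PositionMatcher classes are hypothetical stand-ins, not the UIMA type system or rules actually used at CELI. The nested loop imitates how a rule joins two facts through the posBegin == $posBegin + 1 constraint instead of relying on list order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token annotation (not a real UIMA type).
final class Token {
    final String text;
    final int begin;     // character offset where the token starts
    final int end;       // character offset where the token ends
    final int posBegin;  // index of the token in the token sequence
    final int posEnd;    // same as posBegin for a single token

    Token(String text, int begin, int end, int pos) {
        this.text = text;
        this.begin = begin;
        this.end = end;
        this.posBegin = pos;
        this.posEnd = pos;
    }

    boolean isCapitalized() {
        return !text.isEmpty() && Character.isUpperCase(text.charAt(0));
    }
}

// Hypothetical stand-in for the emitted annotation spanning two tokens.
final class NamedEntity {
    final int begin;
    final int end;
    final int posBegin;
    final int posEnd;

    NamedEntity(Token first, Token last) {
        this.begin = first.begin;
        this.end = last.end;
        this.posBegin = first.posBegin;
        this.posEnd = last.posEnd;  // greater than posBegin for multi-token spans
    }
}

final class PositionMatcher {
    // "when": Token $t1 with text "Mr." and a capitalized Token $t2 at position + 1
    // "then": emit a NamedEntity covering both tokens
    static List<NamedEntity> matchTitledNames(List<Token> tokens) {
        List<NamedEntity> found = new ArrayList<>();
        for (Token t1 : tokens) {
            if (!t1.text.equals("Mr.")) {
                continue;
            }
            for (Token t2 : tokens) {
                // Join on the position feature, not on iteration order.
                if (t2.isCapitalized() && t2.posBegin == t1.posBegin + 1) {
                    found.add(new NamedEntity(t1, t2));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("Mr.", 0, 3, 0));
        tokens.add(new Token("Smith", 4, 9, 1));
        tokens.add(new Token("arrived", 10, 17, 2));
        for (NamedEntity ne : matchTitledNames(tokens)) {
            System.out.println("NE chars " + ne.begin + "-" + ne.end
                    + ", positions " + ne.posBegin + "-" + ne.posEnd);
        }
    }
}
```

In a real Drools setup the two loops would become two patterns in the "when" part, and the engine would perform the join on the position feature itself.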