uima-user mailing list archives

From Roberto Franchini <franch...@celi.it>
Subject Re: drools for annotation sequences
Date Mon, 19 Nov 2012 11:25:02 GMT
On Mon, Nov 19, 2012 at 10:06 AM, Yasen Kiprov <yasenkiprov@yahoo.com> wrote:
> Hello,

> when
> Token with text == Mr. and number i
> Token with capital letter and number i + 1
> ...
> But it doesn't look right.
> Does anyone have any idea how such patterns can be modeled with Drools?

We do it the same way.
Every annotation emitted before the Drools grammars run, and every
annotation emitted by the grammars themselves, has two additional
features, posBegin and posEnd, where "pos" stands for "position".
A single token has posBegin equal to posEnd, while an annotation that
spans several tokens, such as a sentence, has posEnd greater than posBegin.
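
To make the position features concrete, here is a rough sketch in plain Java of how they could be assigned. It is only an illustration: Span and PositionIndexer are hypothetical names, not our real type system and not UIMA classes, and it assumes the tokens are already sorted by character offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an annotation; not a real UIMA type.
final class Span {
    final String type;   // e.g. "Token" or "Sentence"
    final int start;     // character offset, inclusive
    final int end;       // character offset, exclusive
    int posBegin = -1;   // position of the first covered token
    int posEnd = -1;     // position of the last covered token

    Span(String type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }
}

final class PositionIndexer {
    // Tokens are assumed sorted by start offset.
    // Each token covers only itself, so posBegin == posEnd.
    static void indexTokens(List<Span> tokens) {
        for (int i = 0; i < tokens.size(); i++) {
            tokens.get(i).posBegin = i;
            tokens.get(i).posEnd = i;
        }
    }

    // A larger annotation takes the positions of the first and
    // last token that lie inside its character offsets.
    static void indexCovering(Span span, List<Span> tokens) {
        for (Span t : tokens) {
            if (t.start >= span.start && t.end <= span.end) {
                if (span.posBegin < 0) {
                    span.posBegin = t.posBegin;
                }
                span.posEnd = t.posEnd;
            }
        }
    }
}
```

With three tokens and one sentence covering all of them, the tokens get positions 0, 1 and 2, and the sentence gets posBegin 0 and posEnd 2.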

So, in the "when" part of a rule, you can match a sequence of tokens (pseudo code):

Token $t1 with text == "Mr." and $posBegin = posBegin
Token $t2 with ortho == capitalized and posBegin == $posBegin + 1

And in the "then" part, emit the new annotation:
emit NE($t1.getStart, $t2.getEnd)
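
For readers who want to see what that join amounts to, here is a minimal sketch of the same logic in plain Java rather than Drools. Tok, NamedEntity and MrRule are hypothetical names used only for illustration, and the capitalization check stands in for whatever the ortho feature really holds in your type system.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical token with character offsets and a token position.
final class Tok {
    final String text;
    final int start;
    final int end;
    final int posBegin;

    Tok(String text, int start, int end, int posBegin) {
        this.text = text;
        this.start = start;
        this.end = end;
        this.posBegin = posBegin;
    }

    // Assumed stand-in for "ortho == capitalized".
    boolean isCapitalized() {
        return !text.isEmpty() && Character.isUpperCase(text.charAt(0));
    }
}

// Hypothetical output annotation.
final class NamedEntity {
    final int start;
    final int end;

    NamedEntity(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

final class MrRule {
    static List<NamedEntity> apply(List<Tok> tokens) {
        // Index tokens by position so the "next token" lookup is direct.
        Map<Integer, Tok> byPos = new HashMap<>();
        for (Tok t : tokens) {
            byPos.put(t.posBegin, t);
        }
        List<NamedEntity> out = new ArrayList<>();
        for (Tok t1 : tokens) {
            // "when": $t1 has text "Mr." ...
            if (!t1.text.equals("Mr.")) {
                continue;
            }
            // ... and $t2 sits at posBegin + 1 and is capitalized.
            Tok t2 = byPos.get(t1.posBegin + 1);
            if (t2 != null && t2.isCapitalized()) {
                // "then": emit an annotation from the start of $t1 to the end of $t2.
                out.add(new NamedEntity(t1.start, t2.end));
            }
        }
        return out;
    }
}
```

In Drools the engine performs this join for you; the sketch only shows why posBegin == $posBegin + 1 is enough to express "the next token".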

And yes, we adapted our type system to be used this way.
The best approach is to write a small Drools DSL that encapsulates
the frequently used "when" and "then" patterns.

Hope this helps,

Roberto Franchini
The impossible is inevitable.
http://www.celi.it                     http://www.blogmeter.it
http://github.com/celi-uim       http://github.com/robfrank
Tel +39.011.562.71.15
jabber:ro.franchini@gmail.com skype:ro.franchini tw:@robfrankie
