uima-user mailing list archives

From Peter Klügl <pklu...@uni-wuerzburg.de>
Subject Re: Ruta unmark tokens from given possition.
Date Wed, 07 Jan 2015 16:52:58 GMT
Hi,

Am 07.01.2015 um 17:36 schrieb Silvestre Losada:
> I'm answering my own question:
>
> (ANY{->UNMARK(ANY)}){CONTEXTCOUNT(Document,20,2000)};

nice rule :-D
but it's probably a bit slow

My first guess was something with a min/max quantifier, since counting
conditions such as CONTEXTCOUNT and POSITION are slow.

Something like

Document{-> MARKFIRST(FirstToken)};
FirstToken ANY[19, 19] ANY[1800,1800]{-> UNMARK(ANY)};
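
To pick the numbers for the quantifiers, it may help to work out which token
positions each variant touches. The following is only a rough sketch in Python
(not Ruta), under two assumptions: FirstToken covers token 1, and each ANY
matches exactly one following token. The function names are hypothetical and
only serve the illustration.

```python
def positions_by_index_window(n_tokens, lo, hi):
    """1-based positions whose index lies in [lo, hi] (per-token index check)."""
    return [i for i in range(1, n_tokens + 1) if lo <= i <= hi]


def positions_by_quantifiers(n_tokens, skip, take):
    """1-based positions reached by: first token, then `skip` tokens, then `take` tokens.

    Assumption: a fixed [n,n] quantifier only matches if all n tokens exist,
    so a document that is too short yields no match at all.
    """
    start = 1 + skip + 1   # token 1 is the first token, the next `skip` are kept
    end = start + take - 1
    if end > n_tokens:
        return []
    return list(range(start, end + 1))


if __name__ == "__main__":
    window = positions_by_index_window(3000, 20, 2000)
    print(window[0], window[-1], len(window))
    quant = positions_by_quantifiers(3000, 19, 1800)
    print(quant[0], quant[-1], len(quant))
```

Under these assumptions the two variants do not cover the same span, so the
quantifier values have to be adapted to the window that is actually wanted
(for example, skip=18 and take=1981 for positions 20 to 2000). A fixed
quantifier also matches nothing if the document has fewer tokens than required.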


> This removes all annotations generated by the default seeder at positions 20
> to 2000. This works; however, the same expression does not work for
> RutaBasic annotations:
>
> (RutaBasic{->UNMARK(RutaBasic)}){CONTEXTCOUNT(Document,20,2000)};

RutaBasic annotations may not be removed by rules. These annotations
form the complete disjoint partitioning of the document and store
important information. The rules need them to work properly. If you
do not want them in your CAS, you can remove them after applying a Ruta
analysis engine. There's a configuration parameter "removeBasics". When
activated, the RutaBasic and the seeding annotations are removed as the
last action of the process() method.


Best,

Peter


> I don't know the explanation.
>
> Regards
>
> On 7 January 2015 at 16:46, Silvestre Losada <silvestre.losada@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm creating a Ruta script, and I want to remove all tokens that come after
>> position X; in other words, I want to keep only the first X tokens. I have been
>> playing with Ruta conditions and actions, but I don't know how to do it. Do
>> you think it is possible?
>>
>> Kind regards.
>>


