uima-user mailing list archives

From Alexandre Patry <alexandre.pa...@keatext.com>
Subject Re: UIMA Ruta 2.1.0 Issues
Date Tue, 17 Dec 2013 17:14:53 GMT
On 2013-12-17 12:10, Peter Klügl wrote:
> On 17.12.2013 18:00, Alexandre Patry wrote:
>> On 2013-12-17 11:56, Peter Klügl wrote:
>>> Hi,
>>>
>>> some of the rules behave as expected. It's maybe a bit counterintuitive,
>>> but I do not see a way to improve it. I will fix the rest in the next
>>> few days.
>>>
>>> An example:
>>>
>>> (SPECIAL ALL* SPECIAL) {-> MARK(TMP_GenericAllSTAR)};
>>>
>>> ALL is a parent type of SPECIAL and * is a greedy quantifier. Therefore
>>> ALL* matches every annotation, including the SPECIAL annotations, up to
>>> the end of the document. Then there is no SPECIAL annotation left for the
>>> last rule element to match, and the rule fails.
>> Using a reluctant quantifier should work as expected for this specific
>> case:
>>
>> (SPECIAL ALL*? SPECIAL) {-> MARK(TMP_GenericAllSTAR)};
>>
>>
> Just another comment that has nothing to do with the problem :-)
>
> The rule is of course somewhat "slow".
>
> I would rather rewrite it as:
>
> (SPECIAL # SPECIAL) {-> MARK(TMP_GenericAllSTAR)};
>
> Here, the wildcard searches for the next SPECIAL annotation in the index
> and does not have to match on each token until the next SPECIAL annotation.
Nice trick, thanks for sharing!
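
To check my understanding of the three variants, here is a toy Python model. It is only a sketch and not how Ruta is implemented: the token list, the function names, and the use of a sorted position list with `bisect` as "the index" are all my own assumptions. "S" stands for a SPECIAL annotation, and every token (including "S") counts as ALL.

```python
from bisect import bisect_left

# Hypothetical token stream: "S" is a SPECIAL annotation, "w" is any other token.
# Every token, "S" included, is also an ALL, as in the type hierarchy above.
TOKENS = ["w", "S", "w", "w", "S", "w"]


def is_special(tok):
    return tok == "S"


def match_greedy(tokens, start):
    """Model of (SPECIAL ALL* SPECIAL): ALL* consumes everything, no backtracking."""
    if not is_special(tokens[start]):
        return None
    pos = start + 1
    while pos < len(tokens):  # each token is an ALL, so nothing stops the loop
        pos += 1
    # pos is now past the last token: nothing is left for the trailing SPECIAL.
    if pos < len(tokens) and is_special(tokens[pos]):
        return (start, pos)
    return None


def match_reluctant(tokens, start):
    """Model of (SPECIAL ALL*? SPECIAL): try the trailing SPECIAL before each ALL."""
    if not is_special(tokens[start]):
        return None
    pos = start + 1
    while pos < len(tokens):
        if is_special(tokens[pos]):  # next rule element matches, stop extending
            return (start, pos)
        pos += 1  # otherwise consume one more ALL, token by token
    return None


def match_wildcard(tokens, start, special_index):
    """Model of (SPECIAL # SPECIAL): jump to the next SPECIAL via a sorted index."""
    if not is_special(tokens[start]):
        return None
    i = bisect_left(special_index, start + 1)  # first SPECIAL after start
    if i == len(special_index):
        return None
    return (start, special_index[i])


# Assumed stand-in for the annotation index: sorted positions of SPECIAL tokens.
SPECIAL_INDEX = [i for i, tok in enumerate(TOKENS) if is_special(tok)]

print(match_greedy(TOKENS, 1))                    # None
print(match_reluctant(TOKENS, 1))                 # (1, 4)
print(match_wildcard(TOKENS, 1, SPECIAL_INDEX))   # (1, 4)
```

In this model the reluctant and wildcard variants return the same span, but the reluctant one walks over every token in between, while the wildcard one does a single lookup in the index.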

Is there a cookbook somewhere where all these tricks are stored?

-- 
Alexandre Patry, Ph.D
Chercheur / Researcher
http://KeaText.com

