uima-user mailing list archives

From Peter Klügl <pklu...@uni-wuerzburg.de>
Subject Re: UIMA Ruta 2.1.0 Issues
Date Tue, 17 Dec 2013 19:23:23 GMT
On 17.12.2013 18:14, Alexandre Patry wrote:
> On 2013-12-17 12:10, Peter Klügl wrote:
>> On 17.12.2013 18:00, Alexandre Patry wrote:
>>> On 2013-12-17 11:56, Peter Klügl wrote:
>>>> Hi,
>>>>
>>>> some of the rules behave as expected. It's maybe a bit counterintuitive,
>>>> but I do not see a way to improve it. I will fix the rest in the next
>>>> few days.
>>>>
>>>> An example:
>>>>
>>>> (SPECIAL ALL* SPECIAL) {-> MARK(TMP_GenericAllSTAR)};
>>>>
>>>> ALL is a parent type of SPECIAL and * is a greedy quantifier. Therefore
>>>> ALL matches on all annotations and also on the SPECIAL annotations until
>>>> the end of the document. Then, there is no SPECIAL annotation left to
>>>> match and the rule fails.
>>> Using a reluctant quantifier should work as expected for this specific
>>> case:
>>>
>>> (SPECIAL ALL*? SPECIAL) {-> MARK(TMP_GenericAllSTAR)};
>>>
>>>
>> Just another comment that has nothing to do with the problem :-)
>>
>> The rule is of course somewhat "slow".
>>
>> I would rather rewrite it as:
>>
>> (SPECIAL # SPECIAL) {-> MARK(TMP_GenericAllSTAR)};
>>
>> Here, the wildcard searches for the next SPECIAL annotation in the index
>> and does not have to match on each token until the next SPECIAL annotation.
> Nice trick, thanks for sharing!
>
> Is there a cookbook somewhere where all these tricks are stored?
>

Nope, but I have been thinking for some time about adding another chapter
to the documentation for such stuff, e.g., how to easily include DKPro
components in Ruta scripts or how to apply Ruta scripts for
transformation-based part-of-speech tagging.

However, no time...

Best,

Peter
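
The three rule variants discussed above can be illustrated with a toy model. The Python sketch below is not how Ruta is implemented; it only mimics the behavior described in this thread on a flat list of token types. The token list, the helper names (is_a, match_greedy, match_reluctant, match_wildcard) and the sorted position list standing in for the annotation index are all assumptions made for illustration.

```python
from bisect import bisect_right

ALL = "ALL"  # stands in for the common parent type of every token


def is_a(token_type, wanted):
    """Every token type counts as ALL; otherwise the types must be equal."""
    return wanted == ALL or token_type == wanted


def match_greedy(tokens, start):
    """SPECIAL ALL* SPECIAL with a greedy ALL* that never gives tokens back."""
    if not is_a(tokens[start], "SPECIAL"):
        return None
    pos = start + 1
    while pos < len(tokens) and is_a(tokens[pos], ALL):
        pos += 1  # ALL also swallows every later SPECIAL
    if pos < len(tokens) and is_a(tokens[pos], "SPECIAL"):
        return (start, pos)
    return None  # nothing is left for the closing SPECIAL


def match_reluctant(tokens, start):
    """SPECIAL ALL*? SPECIAL: try the closing SPECIAL before consuming more."""
    if not is_a(tokens[start], "SPECIAL"):
        return None
    pos = start + 1
    while pos < len(tokens):
        if is_a(tokens[pos], "SPECIAL"):
            return (start, pos)
        pos += 1  # hand one more token to ALL*?
    return None


def match_wildcard(tokens, start, special_index):
    """SPECIAL # SPECIAL: look up the next SPECIAL in a sorted position list."""
    if not is_a(tokens[start], "SPECIAL"):
        return None
    i = bisect_right(special_index, start)  # first SPECIAL after start
    if i == len(special_index):
        return None
    return (start, special_index[i])


# Hypothetical document: "W" is any non-SPECIAL token type.
tokens = ["SPECIAL", "W", "W", "SPECIAL", "W", "SPECIAL"]
special_index = [i for i, t in enumerate(tokens) if t == "SPECIAL"]

print(match_greedy(tokens, 0))                   # None
print(match_reluctant(tokens, 0))                # (0, 3)
print(match_wildcard(tokens, 0, special_index))  # (0, 3)
```

In this model the reluctant variant and the wildcard variant find the same span, but the reluctant one visits every token in between, while the wildcard one does a single lookup in the position list.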


